Understanding Fax Imaging

An in-depth look at how fax images are constructed and an efficient way to decode and display them.

Several articles have already appeared describing how electronic facsimile works; what I plan to do in this article is go into detail about working with fax image data and provide an efficient way of decoding those images.

With the low cost of fax modems in the past couple of years, a great many of today's faxed documents never touch actual paper; they are sent and received purely as electronic images between two computers. Recent advances in software have brought an unprecedented level of convenience to composing and sending these images, but it's the receiving end that this article is going to focus on. For programmers who want to write their own fax software, or those who want to write their own post-processing code, I will explain the whys and hows of fax data encoding as well as efficient code to decompress them.

A little background is necessary to understand how the fax standard came about. Before the days of the digital fax, there was the analog fax; used mostly for transmitting weather pictures, the original fax was typically sent over radio waves and used audio modulation to encode intensity levels. Being an analog signal, the resulting images had various levels of gray and were quite pleasing to the eye. The problems with these images were the long transmission time and fidelity of the signal they required. In the early 1970's various digital imaging methods were being experimented with, but the cost of the equipment was too expensive to be practical. In the late 1970's, when microprocessors, memory and modems were beginning to be reasonably priced, the digital fax was practical to implement. The CCITT (International Telegraph and Telephone Consultative Committee) met and ratified the Group 3 fax transmission standard. This standard specifies a method of transmitting black and white image data, in a compressed form, over standard telephone lines. The specification outlined how the image data is created, and the audio signal protocol between the sender and receiver.

Data compression was a concern for faxes from the beginning because scanned image data tends to be rather large. Consider that a 200x200 dpi 8.5x11" page (1728x2200 pixels) scanned as 1 bit per pixel consumes 475200 bytes. To transmit this much data at 9600 baud would take almost seven minutes. Considerable cost savings can be had by reducing the amount of data to transmit.

After analyzing many typical printed pages, a scheme was invented to handle both compressing the data and recovering from transmission errors. The 'fax' way of looking at a printed page is to see a series of horizontal 'runs' on each line, alternating between black and white. Each line is encoded independently and a sync pattern is added to the end. This scheme allows each line to be decoded independently so, in case of an error in transmission, only the erroneous lines will be wrong and the rest of the document can continue without problems by resynching on the next valid line. By encoding the images as run-length data, this allowed for error recovery, but did not do a great job of compressing the data. Statistical evidence showed that the typical page was composed of many small black runs and fewer long white runs. By using a modified Huffman encoding scheme, the statistical probability of these run lengths could be used to achieved a reasonable compression ratio: frequently occurring run lengths where assigned the shortest codes and infrequently occurring run lengths were assigned the longest codes. The statistical information used for encoding fax images is a fixed table so that it does not have be transmitted with each page, but being fixed information it will not always be optimal for all images.

The CCITT later improved on the original one-dimensional encoding scheme by adding a two-dimensional option; this option allows better compression ratios by encoding each line as the changes from the previous line. To allow for resynchronization after errors, small groups of lines are encoded this way so that an error will not corrupt the whole page. These two schemes were given the name G3 encoding which stands for the third group meeting of the CCITT. A short while later, another scheme named G4 was created for use on error-free channels. G4 is basically the same the the G3 2-dimensional scheme except that it codes the entire document two-dimensionally with no resynchronization breaks; since each line is dependent on the one above it, a single error will propagate all the way down the page. G4 was originally meant to be used over ISDN, but has gotten widespread use as a compact image archiving format for storage of bilevel images.

Because of the complexity of the two-dimensional image encoding scheme, I will cover the one-dimensional scheme in this article and leave describing the two-dimensional method for the future.

Specifics of fax data encoding
1) Each page starts with one EOL (end of line = 000000000001) and ends with six EOLS (called an RTC - return to control).
2) Each line is assumed to start with a white run, if the image line starts with black, a white run of zero is encoded.
3) Each run of pixels is encoded as zero or more make-up codes (multiples of 64) followed by a terminating code (0-63).
4) Each line alternates encoding runs of white and black pixels and is terminated by zero or more fill bits (zeros) followed by an EOL.

Fill bits are used to ensure that the receiving equipment has enough time to print each line. At the start of a fax transmission session, the receiver tells the transmitter how fast it can print each line; with this information and the data transmission rate, the transmitter must ensure that each line contains enough bits to give the receiver time enough to print it. Any lines that are too short at padded with zeros placed in front of the EOL code.

The following example encoded line demonstrates all of these concepts at once. The line is the first line of the image and is five black pixels followed by 1723 white pixels. The receiver has indicated that it needs 10ms to print a line and the transmission is taking place at 4800 baud. Therefore, each line must be a minimum of 48 bits in length to give the receiver time to print it.

EOL W0 B5 FILL EOL
000000000001 00110101 0011 000000000000 000000000001 = 48 bits
00 13 53 00 00 01 as hexadecimal bytes

The encoding of the runs of pixels as Huffman (minimum redundancy) codes presents a problem for writing an efficient decoder since the codes do not fall neatly on byte boundaries. The creation of a Huffman code table is based on the probabilities of each code and assigns the shortest code to the most probable symbol and the longest code to the least probable. The codes are also designed to not need separators; each code is uniquely decodable. The codes can be decoded by picking up one bit at a time and traversing the Huffman 'tree'; branching left for a zero and right for a one. This solution is the most obvious one from the creation of the codes, but is definitely not the most efficient. The nature of Huffman codes allows a more elegant solution to this problem. Since the codes are designed to be unique sequences of bits when decoded from start to finish, we can use this fact to construct a table for decoding each code in one iteration instead of one iteration per bit. By using the code as an address to index into our table, we can decode the bit sequence in one step. Suppose we look at the white terminating codes from table-1 and design a decoder table for them. These codes are from four to eight bits in length. If we were to left justify the code (place the first bit of the code in the MSB of a word) and use it as an address to index our lookup table, we would see that for codes shorter than eight bits, there would need to be repeated entries in our table for all possible bit patterns. For example, if the code we are decoding is for a white run length of four, the bit sequence would be 1011XXXX where the last four bits are actually the beginning of the next code word. Therefore, our table would have to have sixteen identical entries all with a code value of four and a length of four bits. This would lead one to believe that a table of this type representing all of the fax codes would require a great deal of space because of all the repeated entries, but the way I devised the decode tables they require a total of only 4352 bytes.

The heart of efficient decoding of the variable length fax codes is looking at the code table the 'right' way. According to one of the IBM researchers (who shall remain nameless) who helped invent the fax standard, the fax Huffman codes were designed to fit within 8 bits and have a variable number of leading zeros. If you look at table-1 you will see that this is true, but decoding them based on this fact would lead to an inefficient decoder. If the codes are looked at as having 2 to 13 bits in length, a table to describe them all would require 2^13 or 8192 pairs of entries. I have studied these tables for quite some time and what I came up with is a non-obvious grouping of the codes which leads to three distinct, small tables and a much more efficient decoder (see table 1). The three groups the codes fall into are:
1) White short - Up to nine bits in length, which represent terminating and make-up codes.
2) Black short - Up to six bits in length, which represent some of the black terminating codes.
3) Long white/black - Up to 13 bits in length and at least the first 4 bits are zero, which represent some of all the code types.

Files and Bit-Order
Fax data is typically received from the UART with the first received bit stored in the LSB of the byte. Typical fax programs just dump this data to disk and end up storing the bits in the same order. This data order would seem to be easy to work with on Intel little-endian machines, but to use our code-as-address scheme, it requires that the first bit of the code reside in the MSB of the byte. As can be seen in listing-1, the DoMirror() routine takes care of reversing the bit order of the data bytes. Since the program is very dependent on byte ordering, I defined macros to allow it to work properly on little and big-endian machines (e.g. PowerPC).

The Huffman Decoder
Listing 1 shows the function ClimbWhite(); this and ClimbBlack() do the decoding of each black and white run-length. These functions are almost identical except that they reference different lookup tables. The macro MOTOSHORT is defined in FAX.H and it reads 16 bits from the data stream in big-endian order (MSB first). The purpose of the macro is to allow the same code to be compiled on both big and little-endian machines.

The while() loop collects makeup codes (lengths > 63) until a terminating code (lengths < 64) is reached. To decode a run-length, 16-bits are read from the input data and shifted so that the first bit of the code is positioned at the MSB of the short variable. 16 is a very convenient word size because each of the codes can be up to 9 bits in length and up to 7 bits offset into a byte.

After the codeword is read, the upper 4 bits are tested to see if they are zeroes; this step determines which lookup table will be used. If the first 4 bits are zeroes, the current pointer is shifted forward by 4 bits, and another 16 bits are read. The actual bit length of the code is then referenced from the lookup table, as well as the graphic run length. This process repeats, with the run lengths summed together, until a terminating code is encountered.

The main fax decode routine scans each line in this manner until the RTC (six EOLs) is encountered. Each line alternates between white and black until an EOL or 1728 pixels are decoded.

The Viewer
Discussion of fax image encoding would not be complete without a way to show it off. Included in this article is a simple viewer capable of loading and displaying WinFAX(tm) files. I chose WinFAX since it is a popular Windows fax program and has a very simple file format; all files are stored with the 1-dimensional coding scheme and for various versions of the program, the header can be skipped by starting at offset 177. Not all fax files will read correctly with this assumption, but to keep the program simple, I decided not to add a complex header munching function. Once a fax file is selected, the program displays the page so as to fit completely within the client area of the window. The performance of this program is reasonable considering that it is written in C. In this article, I have revealed several techniques for improving the performance of decoding Huffman data, but there are other techniques for improving the performance of displaying the image data. For an idea of what can be done about the performance, download my document management program 'Image Edition' from our web site and see for yourself.

Going Further
Once you compile and run the viewer program you will notice that the time to read and decode the fax file is quite fast, but the display update time is not. The blame for this rests entirely with Microsoft's StretchBlt() API. It is really amazing how slow this API is; this is true for OS/2, Win3.1, Win95 and WinNT. The original programmer at Microsoft used a very poor algorithm for the StretchBlt() and this code was used in both Windows and OS/2. To this day, no one at either IBM nor Microsoft has gone back to correct this mistake. StretchBlt() is just one area of the program that can be improved, below is a summary of what else can be done to improve the program:

1) Scale the drawing of the image before using GDI functions; for Win32, use the CreateDIBSection() API to return a pointer to the bitmaps memory to eliminate the need to use SetBitmapBits().
2) Use the known statistical information about fax data (black spots on a white page) to optimize the decoding and drawing functions.
3) The file loading section makes some big assumptions about the input file type and header size. The value of 177 I used as an offset is a kludge; add a real header muncher.
4) Add scroll bar and zoom logic to the viewer to make it friendlier.
5) The current viewer assumes that all faxes are 204x192 dpi; Add logic to check for 204x96 dpi pages also.
6) The viewer was designed for Win32, but with some minor changes it would be easy to make it work in OS/2 or a 16-bit Windows environment.
7) Write time critical sections of the program in assembler; I did.

The Sky is the Limit
Hopefully I have inspired some of you to dig deeper into the workings of fax imaging. If given the space in future issues, I will reveal other useful methods of working with Huffman data and ways of improving the performance of displaying bilevel images. For an idea of performance tuning to the Nth degree, download a free copy of my imaging software, Image Edition from our web site.


/***************************************************************
*                                                              *
* FUNCTION : ClimbWhite(unsigned char *, int *, int *)         *
*                                                              *
* PURPOSE : Retrieve the next Huffman code.                    *
*                                                              *
***************************************************************/

short ClimbWhite(unsigned char *buf, int *bitnum, int *offset)
{
register short sCode = 0;
/* Start out with total = 0 */
register int iOff, iBit, iLen;
register unsigned short usBits;

   iBit = *bitnum; /* Faster to use a direct vars */
   iOff = *offset;

   iLen = 64; /* force first pass through loop */

   while (iLen > 63) /* Until a terminating code is found */
      {
      usBits = MOTOSHORT((buf+iOff)) << iBit;
      if (usBits < 0x1ff) /* Long codeword? */
         {
         iBit += 4; /* Skip the first 4 bits and re-get it */
         iOff += iBit >> 3;
         iBit &= 7;
         usBits = MOTOSHORT((buf+iOff)) << iBit;
         usBits &= 0xff80;
         usBits >>= 6;
         iBit += black_l[usBits] - 4;
         if (iBit < 0) /* special case for filler bits */
            {
            iBit += 8;
            iOff--;
            }
         iOff += iBit >> 3;
         iBit &= 7;
         iLen = black_l[usBits + 1];
         }
      else /* Short codeword */
         {
         usBits &= 0xff80; /* Trim off the rightmost 7 bits */
         usBits >>= 6; /* Get an index = code*2 */
         iBit += white_s[usBits]; /* Add code length to offset */
         iOff += iBit >> 3;
         iBit &= 7;
         iLen = white_s[usBits+1]; /* Get the run length */
         } /* Short codes */
      sCode += iLen;
      } /* while */
  
*offset = iOff;
   *bitnum = iBit;
   return sCode;
} /* ClimbWhite() */

Table 1
Terminating Codes
run length white code black code
      0    00110101   0000110111
      1    000111     010
      2    0111       11
      3    1000       10
      4    1011       011
      5    1100       0011
      6    1110       0010
      7    1111       00011
      8    10011      000101
      9    10100      000100
      10   00111      0000100
      11   01000      0000101
      12   001000     0000111
      13   000011     00000100
      14   110100     00000111
      15   110101     000011000
      16   101010     0000010111
      17   101011     0000011000
      18   0100111    0000001000
      19   0001100    0000100111
      20   0001000    00001101000
      21   0010111    00001101100
      22   0000011    00000110111
      23   0000011    00000101000
      24   0101000    00000010111
      25   0101011    00000011000
      26   0010011    000011001010
      27   0100100    000011001011
      28   0011000    000011001100
      29   00000010   000011001101
      30   00000011   000001101000
      31   00011010   000001101001
      32   00011011   000001101010
      33   00010010   000001101011
      34   00010011   000011010010
      35   00010100   000011010011
      36   00010101   000011010100
      37   00010110   000011010101
      38   00010111   000011010110
      39   00101000   000011010111
      40   00101001   000001101100
      41   00101010   000001101101
      42   00101011   000011011010
      43   00101100   000011011011
      44   00101101   000001010100
      45   00000100   000001010101
      46   00000101   000001010110
      47   00001010   000001010111
      48   00001011   000001100100
      49   01010010   000001100101
      50   01010011   000001010010
      51   01010100   000001010011
      52   01010101   000000100100
      53   00100100   000000110111
      54   00100101   000000111000
      55   01011000   000000100111
      56   01011001   000000101000
      57   01011010   000001011000
      58   01011011   000001011001
      59   01001010   000000101011
      60   01001011   000000101100
      61   00110010   000001011010
      62   00110011   000001100110
      63   00110100   000001100111

Make-up Codes
run length white code black code
      64   11011      000000111
      128  10010      000011001000
      192  010111     000011001001
      256  0110111    000001011011
      320  00110110   000000110011
      384  00110111   000000110100
      448  01100100   000000110101
      512  01100101   0000001101100
      576  01101000   0000001101101
      640  01100111   0000001001010
      704  011001100  0000001001011
      768  011001101  0000001001100
      832  011010010  0000001001101
      896  011010011  0000001110010
      960  011010100  0000001110011
      1024 011010101  0000001110100
      1088 011010110  0000001110101
      1152 011010111  0000001110110
      1216 011011000  0000001110111
      1280 011011001  0000001010010
      1344 011011010  0000001010011
      1408 011011011  0000001010100
      1472 010011000  0000001010101
      1536 010011001  0000001011010
      1600 010011010  0000001011011
      1664 011000     0000001100100
      1728 010011011  0000001100101

Extended Make-up codes (white + black)
      1792 00000001000
      1856 00000001100
      1920 00000001101
      1984 000000010010
      2048 000000010011
      2112 000000010100
      2176 000000010101
      2240 000000010110
      2304 000000010111
      2368 000000011100
      2432 000000011101
      2496 000000011110
      2560 000000011111


Back