An in-depth look at how fax images are constructed
and an efficient way to decode and display them.
Several articles have already appeared describing
how electronic facsimile works; what I plan to do in this article
is go into detail about working with fax image data and provide
an efficient way of decoding those images.
With the low cost of fax modems in the past couple
of years, a great many of today's faxed documents never touch actual
paper; they are sent and received purely as electronic images between
two computers. Recent advances in software have brought an unprecedented
level of convenience to composing and sending these images, but
it's the receiving end that this article is going to focus on. For
programmers who want to write their own fax software, or those who
want to write their own post-processing code, I will explain the
whys and hows of fax data encoding as well as efficient code to
decompress them.
A little background is necessary to understand how
the fax standard came about. Before the days of the digital fax,
there was the analog fax; used mostly for transmitting weather pictures,
the original fax was typically sent over radio waves and used audio
modulation to encode intensity levels. Being an analog signal, the
resulting images had various levels of gray and were quite pleasing
to the eye. The problems with these images were the long transmission
time and fidelity of the signal they required. In the early 1970's
various digital imaging methods were being experimented with, but
the cost of the equipment was too expensive to be practical. In
the late 1970's, when microprocessors, memory and modems were beginning
to be reasonably priced, the digital fax was practical to implement.
The CCITT (International Telegraph and Telephone Consultative Committee)
met and ratified the Group 3 fax transmission standard. This standard
specifies a method of transmitting black and white image data, in
a compressed form, over standard telephone lines. The specification
outlined how the image data is created, and the audio signal protocol
between the sender and receiver.
Data compression was a concern for faxes from the
beginning because scanned image data tends to be rather large. Consider
that a 200x200 dpi 8.5x11" page (1728x2200 pixels) scanned
as 1 bit per pixel consumes 475200 bytes. To transmit this much
data at 9600 baud would take almost seven minutes. Considerable
cost savings can be had by reducing the amount of data to transmit.
After analyzing many typical printed pages, a scheme
was invented to handle both compressing the data and recovering
from transmission errors. The 'fax' way of looking at a printed
page is to see a series of horizontal 'runs' on each line, alternating
between black and white. Each line is encoded independently and
a sync pattern is added to the end. This scheme allows each line
to be decoded independently so, in case of an error in transmission,
only the erroneous lines will be wrong and the rest of the document
can continue without problems by resynching on the next valid line.
By encoding the images as run-length data, this allowed for error
recovery, but did not do a great job of compressing the data. Statistical
evidence showed that the typical page was composed of many small
black runs and fewer long white runs. By using a modified Huffman
encoding scheme, the statistical probability of these run lengths
could be used to achieved a reasonable compression ratio: frequently
occurring run lengths where assigned the shortest codes and infrequently
occurring run lengths were assigned the longest codes. The statistical
information used for encoding fax images is a fixed table so that
it does not have be transmitted with each page, but being fixed
information it will not always be optimal for all images.
The CCITT later improved on the original one-dimensional
encoding scheme by adding a two-dimensional option; this option
allows better compression ratios by encoding each line as the changes
from the previous line. To allow for resynchronization after errors,
small groups of lines are encoded this way so that an error will
not corrupt the whole page. These two schemes were given the name
G3 encoding which stands for the third group meeting of the CCITT.
A short while later, another scheme named G4 was created for use
on error-free channels. G4 is basically the same the the G3 2-dimensional
scheme except that it codes the entire document two-dimensionally
with no resynchronization breaks; since each line is dependent on
the one above it, a single error will propagate all the way down
the page. G4 was originally meant to be used over ISDN, but has
gotten widespread use as a compact image archiving format for storage
of bilevel images.
Because of the complexity of the two-dimensional
image encoding scheme, I will cover the one-dimensional scheme in
this article and leave describing the two-dimensional method for
the future.
Specifics of fax data encoding
1) Each page starts with one EOL (end of line = 000000000001) and
ends with six EOLS (called an RTC - return to control).
2) Each line is assumed to start with a white run, if the image
line starts with black, a white run of zero is encoded.
3) Each run of pixels is encoded as zero or more make-up codes (multiples
of 64) followed by a terminating code (0-63).
4) Each line alternates encoding runs of white and black pixels
and is terminated by zero or more fill bits (zeros) followed by
an EOL.
Fill bits are used to ensure that the receiving equipment
has enough time to print each line. At the start of a fax transmission
session, the receiver tells the transmitter how fast it can print
each line; with this information and the data transmission rate,
the transmitter must ensure that each line contains enough bits
to give the receiver time enough to print it. Any lines that are
too short at padded with zeros placed in front of the EOL code.
The following example encoded line demonstrates all
of these concepts at once. The line is the first line of the image
and is five black pixels followed by 1723 white pixels. The receiver
has indicated that it needs 10ms to print a line and the transmission
is taking place at 4800 baud. Therefore, each line must be a minimum
of 48 bits in length to give the receiver time to print it.
EOL W0 B5 FILL EOL
000000000001 00110101 0011 000000000000 000000000001 = 48 bits
00 13 53 00 00 01 as hexadecimal bytes
The encoding of the runs of pixels as Huffman
(minimum redundancy) codes presents a problem for writing an efficient
decoder since the codes do not fall neatly on byte boundaries. The
creation of a Huffman code table is based on the probabilities of
each code and assigns the shortest code to the most probable symbol
and the longest code to the least probable. The codes are also designed
to not need separators; each code is uniquely decodable. The codes
can be decoded by picking up one bit at a time and traversing the
Huffman 'tree'; branching left for a zero and right for a one. This
solution is the most obvious one from the creation of the codes,
but is definitely not the most efficient. The nature of Huffman
codes allows a more elegant solution to this problem. Since the
codes are designed to be unique sequences of bits when decoded from
start to finish, we can use this fact to construct a table for decoding
each code in one iteration instead of one iteration per bit. By
using the code as an address to index into our table, we can decode
the bit sequence in one step. Suppose we look at the white terminating
codes from table-1 and design a decoder table for them. These codes
are from four to eight bits in length. If we were to left justify
the code (place the first bit of the code in the MSB of a word)
and use it as an address to index our lookup table, we would see
that for codes shorter than eight bits, there would need to be repeated
entries in our table for all possible bit patterns. For example,
if the code we are decoding is for a white run length of four, the
bit sequence would be 1011XXXX where the last four bits are actually
the beginning of the next code word. Therefore, our table would
have to have sixteen identical entries all with a code value of
four and a length of four bits. This would lead one to believe that
a table of this type representing all of the fax codes would require
a great deal of space because of all the repeated entries, but the
way I devised the decode tables they require a total of only 4352
bytes.
The heart of efficient decoding of the variable length
fax codes is looking at the code table the 'right' way. According
to one of the IBM researchers (who shall remain nameless) who helped
invent the fax standard, the fax Huffman codes were designed to
fit within 8 bits and have a variable number of leading zeros. If
you look at table-1 you will see that this is true, but decoding
them based on this fact would lead to an inefficient decoder. If
the codes are looked at as having 2 to 13 bits in length, a table
to describe them all would require 2^13 or 8192 pairs of entries.
I have studied these tables for quite some time and what I came
up with is a non-obvious grouping of the codes which leads to three
distinct, small tables and a much more efficient decoder (see table
1). The three groups the codes fall into are:
1) White short - Up to nine bits in length, which represent terminating
and make-up codes.
2) Black short - Up to six bits in length, which represent some
of the black terminating codes.
3) Long white/black - Up to 13 bits in length and at least the first
4 bits are zero, which represent some of all the code types.
Files and Bit-Order
Fax data is typically received from the UART with the first received
bit stored in the LSB of the byte. Typical fax programs just dump
this data to disk and end up storing the bits in the same order.
This data order would seem to be easy to work with on Intel little-endian
machines, but to use our code-as-address scheme, it requires that
the first bit of the code reside in the MSB of the byte. As can
be seen in listing-1, the DoMirror() routine takes care of reversing
the bit order of the data bytes. Since the program is very dependent
on byte ordering, I defined macros to allow it to work properly
on little and big-endian machines (e.g. PowerPC).
The Huffman Decoder
Listing 1 shows the function ClimbWhite(); this and ClimbBlack()
do the decoding of each black and white run-length. These functions
are almost identical except that they reference different lookup
tables. The macro MOTOSHORT is defined in FAX.H and it reads 16
bits from the data stream in big-endian order (MSB first). The purpose
of the macro is to allow the same code to be compiled on both big
and little-endian machines.
The while() loop collects makeup codes (lengths >
63) until a terminating code (lengths < 64) is reached. To decode
a run-length, 16-bits are read from the input data and shifted so
that the first bit of the code is positioned at the MSB of the short
variable. 16 is a very convenient word size because each of the
codes can be up to 9 bits in length and up to 7 bits offset into
a byte.
After the codeword is read, the upper 4 bits are
tested to see if they are zeroes; this step determines which lookup
table will be used. If the first 4 bits are zeroes, the current
pointer is shifted forward by 4 bits, and another 16 bits are read.
The actual bit length of the code is then referenced from the lookup
table, as well as the graphic run length. This process repeats,
with the run lengths summed together, until a terminating code is
encountered.
The main fax decode routine scans each line in this
manner until the RTC (six EOLs) is encountered. Each line alternates
between white and black until an EOL or 1728 pixels are decoded.
The Viewer
Discussion of fax image encoding would not be complete without a
way to show it off. Included in this article is a simple viewer
capable of loading and displaying WinFAX(tm) files. I chose WinFAX
since it is a popular Windows fax program and has a very simple
file format; all files are stored with the 1-dimensional coding
scheme and for various versions of the program, the header can be
skipped by starting at offset 177. Not all fax files will read correctly
with this assumption, but to keep the program simple, I decided
not to add a complex header munching function. Once a fax file is
selected, the program displays the page so as to fit completely
within the client area of the window. The performance of this program
is reasonable considering that it is written in C. In this article,
I have revealed several techniques for improving the performance
of decoding Huffman data, but there are other techniques for improving
the performance of displaying the image data. For an idea of what
can be done about the performance, download my document management
program 'Image Edition' from our web site and see for yourself.
Going Further
Once you compile and run the viewer program you will notice that
the time to read and decode the fax file is quite fast, but the
display update time is not. The blame for this rests entirely with
Microsoft's StretchBlt() API. It is really amazing how slow this
API is; this is true for OS/2, Win3.1, Win95 and WinNT. The original
programmer at Microsoft used a very poor algorithm for the StretchBlt()
and this code was used in both Windows and OS/2. To this day, no
one at either IBM nor Microsoft has gone back to correct this mistake.
StretchBlt() is just one area of the program that can be improved,
below is a summary of what else can be done to improve the program:
1) Scale the drawing of the image before using GDI
functions; for Win32, use the CreateDIBSection() API to return a
pointer to the bitmaps memory to eliminate the need to use SetBitmapBits().
2) Use the known statistical information about fax data (black spots
on a white page) to optimize the decoding and drawing functions.
3) The file loading section makes some big assumptions about the
input file type and header size. The value of 177 I used as an offset
is a kludge; add a real header muncher.
4) Add scroll bar and zoom logic to the viewer to make it friendlier.
5) The current viewer assumes that all faxes are 204x192 dpi; Add
logic to check for 204x96 dpi pages also.
6) The viewer was designed for Win32, but with some minor changes
it would be easy to make it work in OS/2 or a 16-bit Windows environment.
7) Write time critical sections of the program in assembler; I did.
The Sky is the Limit
Hopefully I have inspired some of you to dig deeper into the workings
of fax imaging. If given the space in future issues, I will reveal
other useful methods of working with Huffman data and ways of improving
the performance of displaying bilevel images. For an idea of performance
tuning to the Nth degree, download a free copy of my imaging software,
Image Edition from our web site.
short ClimbWhite(unsigned char *buf, int *bitnum, int *offset)
{
register short sCode = 0;
register int iOff, iBit, iLen;
register unsigned short usBits;
iBit = *bitnum;
iOff = *offset;
iLen = 64;
while (iLen > 63)
{
usBits = MOTOSHORT((buf+iOff))
<< iBit;
if (usBits < 0x1ff)
{
iBit += 4;
iOff += iBit
>> 3;
iBit &=
7;
usBits = MOTOSHORT((buf+iOff))
<< iBit;
usBits &=
0xff80;
usBits >>=
6;
iBit += black_l[usBits]
- 4;
if (iBit <
0)
{
iBit
+= 8;
iOff--;
}
iOff += iBit
>> 3;
iBit &=
7;
iLen = black_l[usBits
+ 1];
}
else
{
usBits &=
0xff80;
usBits >>=
6;
iBit += white_s[usBits];
iOff += iBit
>> 3;
iBit &=
7;
iLen = white_s[usBits+1];
}
sCode += iLen;
} *offset = iOff;
*bitnum = iBit;
return sCode;
}
Table 1
Terminating Codes
run length white code black code
0 00110101
0000110111
1 000111 010
2 0111 11
3 1000 10
4 1011 011
5 1100 0011
6 1110 0010
7 1111 00011
8 10011 000101
9 10100 000100
10 00111 0000100
11 01000 0000101
12 001000 0000111
13 000011 00000100
14 110100 00000111
15 110101 000011000
16 101010 0000010111
17 101011 0000011000
18 0100111 0000001000
19 0001100 0000100111
20 0001000 00001101000
21 0010111 00001101100
22 0000011 00000110111
23 0000011 00000101000
24 0101000 00000010111
25 0101011 00000011000
26 0010011 000011001010
27 0100100 000011001011
28 0011000 000011001100
29 00000010 000011001101
30 00000011 000001101000
31 00011010 000001101001
32 00011011 000001101010
33 00010010 000001101011
34 00010011 000011010010
35 00010100 000011010011
36 00010101 000011010100
37 00010110 000011010101
38 00010111 000011010110
39 00101000 000011010111
40 00101001 000001101100
41 00101010 000001101101
42 00101011 000011011010
43 00101100 000011011011
44 00101101 000001010100
45 00000100 000001010101
46 00000101 000001010110
47 00001010 000001010111
48 00001011 000001100100
49 01010010 000001100101
50 01010011 000001010010
51 01010100 000001010011
52 01010101 000000100100
53 00100100 000000110111
54 00100101 000000111000
55 01011000 000000100111
56 01011001 000000101000
57 01011010 000001011000
58 01011011 000001011001
59 01001010 000000101011
60 01001011 000000101100
61 00110010 000001011010
62 00110011 000001100110
63 00110100 000001100111
Make-up Codes
run length white code black code
64 11011 000000111
128 10010 000011001000
192 010111 000011001001
256 0110111 000001011011
320 00110110 000000110011
384 00110111 000000110100
448 01100100 000000110101
512 01100101 0000001101100
576 01101000 0000001101101
640 01100111 0000001001010
704 011001100 0000001001011
768 011001101 0000001001100
832 011010010 0000001001101
896 011010011 0000001110010
960 011010100 0000001110011
1024 011010101 0000001110100
1088 011010110 0000001110101
1152 011010111 0000001110110
1216 011011000 0000001110111
1280 011011001 0000001010010
1344 011011010 0000001010011
1408 011011011 0000001010100
1472 010011000 0000001010101
1536 010011001 0000001011010
1600 010011010 0000001011011
1664 011000 0000001100100
1728 010011011 0000001100101
Extended Make-up codes (white + black)
1792 00000001000
1856 00000001100
1920 00000001101
1984 000000010010
2048 000000010011
2112 000000010100
2176 000000010101
2240 000000010110
2304 000000010111
2368 000000011100
2432 000000011101
2496 000000011110
2560 000000011111
Webdesign
by Deep Magic Studios
- HanaHo Games, Inc. Copyright © 2002 |