|
As far as creating fast code, there is much to be
said for knowing your compiler/machine well and producing highly
optimized code, but with a little knowlege that I will present here,
you can turn ordinary/slow code into fast code. What follows
is a set of general rules and examples which will guide you around
the most common performance pitfalls.
1) Choose the right algorithm - Before
you spend time optimizing or profiling that critical section of
code, make sure that there isn't a better way to do it. Changing
algorithms can have much more of a dramatic effect on your code's
speed than even rewriting it in assembly language. Don't bother
tweaking assembler instructions either until you are sure you have
the right algorithm; you may end up throwing away that code and
having to start over.
2) The 80/20 rule - In some cases
this may be the 90/10 rule. Essentially this means that your
program will spend the majority of its time (80%) in a very small
portion of the code (20%). I call this 'coding by statistics'
and this is how I was able to get so much speed out of my CPU emulators
- by finding which code was the most frequently used and optimizing
the hell out of it. Certainly writing all of your code to
be as efficient as possible is a good idea, but you needn't bother
wasting time on a section which will only get executed once or infrequently.
That is why only parts of HiVE are written in assembler:
CPU emulators and the graphics code. I could write the entire
program in assembly language, but the result would be very little
change in speed for a large investment in time.
3) Keep system / stdlib calls to a minimum
- Programmers who do not know/care about what is going
on inside the machine sometimes make the mistake of assuming that
any C function/keyword takes about the same amount of time to execute.
The fact is that all function calls require a pushing of parameters
then a call/return which takes considerably more time than inline
code. System calls are even worse because they sometimes must
pass through CPU call gates and security schemes which slow things
down even more. Shown below are two examples of what NOT to
do:
a) for (i=0; i
< arraysize; i++)
{
write(handle, mydata[i], sizeof(mydata[i]));
}
The problem with the above code is that each
call to the write() function wastes a lot of time (probably milliseconds)
and should be consolidated into one large write or buffer the data
in RAM first, then write it in large blocks (> 64K). If
you see any application which causes the disk activity light to
blink rapidly, then it is poorly written and wasting lots of time
making system calls.
b) memset(mypointer,
value, 1);
memcpy(mypointer1, mypointer2, sizeof(int));
To many programmers, the problem
with the code above may seem quite obvious, but I have seen code
exactly like this written by a programmer at IBM. For moving
small blocks or setting a few bytes, don't use the stdlib functions
because the overhead of the pushing of parameters and call/return
take MUCH longer than just assigning a value with a pointer.
4) Use optimized assembly language where
it makes sense - Even the best compilers in the world will
not be able to beat the optimizer between your ears. Besides
optimization, assembly language provides access to instructions
that would never be generated by a compiler. For example,
the 680x0 processors have bitfield instructions which are excellent
for graphics and data compression, but would never be used by a
C compiler because the C language does not have efficient means
to work with bits and bytes. Yet, using these instructions
in the right way can make dramatic speed improvements in your graphics
or data compression routines.
5) The less memory you touch, the faster
you go - Ever since the days of the 'XT-Turbo' machines,
the CPU in a PC has been working with data a whole lot faster than
the RAM/ROM can supply it. Consider a 200Mhz Pentium machine
with 60ns DRAM. Instructions can execute in 1 clock cycle
which is 5ns, while memory access time is at best 60ns. The
external cache RAM may be 10ns, but it will not be in use all of
the time. Using less memory, even if it makes the code more
complex, can have dramatic effects on the performance. Sometimes
there is no choice, but other times through a little compression
or clever means you can reduce the memory which gets modified in
your 'inner loop'.
Webdesign
by Deep Magic Studios
- HanaHo Games, Inc. Copyright © 2002 |