For many C/C++ programmers, assembly language
programming seems like a waste of time, what with all those wonderful
optimizing compilers. Well... maybe on other platforms, but
on Intel-based machines, hand-crafted assembler code is quite a
bit different from the slop produced by compilers. Let's
take a look at the example below. This output was generated
by the Microsoft Visual C++ v5.0 compiler with optimization set
for maximum speed and code generation targeted specifically at the
Pentium.
Notice the wasteful use of registers (cl and bl serve only as
temporaries for values that end up in al and dl), the costly
operand-size prefix bytes needed to work with word-sized registers
in a 32-bit code segment, and a few stalls of the two pipelines
due to register contention:
mov cl, BYTE PTR _ucByte2$[esp-4]
push ebx
mov bl, BYTE PTR _ucByte1$[esp]
xor ax, ax
xor dx, dx
push esi
mov esi, DWORD PTR _regs$[esp+4]
mov al, cl
mov dl, bl
add eax, edx
mov dl, BYTE PTR [esi+12]
and dl, 208
;
test al, al
mov BYTE PTR [esi+12], dl
jne SHORT $L22070
Here is the same function written by hand. Notice that no
operand-size prefix bytes are needed and there are no pipeline
stalls from register contention:
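xor edx,edx                  ; clear the full registers up front so the
xor eax,eax                  ; byte loads below need no size prefix
push esi
push ebx
mov dl, _ucByte2$[esp+4]     ; after both pushes every argument uses
mov esi, _regs$[esp+4]       ; the same +4 stack adjustment
mov al, _ucByte1$[esp+4]
add eax,edx
and byte ptr [esi+12], 208   ; mask the byte in memory directly
test al,al
jne short $L22070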
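For orientation, here is a rough C sketch of what both listings appear to compute. It is a reconstruction from the assembly, not the original source: the names `ucByte1`, `ucByte2` and `regs` come from the listings, while the function name `add_bytes`, the `Regs` layout, and the field names are assumptions made only so that the masked byte lands at offset 12.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical register block. The only fact taken from the listings is
   that a byte at offset 12 is masked with 208 (0xD0). */
typedef struct {
    uint32_t a;      /* placeholder fields filling offsets 0..11 */
    uint32_t b;
    uint32_t c;
    uint8_t  flags;  /* the byte addressed as [esi+12] */
} Regs;

_Static_assert(offsetof(Regs, flags) == 12, "flags must sit at offset 12");

/* Returns nonzero when the low byte of the sum is nonzero, which is the
   case where the listings' "jne" branch would be taken. */
int add_bytes(Regs *regs, uint8_t ucByte1, uint8_t ucByte2)
{
    uint32_t sum = (uint32_t)ucByte1 + ucByte2;  /* add eax, edx */
    regs->flags &= 0xD0;                         /* and ..., 208 */
    return (uint8_t)sum != 0;                    /* test al, al  */
}
```

Seen this way, the hand-written version simply does the zero-extension, the add, and the in-memory mask without routing the values through extra registers.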
The benefit of coding it in assembler is usually
at least a 2X performance gain. For code that does bit-twiddling
or graphics, the gains can be considerably greater. It is
rarely necessary to write the entire application in assembler, but
selective use of the inline assembler for critical sections can
work wonders on the performance of your program.
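As a minimal sketch of what dropping a few instructions into an otherwise ordinary C function looks like, the example below wraps a single `add` in inline assembly. It uses GCC/Clang extended `__asm__` syntax rather than the Visual C++ `__asm` block the article has in mind, because the latter only builds with the 32-bit Microsoft compiler. The function name `add_bytes_asm` is hypothetical, and a one-instruction add is far too small to be worth hand-coding; it only illustrates the mechanism.

```c
#include <stdint.h>

/* Hypothetical helper: adds two bytes as 32-bit values. */
uint32_t add_bytes_asm(uint8_t ucByte1, uint8_t ucByte2)
{
    uint32_t sum = ucByte1;
    uint32_t rhs = ucByte2;
#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__i386__) || defined(__x86_64__))
    /* AT&T syntax: addl src, dst. "+r" marks sum as both read and
       written; "cc" tells the compiler the flags are clobbered. */
    __asm__("addl %1, %0" : "+r"(sum) : "r"(rhs) : "cc");
#else
    sum += rhs;  /* portable fallback for other compilers and CPUs */
#endif
    return sum;
}
```

Keeping the assembly inside one small function with a plain C fallback leaves the rest of the program portable, so only the critical section needs a hand-tuned version.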
HanaHo Games, Inc. Copyright © 2002