Hi all,

Some time ago I converted polytex to plain C. In the polytex asm source, parameters are passed by modifying the code in memory. This sparked some thoughts. I'm researching how old programs were written for very limited machines (like the 486 or even the 6502), partly because code efficiency tends to increase the further back you go. I think these programming practices are being forgotten in favor of high-level coding. I want to uncover (and maybe document later) these tricks, even though today's operating systems have started to declare some of these practices illegal.

Since he has done this style of programming before, I have some questions especially for Ken, but don't hesitate to share your thoughts:
* Why do you pass parameters by modifying code instead of passing them in registers or on the stack? I assume it is faster than the other methods, since this asm source was the inner loop of the texture mapper.
* Do you know other examples of code modification that increase performance? They can even be for 8-bit CPUs.
CraigFatman at
Re: Self modifying code
Some time ago I examined the Build Engine source to find the algorithms Ken used for texturing, and I tried to employ similar self-modifying routines in my test application. Self-modifying x86 code can help a lot; sometimes it can run even faster than SSE2-optimized code. However, adapting code to newer instruction sets and effective multithreading are now the primary ways to get the best performance.
Awesoken at
... even today's operating systems started to declare some of these practices as illegal.
If you are talking about the "Data Execution Prevention" feature of the recent x86 chips, that can be bypassed with a call to VirtualProtect().
why do you use parameter passing with code modifying instead of register or stack parameter passing?
I use self-modifying code in the following cases:
1. To save on registers. Any register that is read-only inside the loop can be replaced with an immediate value in the loop, which is modified before the loop runs. Some values need to be modified once per polygon; others once per frame. The less often you modify, the greater the benefit.
2. Shifting or rotating by an immediate is faster than shifting by 'cl'. Usually these need to be modified once per polygon.
3. To save on code space. For example, you might have a long inner loop where you want to select between an 'add' and a 'sub'. Note that this is not an optimization, and not usually worth the trouble.
Self-modifying code can still increase performance on modern machines. Multithreading complicates things as it limits what you can modify. Depending on how you split the work, usually things that change once per frame can still use self-modifying code, while things that change once per polygon cannot.
Levent at
Thanks guys. I was expecting some talk of speedups, but I certainly did not expect faster execution than SIMD :D.
If you are talking about the "Data Execution Prevention" feature of the recent x86 chips, that can be bypassed with a call to VirtualProtect().
Actually, I wanted to point out that we're losing fast ways of programming and even denying ourselves these gains to some degree. OpenBSD, for example, has no way to bypass this protection (called W^X, another form of DEP) as far as I know. This means that if we optimize our program with self-modifying code, it will not run on this platform with those optimizations. (But I should say that OpenBSD is an awesome development platform for finding bugs.)

Anyway, I expect that the performance gain (or loss) from self-modifying code fluctuates widely with CPU type, CPU speed, memory speed, the code itself, etc. This means that if we implement some parts of the code with SMC, we should time those parts when the program starts and use them only if they are faster than the non-SMC code.
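The "time it at startup and keep the winner" idea from the post above can be sketched in portable C. This is only an illustration under assumptions: the names `sum_fn`, `sum_generic`, `sum_unrolled`, `time_variant` and `pick_fastest` are hypothetical, and `sum_unrolled` merely stands in for whatever specialized (for example self-modifying) routine a real program would want to compare against its plain version. Timing with `clock()` is coarse, so a real program would likely want a finer timer and more repetitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical signature shared by both variants of the inner loop. */
typedef uint32_t (*sum_fn)(const uint32_t *data, size_t n, unsigned shift);

/* Variant A: plain loop, shift count passed as an ordinary parameter. */
uint32_t sum_generic(const uint32_t *data, size_t n, unsigned shift) {
    uint32_t acc = 0;
    for (size_t i = 0; i < n; i++)
        acc += data[i] >> shift;
    return acc;
}

/* Variant B: stand-in for the "optimized" routine (here just unrolled by 4). */
uint32_t sum_unrolled(const uint32_t *data, size_t n, unsigned shift) {
    uint32_t acc = 0;
    size_t i = 0;
    for (; i + 4 <= n; i += 4)
        acc += (data[i] >> shift) + (data[i + 1] >> shift)
             + (data[i + 2] >> shift) + (data[i + 3] >> shift);
    for (; i < n; i++)              /* leftover elements */
        acc += data[i] >> shift;
    return acc;
}

/* Run one variant `reps` times and return the elapsed CPU time in seconds. */
static double time_variant(sum_fn fn, const uint32_t *data, size_t n, int reps) {
    volatile uint32_t sink = 0;     /* keeps the calls from being optimized away */
    clock_t start = clock();
    for (int r = 0; r < reps; r++)
        sink += fn(data, n, 3);
    clock_t end = clock();
    (void)sink;
    return (double)(end - start) / CLOCKS_PER_SEC;
}

/* Called once at program start: returns whichever variant measured faster. */
sum_fn pick_fastest(void) {
    static uint32_t buf[4096];
    for (size_t i = 0; i < 4096; i++)
        buf[i] = (uint32_t)(i * 2654435761u);   /* arbitrary test data */

    /* Both variants must agree before speed is even considered. */
    assert(sum_generic(buf, 4096, 3) == sum_unrolled(buf, 4096, 3));

    double t_generic  = time_variant(sum_generic,  buf, 4096, 2000);
    double t_unrolled = time_variant(sum_unrolled, buf, 4096, 2000);
    return (t_unrolled < t_generic) ? sum_unrolled : sum_generic;
}
```

The rest of the program would then call the inner loop only through the returned pointer, e.g. `sum_fn f = pick_fastest();` followed by `f(data, n, shift)`, so a platform that refuses the optimized variant could simply never offer it as a candidate.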
TX at
Awesoken said at
If you are talking about the "Data Execution Prevention" feature of the recent x86 chips, that can be bypassed with a call to VirtualProtect().
I seem to recall investigating the DEP situation in JFDuke a while back and discovering that the workaround implemented was completely wrong. I never got around to fixing things up the correct way, but the gist of it was that you have to create function pointers for all assembly-based functions which are self-modifying. You then allocate a new block of memory via VirtualAlloc(), flagged as executable, and place a copy of the self-modifying code in it; the function pointer then references that copy instead of the original address of the function.
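A minimal sketch of that approach, with heavy assumptions: this is not the JFDuke code, and the names `stub_fn`, `STUB_REGION` and `make_patched_stub` are hypothetical. Instead of a real texture-mapping loop it copies a six-byte x86 stub (`mov eax, imm32; ret`) into freshly allocated memory, patches the immediate operand in the copy, and returns a function pointer to the copy. It uses VirtualAlloc()/VirtualProtect() on Windows and mmap()/mprotect() elsewhere, writes while the memory is writable and only then makes it executable, and returns NULL on non-x86 targets or if the system refuses the request.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE             /* exposes MAP_ANONYMOUS under strict glibc modes */
#endif
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#if defined(_WIN32)
#include <windows.h>
#else
#include <sys/mman.h>
#endif

typedef int (*stub_fn)(void);       /* hypothetical: pointer to the relocated code */

#define STUB_REGION 4096            /* assumed size of the allocated block */

/* Builds a private, patched copy of the stub; returns NULL if unsupported. */
stub_fn make_patched_stub(int32_t value) {
#if defined(__i386__) || defined(__x86_64__) || defined(_M_IX86) || defined(_M_X64)
    /* B8 imm32 = mov eax, imm32 ; C3 = ret */
    unsigned char code[6] = { 0xB8, 0, 0, 0, 0, 0xC3 };
    assert(sizeof code <= STUB_REGION);
    memcpy(&code[1], &value, sizeof value);     /* patch the immediate (little-endian) */

#if defined(_WIN32)
    void *mem = VirtualAlloc(NULL, STUB_REGION, MEM_COMMIT | MEM_RESERVE,
                             PAGE_READWRITE);
    if (mem == NULL)
        return NULL;
    memcpy(mem, code, sizeof code);
    DWORD old_protect;
    if (!VirtualProtect(mem, STUB_REGION, PAGE_EXECUTE_READ, &old_protect)) {
        VirtualFree(mem, 0, MEM_RELEASE);
        return NULL;
    }
    FlushInstructionCache(GetCurrentProcess(), mem, sizeof code);
#else
    void *mem = mmap(NULL, STUB_REGION, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (mem == MAP_FAILED)
        return NULL;
    memcpy(mem, code, sizeof code);
    if (mprotect(mem, STUB_REGION, PROT_READ | PROT_EXEC) != 0) {
        munmap(mem, STUB_REGION);
        return NULL;
    }
#endif
    return (stub_fn)mem;            /* callers use this pointer, not the original code */
#else
    (void)value;
    return NULL;                    /* stub bytes are x86-only */
#endif
}
```

A caller would check the result before use, e.g. `stub_fn f = make_patched_stub(42); if (f) result = f();`. Re-patching a value later would mean switching the block back to writable, changing the bytes, and making it executable again, which is where the per-polygon versus per-frame cost discussed earlier in the thread comes in.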