??? 12/09/08 10:27
#160790 - Assembler and processor size
Assembler allows a developer to place any combination of instructions after each other - or even "in" each other, if a byte in the middle of one instruction is allowed to be the target of a jump...
Most 8-bit processors are quite easy to program in assembler because they have a limited number of instructions, addressing modes etc. People still constantly fail to produce decent assembler code.

A processor that consumes a different number of clock cycles for different instructions is harder to program than a processor with a fixed number of cycles per instruction. When moving to a processor with DRAM, you can still get tight code. But to get fast code, you must think about alignment in relation to the rows and columns of the memory. And for a 16-bit or 32-bit processor, the alignment in relation to data size also affects the speed (if unaligned access is supported at all). When the processor gets a cache, the cache line size introduces yet more differences in memory bandwidth depending on loop sizes and alignments. And with multi-issue processors (multiple instructions started on every clock cycle) you have to spend a significant amount of time permuting the instructions to keep the multiple execution units of the processor filled.

Just the little change where Intel allowed a zero-clock swap of two registers in the floating-point stack of the original Pentium was enough to make it almost impossible to beat a good compiler when writing floating-point-intensive x86 code. The human brain doesn't like keeping track of which values are stored at which stack positions when a swap can be paired with each of the other fp instructions at no extra cost (other than a bit of cache memory for the extra code).

Who wants to spend 5 hours to gain 0.1% extra speed in a math function, knowing that the next generation of the compiler may take back that gain? Or that optimizing for the next processor generation would give far larger speed differences and require new optimizations?

So, in the low-end uC world, the assembler programmer may feel like a king.
But the number of people who can compete with the best compilers decreases a lot with each improved processor generation/architecture that gets released. We are good at seeing patterns, but not as good at scanning through a large number of permutations with a very low probability of introducing errors.
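
To make the data-size alignment point a bit more concrete, here is a minimal sketch in C rather than assembler. The helper names `is_aligned` and `load_u32` are made up for this example and do not come from any library. It only shows how to test an address for alignment and how to read a 32-bit value from an arbitrary address without relying on the processor supporting unaligned access; how much slower (or illegal) the direct unaligned access would be depends entirely on the target.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: returns 1 if p is aligned to `align` bytes.
 * Assumes `align` is a power of two (1, 2, 4, 8, ...). */
static int is_aligned(const void *p, size_t align)
{
    return ((uintptr_t)p & (align - 1)) == 0;
}

/* Hypothetical helper: reads a 32-bit value from any address.
 * memcpy avoids a direct unaligned access, which some processors
 * handle slowly and others do not support at all. */
static uint32_t load_u32(const void *p)
{
    uint32_t v;
    memcpy(&v, p, sizeof v);
    return v;
}
```

On a target that handles unaligned loads in hardware, a good compiler will usually turn the `memcpy` into a single load instruction, which is one more small case of the compiler knowing the processor details so you don't have to.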