I must be an assembly programmer, because my immediate reaction to this was: WHY ARE THEY WRITING TO MEMORY ALL THE TIME! The code just looks uncomfortable, like a word-by-word translation from a foreign language, or like Forth written by someone who tried to keep the stack empty and all the data in variables.
Because the Z80 doesn't really have a whole lot of registers that you could use for general storage. Even the 6502 (and the 6809) would try to keep the cycle count down by storing stuff in the zero page (or direct page on the '09).
Z80 assembly tends to be very easy to read because you don't have to keep a mental map of what lives in which register: A really is the accumulator, and it is the only register you can do anything more complex than an INC or DEC to.
And see how much more effective the 6502 instruction set is when it comes to empowering the various registers. The 6502 does not need (slow) 'shift' codes either.
I've programmed both, and even though they both have their charm I would prefer to code the 6502 for the same problem (and I'd much prefer to use the 6809 over either).
I wrote a lot of Z80, very little 6502. They felt very different to me. Code written by one couldn't be ported to the other, but had to be rewritten.
Didn't have much trouble storing things in B/C/D/E/BC/DE, at least I don't remember it as a problem. The innermost loops had to get top priority when deciding what each register was used for, that's all.
"Code written by one couldn't be ported to the other, but had to be rewritten."
That’s not my experience. I once ported a program from the 6800 (the mother of the 6502) to the Z80. It was very straightforward. But to my surprise the version on the much “fancier” Z80 turned out to be both larger and slower, even though the Z80 ran at a slightly higher clock speed.
(OP speaking for himself:) When I ported Z80 code to the 6502 literally, the result didn't use the zero page to its full effect, because the zero page was so much bigger than the Z80's extra registers. When I ported 6502 to Z80 literally (I mean code that used the zero page well), too much of the zero-page work had to be replaced with memory work and not enough with the nice fast registers.
Same here, wrote 6502 (C64) and Z80 (CPC) code and preferred the (double-)registers of the Z80 to work with compared to the limited registers of the 6502.
That's not a completely fair comparison. BCDE+HL+IXIY offered more space than the 6502's registers, but less space than its zero page, so the hierarchies differed. In both cases "A, something, main memory", but the somethings differed and it mattered.
You're right, though if I remember correctly the zero page was only slightly faster than memory access.
I'm also influenced a lot by my personal circumstances: although the Z80 generally took more cycles than the 6502 for the same work, I moved from a 1 MHz 6502 (C64) to a 4 MHz Z80 (CPC), so in general the Z80 felt faster.
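To put rough numbers on that, here is a small sketch that converts cycle counts into wall-clock time. The cycle counts are the documented ones for these instructions; the clock rates are the nominal 1 MHz and 4 MHz figures from the comment above, and the function name is my own. Real machines lose some cycles to video hardware and wait states, so treat the results as a best case on both sides.

```python
def microseconds(cycles, clock_mhz):
    """Time for one instruction, ignoring wait states and video contention."""
    return cycles / clock_mhz

# (cycles, nominal clock in MHz) -- the clocks are the round figures quoted above
TIMINGS = {
    "6502 LDA abs":  (4, 1.0),   # load A from a 16-bit address
    "Z80 LD A,(nn)": (13, 4.0),  # same job on the Z80
    "6502 TAX":      (2, 1.0),   # register-to-register move
    "Z80 LD B,A":    (4, 4.0),   # register-to-register move
}

for name, (cycles, mhz) in TIMINGS.items():
    print(f"{name:14s} {cycles:2d} cycles  {microseconds(cycles, mhz):.2f} us")
```

So the Z80 needs over three times as many cycles for the memory load, yet at four times the nominal clock it still comes out slightly ahead.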
Maybe it actually was. Code like (in Java) for(A a : b) a.c = false; certainly could be written very nicely on the Z80. By keeping a in IX and laying out the data structure as a struct where all of the bools are packed into one byte, the body of that loop could be just one (slow) instruction like RES 3,(IX+4). But to get that speed you had to be aware of those possibilities and design the data structure accordingly. On the 6502 you'd lay out the data structure differently, playing to that CPU's strengths.
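A rough model of that layout, in Python rather than assembly so it can be run anywhere. The record size of 8 bytes is an assumption for illustration; the flags offset of 4 and bit 3 just mirror the RES 3,(IX+4) example, and all the names are hypothetical.

```python
RECORD_SIZE = 8    # assumed struct size in bytes (hypothetical)
FLAGS_OFFSET = 4   # flags byte inside each record, the +4 in RES 3,(IX+4)
C_BIT = 3          # bit that stores the bool `c`

def clear_c_flags(mem, base, count):
    """Clear bit C_BIT of the flags byte in `count` consecutive records."""
    ix = base                                   # IX points at the current record
    for _ in range(count):
        mem[ix + FLAGS_OFFSET] &= ~(1 << C_BIT) & 0xFF   # RES 3,(IX+4)
        ix += RECORD_SIZE                       # ADD IX,DE with DE = record size

memory = bytearray([0xFF] * 32)   # four records, every bit set
clear_c_flags(memory, 0, 4)
print(memory.hex(" "))
```

Only the one bit in each flags byte changes; the other bools packed into the same byte are untouched, which is the whole point of the layout.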
+1 for the 6809. I've worked with both the Z80 and 6809, and while both were long enough ago that I don't remember the details, the 6809 really impressed me. I wonder if there's ever been another 8-bit processor that surpassed it.
I remember the 6809: it was really clean and capable, and I think it supported relocatable code. Very much unlike the funky instruction sets of any of the other 8-bit machines, and not the horror show that was x86.
I much preferred the 68000 to the 8086. The regularity of the instruction set made writing a disassembler trivial. Coupled with the single-step bit you could write a very capable monitor, which I did.
Yes, the 6809 could jump and branch relative to the current location (program counter + offset). If all jumps were of that kind, a program could be located anywhere in memory.
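A sketch of why that works, assuming a short branch that is 2 bytes long with an 8-bit signed offset measured from the address of the following instruction. The helper names and the example addresses are made up for illustration.

```python
def rel_offset(branch_addr, target_addr, insn_len=2):
    """Offset byte to encode for a short relative branch."""
    off = target_addr - (branch_addr + insn_len)
    if not -128 <= off <= 127:
        raise ValueError("target out of range for an 8-bit offset")
    return off & 0xFF

def rel_target(branch_addr, offset_byte, insn_len=2):
    """Address the CPU ends up at: PC after the instruction + signed offset."""
    signed = offset_byte - 256 if offset_byte >= 128 else offset_byte
    return (branch_addr + insn_len + signed) & 0xFFFF

# The same code loaded at two different base addresses (hypothetical values):
for base in (0x1000, 0x8000):
    branch, target = base + 0x10, base + 0x04
    print(hex(base), hex(rel_offset(branch, target)), hex(target))
```

The encoded offset byte is identical for both load addresses, so the code bytes never need patching, whereas an absolute jump would have to carry a different operand in each case.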
They were also a way to empty a long section of memory faster than LDDR: swap out the stack pointer to the starting address, set HL to zero, then PUSH HL inside an optionally unrolled loop.
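Here is a little simulation of the trick plus the cycle arithmetic, as a sketch. The T-state figures are the documented ones (PUSH HL is 11, LDDR is 21 per byte while repeating and 16 for the last byte); the function names are mine, and the estimate assumes a fully unrolled run of PUSH instructions with no loop overhead. On real hardware you would also need interrupts disabled while SP is pointing at the buffer.

```python
def push_fill(mem, top, words, value=0):
    """Simulate `words` PUSH HL instructions with SP starting at `top`."""
    sp = top
    lo, hi = value & 0xFF, (value >> 8) & 0xFF
    for _ in range(words):
        sp -= 1
        mem[sp] = hi      # PUSH writes the high byte at SP-1 ...
        sp -= 1
        mem[sp] = lo      # ... and the low byte at SP-2
    return sp

def lddr_tstates(n_bytes):
    return 21 * (n_bytes - 1) + 16     # repeating iterations + final one

def push_tstates(n_bytes):
    return 11 * (n_bytes // 2)         # one PUSH HL per two bytes, unrolled

mem = bytearray(b"\xAA" * 16)
push_fill(mem, 12, 4)                  # clears bytes 4..11, stack grows down
print(mem.hex(" "))
print(lddr_tstates(256), push_tstates(256))
```

For 256 bytes that is 5371 T-states for the block instruction against 1408 for the unrolled pushes, before counting the setup either way.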
No idea about the Z80, but in uni we had to write assembler for a C167 which was a fun architecture and registers basically didn't exist. They were just mappings to memory in a given location. Granted, a few things only worked on registers and not on memory directly. I think something like addressing individual bits, but just reading and writing a value was effectively the same on memory and on a register name.
I had no qualms at all about using memory for variables when convenient. I didn't have to use the stack at all in any of the assignments, as 16 registers and a handful of DBs sprinkled through the code for more locations to write to were enough. The disassembler in the debugger didn't like code interspersed with data, though.
As others pointed out, you had to, because of lack of registers.
Also, the article explicitly mentions it does more loads and stores than necessary (”In real life, values from one expression will remain in registers for the next, and so won't need to be reloaded; the examples are all deliberately choosing the worst possible case.”)
Finally, writing to memory wasn’t as bad in those days as it is today (or rather: using registers wasn’t as fast as it is today). Writing to a fixed address, for example, only took twice as long as a register-to-register move (4 cycles vs 2 cycles on a 6502, if I googled that correctly)
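For reference, a tiny table of the 6502 numbers being compared, as a sketch. The cycle and byte counts are the documented ones for these three instructions; the dictionary layout and the helper name are just for illustration.

```python
# instruction: (bytes, cycles) on a 6502
COSTS_6502 = {
    "TAX":     (1, 2),   # register-to-register move
    "STA zp":  (2, 3),   # store A to a zero-page address
    "STA abs": (3, 4),   # store A to a full 16-bit address
}

def slowdown(instr, baseline="TAX"):
    """How many times slower `instr` is than the baseline instruction."""
    return COSTS_6502[instr][1] / COSTS_6502[baseline][1]

for name, (size, cycles) in COSTS_6502.items():
    print(f"{name:8s} {size} bytes  {cycles} cycles  x{slowdown(name):.1f}")
```

So a store to a fixed address costs 2x a register move, and using the zero page shaves that to 1.5x while also saving a byte of code.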
You are correct: direct memory access on those machines was really cheap compared with today, and compared with even earlier machines. Modern memory bus speeds haven't kept up with processor speeds (this is what sank RISC machines). Old core memory is slow. Drum memory is hella slow.
The other thing is that there is a trade-off between the number of registers and instruction size. With 8-bit machines you see that, for instance, where only certain addressing modes can be used with certain registers. You don't have enough instruction space to encode every addressing mode for all the registers you have.
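The arithmetic behind that is easy to sketch. The counts below are hypothetical round numbers, not any real CPU's, and the function names are made up; the only hard fact is that a single opcode byte has 256 distinct values.

```python
def opcode_slots(operations, registers, modes):
    """Encodings needed if every operation works on every register in every mode."""
    return operations * registers * modes

def opcode_bits(slots):
    """Bits needed to give each encoding its own opcode value."""
    return (slots - 1).bit_length()

# Hypothetical designs: (operations, registers, addressing modes)
for design in [(8, 4, 8), (8, 8, 8)]:
    slots = opcode_slots(*design)
    verdict = "fits" if slots <= 256 else "does not fit"
    print(design, slots, "encodings,", opcode_bits(slots), "bits:", verdict, "in one byte")
```

Doubling the register count in a fully orthogonal design pushes you past one opcode byte, so a designer either restricts which registers get which modes or pays for prefix bytes.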