??? 12/15/07 06:46
#148265 - I think you should do it your way ... Responding to: ???'s previous message
Russ Cooper said:
Richard said:
How does it help you to duplicate resources that are already inherently addressable?
Well, there are two issues here. One is the need for certain registers to be implemented discretely, as opposed to being represented by a location in a RAM. The other is whether those registers that are implemented discretely should ALSO appear in the RAM. It's fairly obvious that the SFRs need to be implemented as discrete registers, because in general, their outputs need to be available continuously, either to configure various features of the peripherals that they control, or to speed up certain instructions. A second, similar reason is that some SFRs contain bits that get manipulated individually, and that would be messy if they were stored in a RAM. Finally, it may make sense for performance reasons to implement SP and DPTR as loadable counters. Putting them in a RAM would clearly preclude that option.
I don't see that, as they're also accessible, bitwise, from memory ... if not under the instruction set, certainly in the hardware.
Russ Cooper said:
In addition to the SFRs, R0 and R1 should also be implemented as discrete registers in order to speed up instructions that use the @Ri addressing mode.
Well, I don't see that. The typical ALU operation, as used for address arithmetic, would have, at most, two operands: one an index or offset, and the other either 0xFFFF or the carry-in. The result is in some cases ultimately destined for PC, and in others simply destined for AB, the address bus register, which, by default, gets the content of PC at the end of each cycle. Now I have to conclude we're seeing quite different ALU architectures. I'm envisioning an overall architecture in which data flows through the ALU twice on every cycle, once with address content and once with data. What happens to the result is determined by the opcode. That way, everything works in pretty much the same way all the time. You probably have a different concept in mind; I base that conclusion on your comments about Jan's charts and diagrams.
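To make the operand choice above concrete, here is a minimal Python sketch of one 16-bit adder doing increment, decrement, and offset addition purely by selecting its second operand and carry-in. It is only a behavioral model, not HDL, and the names (`add16`, `inc16`, `dec16`, `add_offset8`) are hypothetical; they are not taken from anyone's design in this thread.

```python
MASK16 = 0xFFFF

def add16(a, b, carry_in=0):
    """Single 16-bit adder: returns (sum, carry_out)."""
    total = (a & MASK16) + (b & MASK16) + (carry_in & 1)
    return total & MASK16, total >> 16

def inc16(value):
    # Increment: second operand is 0, carry-in is 1.
    return add16(value, 0x0000, 1)[0]

def dec16(value):
    # Decrement: adding 0xFFFF equals subtracting 1 modulo 2**16.
    return add16(value, 0xFFFF, 0)[0]

def add_offset8(base, offset):
    # Relative addressing: sign-extend the 8-bit offset to 16 bits, then add.
    offset &= 0xFF
    extended = (offset | 0xFF00) if (offset & 0x80) else offset
    return add16(base, extended, 0)[0]
```

For instance, `dec16(0x0000)` wraps around to `0xFFFF`, and `add_offset8(0x0100, 0xFE)` steps back two locations to `0x00FE`. Whether the same adder also handles the data pass is exactly the design choice being debated here.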
You shouldn't construe what I say as an admonition to do what I'm suggesting, merely to consider it. I'm a believer in simple if not also small, but I don't believe in multiple copies of the same thing if it can at all be avoided, as the timings may differ.
Russ Cooper said:
The second issue is whether or not the discrete registers should be duplicated in the RAM. Jan convinced me here that the SFRs should not be. I still think that R0 and R1 should be shadowed in the RAM only because it saves a little bit of hardware and doesn't cost anything (the affected RAM locations would be unused otherwise).
Russ Cooper said:
Richard said:
Compare is really an XOR, isn't it? If you XOR and then behave as you would in JNZ, wouldn't that do the job?
I believe this was a reference to the CJNE instruction. The answer is no. CJNE sets the carry bit to reflect the relative magnitudes of the two operands, so you need to do a subtract and then just throw away the result.
Well, I stand corrected, though there's clearly the other option I mentioned, namely a subtract without updating A, which will put the sign bit in the MSB. You can actually set the carry based on that, having the opcode tell you to do it.
Russ Cooper said:
Richard said:
If you build an ALU that muxes its inputs from the available resource pool, and is capable of loading as well as of incrementing/decrementing DPTR, SP, and other destinations, and of loading and incrementing PC, it clearly has to contain a 16-bit adder/subtractor. If you think about the ALU as the intersection of lots of logic paths, like a railroad yard, and the controlling state machine(s) as providing the steering controls to those logic paths, you'll see that data can be sourced and sunk by the same resources at opposite ends of a single operation.
You have expressed this idea many times, and I think I understand it fully. I also think that it leads to a good solution if the goal is to minimize hardware. However, as I mentioned a day or two ago, I want to strike some sort of balance between speed and simplicity. So I'm willing to employ a few extra adders and/or counters and/or whatever as necessary to help performance. On the other hand, I want to be able to implement and debug this thing successfully, so I'm nowhere close to considering the really interesting stuff like pipelining and branch prediction and concurrent instruction execution. We'll leave those things to Silicon Labs, and I'll be happy enough to get "Hello, world" running on my FPGA.
-- Russ
I understand your desire to forgo the more esoteric features you mention, but IMHO, the treatment you suggest seems more complicated, particularly in its timing, than less resource-rich versions, even ones not as spartan as what I've suggested. I like the ALU to do the increment/decrement operations on addresses as well as data. That doesn't mean you have to do that. I think you could benefit from looking at the relative logic burdens, though. After all, what you find out may benefit you at some time in the future. Keep in mind, too, that you don't have to execute the operations in the same way as the '51 does. Only the end result of each operation matters.
RE
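As a footnote to the CJNE exchange above, here is a minimal Python sketch of why XOR alone falls short. XOR shows whether the operands differ, while the discarded subtract also yields the borrow that CJNE places in the carry flag (set when the first operand is less than the second, as unsigned values). The function names are hypothetical, and this is a behavioral model only.

```python
def xor_compare(a, b):
    """XOR only reports whether the operands differ (the JNZ-style test)."""
    return ((a ^ b) & 0xFF) != 0

def cjne(a, b):
    """Model of the CJNE compare: subtract, then discard the difference.

    Returns (branch_taken, carry). Neither operand is modified.
    """
    diff = (a & 0xFF) - (b & 0xFF)   # 8-bit operands; the result is thrown away
    carry = 1 if diff < 0 else 0     # borrow out: set when a < b (unsigned)
    branch_taken = diff != 0         # jump when the operands are not equal
    return branch_taken, carry
```

Here `xor_compare(3, 5)` and `xor_compare(5, 3)` both return `True`, so the ordering is lost, whereas `cjne(3, 5)` gives carry 1 and `cjne(5, 3)` gives carry 0. Note also that the borrow is not the same as bit 7 of the 8-bit difference: 0xFF minus 0x01 is 0xFE, whose MSB is set even though the first operand is the larger one, so the carry has to come from the borrow out of the subtractor.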