I have found an interesting bug in my design that opened my eyes to one important aspect. The problem is related to the memory chips’ /WE (write enable) signal, which should be asserted when I want to store data in memory. With nearly every TTL device in my design being synchronous and changing state on positive clock edges, I completely overlooked the fact that the memory write enable signal is asynchronous and must be asserted with particular care. Instead, I was asserting it as early as any other control signal, so the memories were ready to accept data while random signals were still bouncing on the data and address buses. This means I was writing to completely random memory locations. This could happen not only during memory write instructions but at any time when the control module was still doing its job and the microcode address was not yet stable. Even if there is some non-zero data setup time on memory devices, I am not going to rely on this and trust that no write will occur. I need a clean solution for this.
Ok, now. The idea is to suppress the memory /WE signal and assert it late enough that I am sure all other control signals are stable and there is a correct address on the address bus and correct data on the data bus. On the other hand, I need to assert it early enough to meet the memory timing (data setup) requirements. The answer is simple: use a fraction of a clock cycle for signal propagation and the remaining fraction for asynchronous operations like memory write. The fractions of ¾ and ¼ respectively seem to be a good choice. Now, how will I know when the last clock quarter starts? The answer is to use a multiphase clock. Up to this point I believed that I could do without one. It is obvious to me now that I can’t.
In order to generate a multiphase clock, a ring counter comes in handy, as described here or here. A ring counter is basically a shift register with its last non-inverted output fed to its first input, resulting in a preset pattern circulating around the “ring”. I could use a 4-bit shift register, load it with the binary pattern 1000 on system startup and clock it with my master (original) clock. On the four outputs of the register I would obtain my multiphase clocks, as illustrated below.
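To make the circulating pattern concrete, here is a minimal Python sketch of a 4-bit ring counter. It is only a behavioural model and not my simulator code; the function name `ring_counter` and the tuple representation of the register are assumptions made for illustration.

```python
def ring_counter(pattern, ticks):
    """Yield successive register states; the last output feeds the first input."""
    state = list(pattern)
    for _ in range(ticks):
        yield tuple(state)
        # One rising edge of the master clock shifts the pattern by one stage.
        state = [state[-1]] + state[:-1]


# Preload 1000 and run for eight master clock ticks (two full rotations).
states = list(ring_counter((1, 0, 0, 0), 8))

# Print each output as a crude waveform, one character per master clock tick.
for i in range(4):
    print(f"Q{i}: " + "".join(str(s[i]) for s in states))
# Q0: 10001000
# Q1: 01000100
# Q2: 00100010
# Q3: 00010001
```

Each output is high for one tick in four, and each is offset from its neighbour by one master clock period.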
The four clocks run at a quarter of the original clock frequency and have a 25% duty cycle. Whether the 25% duty cycle bothers me, I am not sure now, but I am afraid it may hit me one day. I am using only positive edges to trigger my devices, so theoretically it should not be a problem. However, generating a uniform 50% duty cycle clock is no more difficult. I could load the shift register with a pattern of 1100 on startup or use a Johnson counter. A Johnson counter is also a shift register, but the difference is that its inverted last output is fed to the first input, which results in a couple of nice characteristics. It divides the frequency by a factor of twice the number of stages. This means that by using a Johnson counter built of four D-type flip-flops, we obtain four offset clocks at a frequency of 1/8 of the original clock. Moreover, there is no need for initial loading with an arbitrary pattern: it is enough to reset the flip-flops to zeros on startup (the inverted output will do the job). Finally, the resulting clocks have a 50% duty cycle, as in the picture below.
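The divide-by-twice-the-stages property and the 50% duty cycle can be checked with a small behavioural sketch in Python. The names `johnson_counter` and `period` are hypothetical helpers invented for this illustration, and the model ignores propagation delays entirely.

```python
def johnson_counter(stages, ticks):
    """Yield successive states; the INVERTED last output feeds the first input."""
    state = [0] * stages  # a plain reset to zeros is enough to start it
    for _ in range(ticks):
        yield tuple(state)
        state = [1 - state[-1]] + state[:-1]


def period(states):
    """Number of master clock ticks before the first state repeats."""
    return states.index(states[0], 1)


states = list(johnson_counter(4, 16))
print(period(states))  # 8, i.e. twice the number of stages

# Every output is high for half of the period.
for i in range(4):
    high = sum(s[i] for s in states[:8])
    print(f"Q{i}: high for {high} of 8 ticks")
```

The four-stage counter walks through 0000, 1000, 1100, 1110, 1111, 0111, 0011, 0001 and then returns to 0000.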
Since I don’t need a 1:8 division ratio, I can do with just two D flip-flops and generate something like this:
The above seems the best choice to me, and I will incorporate it into the design and model it in my simulator to see how it handles the memory writing bug. By decoding a combination of Q0 and Q1 (and by using their inverted values) I have full access to information about the clock quarter I am in, and to the triggering signals. There is no pattern to load (and possibly have corrupted by noise during operation) as with a shift register, and the duty cycle is uniform with no additional assumptions or hardware. And finally, it is ridiculously easy to build:
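Before wiring anything, the decoding idea can be sketched in Python. This is a behavioural model under stated assumptions, not the actual circuit: I assume the machine cycle starts in the reset state (Q0=0, Q1=0), so the last quarter is the state Q0=0, Q1=1, and the names `johnson2`, `QUARTER` and `we_n` are made up for this example.

```python
def johnson2(ticks):
    """Two-stage Johnson counter: D0 = /Q1, D1 = Q0, both reset to 0."""
    q0 = q1 = 0
    for _ in range(ticks):
        yield q0, q1
        q0, q1 = 1 - q1, q0


# Assumed numbering of the quarters, counted from the reset state.
QUARTER = {(0, 0): 0, (1, 0): 1, (1, 1): 2, (0, 1): 3}


def we_n(write_request, q0, q1):
    """Active-low /WE: asserted (0) only in the last quarter of a write cycle."""
    last_quarter = (q0 == 0) and (q1 == 1)  # decoded as /Q0 AND Q1
    return 0 if (write_request and last_quarter) else 1


# One machine cycle with a write requested by the control module.
for q0, q1 in johnson2(4):
    print(f"Q0={q0} Q1={q1} quarter={QUARTER[(q0, q1)]} /WE={we_n(True, q0, q1)}")
# Q0=0 Q1=0 quarter=0 /WE=1
# Q0=1 Q1=0 quarter=1 /WE=1
# Q0=1 Q1=1 quarter=2 /WE=1
# Q0=0 Q1=1 quarter=3 /WE=0
```

In this model /WE stays high for the first three quarters regardless of what the control signals are doing, and goes low only in the last quarter of a cycle that actually requests a write. Only one of Q0 and Q1 changes on each step, so the decoded quarter does not pass through an unintended state on the way.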
Of course, my previous maximum clock speed calculation is obsolete with the new clocking approach. It was based on the assumption that I was using the full clock cycle for signal generation and propagation. Now it is only ¾ of a cycle. With a 600ns critical path, the full clock cycle should be no shorter than 800ns, resulting in a maximum clock speed of 1.25MHz, not the 1.6MHz I previously declared. Still acceptable, but I am pretty sure now I will need to stick some 74-series F devices here and there on the critical signal path once I have the machine running.
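The arithmetic behind these numbers is short enough to write down as a Python sketch. The function name `max_clock_mhz` is just an illustrative helper; the 600ns critical path and the ¾ propagation fraction are the figures from the text.

```python
def max_clock_mhz(critical_path_ns, propagation_fraction):
    """Maximum clock rate when only a fraction of the cycle is left for propagation."""
    min_cycle_ns = critical_path_ns / propagation_fraction
    return 1000.0 / min_cycle_ns  # a period in ns gives a frequency in MHz


print(600 / 0.75)                 # 800.0 ns minimum cycle
print(max_clock_mhz(600, 0.75))   # 1.25 MHz with the 3/4 + 1/4 split
print(max_clock_mhz(600, 1.0))    # about 1.67 MHz if the whole cycle were usable
```

Shortening the critical path, for example with faster parts, raises the limit proportionally.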