Recent exercises with the UART revealed two interesting bugs in my design. Both problems are now resolved, and I am documenting them here for diary keeping.
Implicit memory reads
I discovered this one only now, while playing with real devices attached to the CPU’s expansion bus. The problem is a side effect of implicitly reading semi-random memory locations.
My design features two data buses: a 16-bit ALU bus used by all ALU operations and 16-bit register-to-register transfers, and an 8-bit DATA bus used when referencing memory. The two buses are connected by two bidirectional drivers (74LS245), which allow the following interconnections, controlled by microcode bits:
- low 8 bits of ALU to DATA
- high 8 bits of ALU to DATA
- DATA to low 8 bits of ALU
- DATA to high 8 bits of ALU
- buses disconnected
I use the first four whenever I need to access memory (for loading or storing). The disconnected state is used primarily for register-to-register transfers. It is also used purposefully to save CPU cycles – e.g. when I fetch the next instruction and in the same cycle perform the last ALU or register operation of the instruction. So far so good.
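The five interconnection states can be sketched as a small model. This is only an illustration: the enum names are made up for this example, and keeping the untouched half of the ALU bus unchanged is a simplification (in hardware that half is driven by something else or left alone).

```python
from enum import Enum


class Coupling(Enum):
    """The five bus interconnection states (names are hypothetical)."""
    ALU_LO_TO_DATA = 1
    ALU_HI_TO_DATA = 2
    DATA_TO_ALU_LO = 3
    DATA_TO_ALU_HI = 4
    DISCONNECTED = 5


def couple(mode, alu, data):
    """Return the (alu, data) bus values after applying one coupling mode.

    alu is a 16-bit value, data is an 8-bit value.
    """
    if mode is Coupling.ALU_LO_TO_DATA:
        return alu, alu & 0xFF
    if mode is Coupling.ALU_HI_TO_DATA:
        return alu, (alu >> 8) & 0xFF
    if mode is Coupling.DATA_TO_ALU_LO:
        return (alu & 0xFF00) | data, data
    if mode is Coupling.DATA_TO_ALU_HI:
        return (alu & 0x00FF) | (data << 8), data
    # DISCONNECTED: both buses keep whatever they carry.
    return alu, data
```

For example, `couple(Coupling.ALU_HI_TO_DATA, 0x1234, 0x00)` puts `0x12` on the DATA bus, which is the kind of transfer used when storing the high byte of a 16-bit register to memory.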
Problems result from the fact that my design assumes certain default microcode values for parts of control circuitry that are not used at a given moment (cycle). If they are not changed in the microcode, they remain at their default values. In particular, the default microcode word (and hardware design) results in:
- buses disconnected
- MAR value on the address bus
- data memory segment is selected (as opposed to code)
- memory is read (/ENMEM signal active)
The above is what normally happens during most register-to-register transfers. When designing this scheme, I assumed that I could read from memory as many times as I wanted without any side effects. I completely forgot at the time that some regions of the data memory segment would map to actual devices. While my assumption may be true for memory, it is NOT true for many devices. One such device is the 16550 UART, which changes its state when it is read (because a read operation changes the state of its internal FIFO buffer). In one of my test programs I did the following:
- write some chars to the serial port (this effectively left the address of UART’s read/write FIFO in MAR register)
- halt and wait for incoming interrupts
Microcode for halt is:
// HALT
op($01)
0, *, MDR <- PC
1, *, PC <- MDR - 1
2, *, fetch
endop
In this case I am performing two implicit memory (and possibly device) reads. In the first cycle, the value of PC appears on the address bus, as this is the only way to pass the value of an address register to the ALU (a consequence of the CPU’s internal architecture). Even though the microcode does not explicitly request a memory read, a read from the address in PC in the data segment actually takes place.
In the second cycle there is no explicit use of the address bus (PC is only written to, not read), and the address falls back to its default value, which is whatever is in MAR. Since the last value stored in MAR was the address of the UART’s FIFO buffer I had just written to, this implicit read effectively clears the FIFO. Damn!
I discovered this bug in exactly the same scenario. I wrote a prompt string to the UART and halted the machine to wait for incoming interrupts. My HALT, which is in fact a busy loop, was continuously clearing the FIFO buffer of the UART. Whenever an interrupt arrived, the CPU passed control to the servicing routine but by the time I was ready to read the value from the buffer, the value was long gone.
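The failure can be reproduced in a toy model. This is a minimal sketch under stated assumptions: the UART address, the class names and the `idle_cycle` helper are all hypothetical, and the UART is reduced to a receive FIFO whose read pops one byte.

```python
from collections import deque

UART_FIFO_ADDR = 0xFF00  # hypothetical address, not the real memory map


class Uart:
    """Toy 16550-like receiver: every read pops one byte from the RX FIFO."""

    def __init__(self):
        self.rx_fifo = deque()

    def receive(self, byte):
        self.rx_fifo.append(byte)

    def read(self):
        # Returns None when the FIFO is empty (a simplification).
        return self.rx_fifo.popleft() if self.rx_fifo else None


class Machine:
    def __init__(self, uart):
        self.uart = uart
        self.mar = 0

    def bus_read(self, segment, address):
        # Devices are only visible in the data segment.
        if segment == "DATA" and address == UART_FIFO_ADDR:
            return self.uart.read()
        return 0  # plain memory: reading has no side effects

    def idle_cycle(self):
        # Old defaults: data segment, MAR on the address bus, memory read.
        # The value is discarded because the buses are disconnected.
        self.bus_read("DATA", self.mar)


uart = Uart()
cpu = Machine(uart)
cpu.mar = UART_FIFO_ADDR   # left over from writing the prompt
uart.receive(ord("A"))     # a character arrives
cpu.idle_cycle()           # one cycle of the HALT busy loop
lost = uart.read() is None # the handler finds nothing to read
```

In this model a single idle cycle is enough to lose the byte; in the real machine the HALT loop repeats the same implicit read continuously.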
I decided to fix the problem in microcode, without adding extra hardware. I changed the default microcode word to point to the code memory segment instead of the data segment. In my design, code memory never maps to devices, so this is safe (the hardware address decoder is only enabled when the data segment is selected). However, the microcode source relied extensively on the previous default: I provided no “DATA” keyword when accessing the data segment and only used “CODE” to access the code segment. To avoid any further doubts, I modified the microcode source to always state explicitly which memory region I am referring to.
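The effect of changing the default can be sketched as follows. The field names and the `assemble_cycle` helper are hypothetical and do not reflect the real microcode word layout or assembler; the sketch only shows how fields left unspecified fall back to the default word, and how the device decoder depends on the selected segment.

```python
# Hypothetical field names for illustration only.
OLD_DEFAULTS = {
    "coupling": "disconnected",
    "addr_src": "MAR",
    "segment": "DATA",
    "enmem": True,
}
NEW_DEFAULTS = dict(OLD_DEFAULTS, segment="CODE")


def assemble_cycle(defaults, **explicit):
    """Build one microcode word: explicit fields override the defaults."""
    word = dict(defaults)
    word.update(explicit)
    return word


def device_decoder_enabled(word):
    """Devices can only respond when memory is enabled in the data segment."""
    return word["enmem"] and word["segment"] == "DATA"


idle_old = assemble_cycle(OLD_DEFAULTS)                   # may touch a device
idle_new = assemble_cycle(NEW_DEFAULTS)                   # cannot touch a device
load_new = assemble_cycle(NEW_DEFAULTS, segment="DATA")   # explicit data access
```

With the new default, a cycle that says nothing about memory still performs a read, but only from code memory, where reads are harmless.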
The new version of microcode assembler (with changed default microcode word), and the new microcode source (with code/data memory segments accessed explicitly) are both in downloads. I have also reflected the default value change in my microcode word specification page.
The other problem is related to timing. My CPU uses two clocks: the master clock (CLK) and the bus clock (BUSCLK). The bus clock is used for clocking devices connected to the buses (mostly registers), hence its name. The master clock controls almost everything else. The bus clock is derived from the master clock and only goes low in the last quarter of the master clock cycle (it is low for 25% of the cycle). The BUSCLK signal is created by OR’ing the master clock CLK with a copy of itself offset by 90 degrees. As a consequence of gate propagation delay, the bus clock is slightly offset with respect to the master clock. There are also some other clocking signals, like LDPPC, used to clock the “previous program counter” (PPC) register, which are derived from the bus clock and have an even greater offset from the master clock. This got me into trouble: PPC was sometimes loaded with the proper value and sometimes not. Here is a screenshot of a timing diagram showing a moment when the wheels came off.
The waveform shows an instruction boundary. LDPPC was supposed to latch the value on the address bus (to store the previous PC value upon fetching the new instruction). The master clock edge comes first; it increments the PC (the /INCPC signal is active low during the rising clock edge), and the new value of PC appears on the address bus soon afterwards (note the LSB change on ABUS). By the time LDPPC clocks the PPC register (rising edge), the old value on ABUS is already destroyed.
I implemented a simple fix for this problem. I left the generation of the derived clocks as is and delayed the master clock that is output to the system by adding two identity gates in its path (the signal OR’ed with itself). This immediately fixed my issue with PPC (and with interrupt handling, for which a correct PPC value is crucial). The solution is not too elegant, but it works perfectly fine. Updated schematics with this fix are in downloads.
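As background for the clock relationship, the BUSCLK derivation described earlier can be modeled in a few lines. This is an idealized sketch: it assumes a 50% duty cycle master clock, assumes the 90-degree offset copy lags CLK, and ignores the gate delays that cause the actual problem.

```python
def clk(phase_deg):
    """Master clock: high during the first half of the cycle (assumed)."""
    return (phase_deg % 360) < 180


def busclk(phase_deg):
    """CLK OR'ed with a copy of CLK delayed by 90 degrees (ideal gates)."""
    return clk(phase_deg) or clk(phase_deg - 90)


# Sample BUSCLK at the start of each quarter of the master clock cycle.
quarters = [busclk(p) for p in (0, 90, 180, 270)]

# Fraction of the cycle during which BUSCLK is low.
low_fraction = sum(not busclk(p) for p in range(360)) / 360
```

Under these assumptions BUSCLK is high for the first three quarters and low only in the last one, so its rising edge coincides with the rising edge of the master clock. That is why even a small extra delay on a derived signal such as LDPPC matters.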
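The race and the fix can be illustrated with a simple timing model. All numbers here are hypothetical (they are not measurements from my board), and the model assumes that the derived clocks are still generated from the undelayed clock while only the master clock sent to the system is delayed.

```python
GATE_DELAY_NS = 10   # hypothetical propagation delay of one gate
PC_TO_ABUS_NS = 5    # hypothetical delay from PC increment to ABUS change


def ppc_latched_value(old_pc, master_delay_gates, ldppc_delay_gates=2):
    """Return the value PPC latches at an instruction boundary.

    Times are measured from the ideal (undelayed) clock edge.
    """
    # The master clock edge increments PC; ABUS changes shortly afterwards.
    t_abus_changes = master_delay_gates * GATE_DELAY_NS + PC_TO_ABUS_NS
    # LDPPC is derived from the bus clock through extra gates.
    t_ldppc_edge = ldppc_delay_gates * GATE_DELAY_NS
    if t_ldppc_edge >= t_abus_changes:
        return old_pc + 1   # too late: the new PC is already on ABUS
    return old_pc           # in time: the previous PC is latched


before_fix = ppc_latched_value(0x0100, master_delay_gates=0)
after_fix = ppc_latched_value(0x0100, master_delay_gates=2)
```

With these numbers, `before_fix` is the already-incremented PC, while `after_fix` is the previous PC as intended. The real circuit was marginal rather than deterministic, which would explain why PPC was only sometimes wrong.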