I am slowly progressing on my simplistic OS/Monitor development. It is only a little fun, mainly because I do not have a functional simulator (i.e. emulator) of the computer and I have to test every bit of assembly code on real hardware. For that I have to constantly switch from Linux (where my development toolchain resides) to Windows (where my EPROM programmer software is), remove the ROM chip, burn it, plug it back into the memory card, and boot back to Linux to run the VT100 terminal emulator before I can finally give it a try. Because I am mainly dealing with the RS-232 driver for the 16550 chip and other low-level kernel routines, the success ratio is rather low. I really need to think about an emulator, otherwise the development pace will be too slow, and I am afraid I may get discouraged and the project may lose momentum. It is difficult to imagine at this point how to do more serious programming (e.g. C compiler retargeting, Minix porting) without a decent emulator.
The OS is still too immature to share. Nevertheless, I have some progress to report. As a useful side effect of my recent work, I managed to enhance the assembler to make it more usable. It is the same assembler built using flex and yacc, but I have added support for constants, expressions, multiple memory segments, and some more. The assembler still generates absolute binary objects (flat binaries with fixed memory references) in Intel HEX format, but at this point it is sufficient for my needs, and rather convenient (because this way I need no linker and can burn the binaries directly to the ROM).
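For readers unfamiliar with Intel HEX, here is a minimal Python sketch of how a flat binary can be written out as data records followed by an end-of-file record. It is not the assembler's actual emitter; the function names `ihex_record` and `ihex_dump` and the 16-byte chunk size are assumptions made for illustration.

```python
def ihex_record(address: int, data: bytes, rectype: int = 0x00) -> str:
    """Format one Intel HEX record as :LLAAAATT<data>CC (at most 255 data bytes)."""
    body = bytes([len(data), (address >> 8) & 0xFF, address & 0xFF, rectype]) + data
    checksum = (-sum(body)) & 0xFF  # two's complement of the sum of all record bytes
    return ":" + (body + bytes([checksum])).hex().upper()


def ihex_dump(origin: int, image: bytes, chunk: int = 16) -> list:
    """Split a flat image loaded at `origin` into data records plus an EOF record."""
    lines = [
        ihex_record(origin + i, image[i:i + chunk])
        for i in range(0, len(image), chunk)
    ]
    lines.append(ihex_record(0, b"", 0x01))  # record type 01 marks end of file
    return lines


if __name__ == "__main__":
    # Hypothetical 3-byte image placed at address 0x0030.
    for line in ihex_dump(0x0030, bytes([0x02, 0x33, 0x7A])):
        print(line)
```

Because every record carries its own absolute address, a flat binary like this can go straight to the EPROM programmer with no linking step.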
Also, I updated the instruction set and the corresponding microcode by adding a few instructions which I found particularly handy for current development. I have already used up all 256 opcodes, so from now on any new instruction will have to replace one of the less useful ones. I initially designed the instruction set quite expansively, assuming orthogonality for most load and store instructions, so this should not be a big problem. I am assuming that porting a C compiler will result in a big ISA rework anyway.
Check the downloads page for updated software and microcode. Below is a “what’s new” report.
The assembler currently supports the following items (labels, expressions, literal types and directives):
|label||A label is an alphanumeric string used to define references to locations in the code and data segments. Labels that precede instruction mnemonics are defined with a trailing colon (:). They are composed of letters and digits (the first character of the label must be a letter). Labels are case sensitive.
To reference a label, use its name without a colon, e.g.:
|expression||Expressions are used to build any value and were added for programmer’s convenience. The following are valid expressions:
literal (any of the literals presented in the table below)
expression + expression
expression - expression
expression * expression
|<label> equ <expression>||Defines a constant. Constants may be used in expressions. Constant names are case sensitive.
|db <expression>[,<expression>]*||Dumps a series of 8-bit values to the current memory segment. Each expression value is cast to 8 bits. String literals may be used with the ‘db’ directive. In that case the string literal is emitted directly. The ‘db’ directive is usually labeled for easy referencing. In that case the label is written without the colon.
|dw <expression>[,<expression>]*||Equivalent to ‘db’, but values are cast to 16 bits. String literals are illegal with ‘dw’.
|8-bit hex literal||Hex literals are prefixed with ‘0x’. Literals are used in expressions (literal itself is an expression).
|16-bit hex literal||Same as above but word size.
|8-bit binary literal||Binary literals are prefixed with ‘0b’.
|16-bit binary literal||Same as above but word size.
|decimal literal||Decimal literals may be used in expressions, too. Their value is cast to 8 or 16 bits depending on the instruction’s argument size. Decimal casting is signed.
|8-bit char literal||An 8-bit char literal is a string literal one byte in length. It may be used in expressions as 8-bit (ASCII) value or 16-bit value (it is then cast to a word).
|string literal||Used to define a string. It may be used only in the ‘db’ directive, and not in expressions. Strings are enclosed in apostrophes (use a double apostrophe to include an apostrophe in a string):
|.code||Switches code emitter to code segment. The assembler generates output for both code and data segment.|
|.data||Switches to data segment.|
|.org <16-bit hex literal>||Defines the segment’s entry (load) address. It may be used only once per segment and takes a 16-bit hex literal as its argument (an expression is not allowed).|
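The casting rules for ‘db’ and ‘dw’ described in the table can be modeled with a few lines of Python. This is only a sketch under stated assumptions, not the assembler's code: the helper names (`cast`, `unquote`, `emit_db`, `emit_dw`) are hypothetical, "signed" casting is taken to mean two's-complement wrapping, and the high-byte-first order for ‘dw’ is an assumption borrowed from how the SYSCALL microcode reads 16-bit addresses.

```python
def cast(value: int, bits: int) -> int:
    """Truncate to `bits` bits; negative values wrap to two's complement."""
    return value & ((1 << bits) - 1)


def unquote(literal: str) -> str:
    """Strip the enclosing apostrophes and collapse doubled apostrophes."""
    if len(literal) < 2 or literal[0] != "'" or literal[-1] != "'":
        raise ValueError("string literal must be enclosed in apostrophes")
    return literal[1:-1].replace("''", "'")


def emit_db(items) -> bytes:
    """Emit 8-bit values; str items model string literals and are emitted directly."""
    out = bytearray()
    for item in items:
        if isinstance(item, str):
            out += item.encode("ascii")
        else:
            out.append(cast(item, 8))
    return bytes(out)


def emit_dw(items) -> bytes:
    """Emit 16-bit values; string literals are rejected, as with 'dw'."""
    out = bytearray()
    for item in items:
        if isinstance(item, str):
            raise ValueError("string literals are illegal with 'dw'")
        word = cast(item, 16)
        out += bytes([word >> 8, word & 0xFF])  # ASSUMPTION: high byte first
    return bytes(out)


if __name__ == "__main__":
    print(emit_db([0x41, -1, unquote("'it''s'")]).hex())
    print(emit_dw([0x1234, -2]).hex())
```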
The assembler’s current grammar (autogenerated by yacc) may be found here.
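To illustrate how the expression forms from the table (literals combined with +, - and *) can be evaluated, here is a small Python sketch. It is not derived from the yacc grammar: the usual precedence of * over + and - is an assumption, parentheses are left out because the table does not list them, and the `symbols` dictionary is a hypothetical stand-in for constants defined with ‘equ’.

```python
import re

# Token forms taken from the table: 0x hex, 0b binary, decimal, names, operators.
TOKEN = re.compile(r"\s*(0x[0-9A-Fa-f]+|0b[01]+|\d+|[A-Za-z][A-Za-z0-9]*|[-+*])")


def tokenize(text):
    text = text.rstrip()
    tokens, pos = [], 0
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected input near column {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


def evaluate(text, symbols=None):
    """Evaluate an expression; `symbols` maps constant names to values."""
    symbols = symbols or {}
    tokens = tokenize(text)
    pos = 0

    def operand():
        nonlocal pos
        if pos >= len(tokens):
            raise ValueError("operand expected")
        token = tokens[pos]
        pos += 1
        if token.startswith("0x"):
            return int(token, 16)
        if token.startswith("0b"):
            return int(token, 2)
        if token.isdigit():
            return int(token, 10)
        if token in symbols:
            return symbols[token]
        raise ValueError(f"undefined symbol or misplaced operator: {token}")

    def term():
        # A term is a run of operands joined by '*' (assumed to bind tighter).
        nonlocal pos
        value = operand()
        while pos < len(tokens) and tokens[pos] == "*":
            pos += 1
            value *= operand()
        return value

    total = term()
    while pos < len(tokens):
        op = tokens[pos]
        pos += 1
        if op not in ("+", "-"):
            raise ValueError(f"operator expected, got {op}")
        rhs = term()
        total = total + rhs if op == "+" else total - rhs
    return total


if __name__ == "__main__":
    print(hex(evaluate("BASE + 2 * 0x10", {"BASE": 0x8000})))
```

The result is left uncast here; casting to 8 or 16 bits would happen where the value is used, as the table describes.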
Most of the changes are new instructions added for convenience, and they are self-explanatory. The only thing really worth documenting here is the new behavior of the SYSCALL instruction, used to call OS kernel routines. In the previous microcode version, SYSCALL took no argument, and the code of the kernel function to be invoked had to be passed in the A register. The ISR for SYSCALL would then map the code in A to the proper function address and branch to it. I decided to move this branching to microcode, in the hope that it will ultimately make calling kernel routines faster and free the A register for kernel function arguments. Now SYSCALL takes an 8-bit argument (meaning that at most 256 kernel routines can be exported to user-mode programs). In the previous version, SYSCALL would read the ISR address from the interrupt vector (IVEC+0x1f) and branch to the ISR. Now it still takes the address from the interrupt vector, but instead of branching to it, it treats this address as the base address of a map of kernel routine addresses (of course, this map must be constructed by the kernel, along with the interrupt vector). By adding the 8-bit function code (multiplied by two) to the base, the microcode locates the map entry, reads the kernel routine’s address from it, and branches there. This way, there is no need for any ISR for SYSCALL whatsoever; the ISR is in fact implemented in microcode. Here is the current microcode source for SYSCALL:
// SYSCALL #i8
0, *, MDR <- MDR ^ MDR
1, *, LO(MDR) <- MEM(PC); CODE // read function code
2, *, PC <- MDR + MDR // multiply by 2 to get map offset and store in PC
3, *, MDR <- MSW // back up MSW before switching to supervisor (to store original CPU mode)
4, *, MAR <- SP // back up SP
5, *, MDR <- -1 + 1; LATCH_S // enable supervisor mode (from this point on SP denotes KSP), MDR not latched
6, *, SP-- // this is KSP
7, *, MEM(SP) <- LO(MDR); DATA; SP-- // store MSW
8, *, MDR <- MAR // store SP
9, *, MEM(SP) <- LO(MDR); DATA; SP--
10, *, MEM(SP) <- HI(MDR); DATA; SP--
11, *, MAR <- IPTR // retrieve base address of syscall functions map from interrupt vector (0x1f)
12, *, HI(MDR) <- MEM(MAR); DATA; MAR++
13, *, LO(MDR) <- MEM(MAR); DATA
14, *, MAR <- MDR + PC // add previously computed offset to map base address, store in MAR
15, *, HI(MDR) <- MEM(MAR); DATA; MAR++ // retrieve function address
16, *, LO(MDR) <- MEM(MAR); DATA
17, *, PC <- MDR
18, *, MEM(SP) <- LO(A); DATA; SP-- // store A
19, *, MEM(SP) <- HI(A); DATA; SP--
20, *, MEM(SP) <- LO(X); DATA; SP-- // store X
21, *, MEM(SP) <- HI(X); DATA; SP--
22, *, MEM(SP) <- LO(Y); DATA; SP-- // store Y
23, *, MEM(SP) <- HI(Y); DATA; SP--
24, *, MDR <- DP // store DP
25, *, MEM(SP) <- LO(MDR); DATA; SP--
26, *, MEM(SP) <- HI(MDR); DATA; SP--
27, *, MDR <- PPC // store PC (the next instruction's starting address)
28, *, MDR <- MDR + 1
29, *, MEM(SP) <- LO(MDR + 1); DATA; SP--
30, *, MEM(SP) <- HI(MDR + 1); DATA
31, *, fetch // fetch at PC
It is 32 cycles in total (the maximum my microcode can take), but compared to the number of cycles the ISR would cost, that’s a profitable change. And the A register is freed for the OS programmer’s use.
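The address lookup done by microcode steps 11 to 17 can be modeled in a few lines of Python, which may also be a starting point for an emulator. This is a simplified sketch: it ignores the register saving and the mode switch, `vector_slot` is passed in as a parameter rather than computed from IVEC, and all addresses in the usage part are made up for illustration. The high-byte-first read order follows the HI/LO order in the microcode.

```python
def read16(mem, addr):
    """Read a 16-bit value stored high byte first, as the microcode does."""
    return (mem[addr & 0xFFFF] << 8) | mem[(addr + 1) & 0xFFFF]


def syscall_target(mem, vector_slot, code):
    """Return the kernel routine address that SYSCALL #code would branch to."""
    base = read16(mem, vector_slot)         # base of the kernel routine map
    offset = (code & 0xFF) * 2              # each map entry is two bytes
    return read16(mem, (base + offset) & 0xFFFF)


if __name__ == "__main__":
    mem = bytearray(0x10000)
    slot = 0x0100                            # hypothetical vector slot address
    mem[slot], mem[slot + 1] = 0x20, 0x00    # map base at 0x2000 (made up)
    mem[0x2006], mem[0x2007] = 0x34, 0x56    # entry 3 points to 0x3456 (made up)
    print(hex(syscall_target(mem, slot, 3)))
```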
I think what I will do next is go back to hardware for some time. I want to add another device card with an RTC and an IDE controller, and build a clock slowdown circuit for accessing slow devices. Stay tuned.