I have maintained an interest in programming on CP/M long after it ceased to be actually useful, and I still have my original copy of the Microsoft FORTRAN compiler that I bought way back when.
So when I discovered that there is a modern ‘C’ compiler in the Debian repositories that could produce code that targeted the Z80 processor I felt I had to try it out!
The compiler in question is the Small Device C Compiler (‘sdcc’). Using it is a bit different because its main purpose is to produce code for embedded systems that don’t have an operating system. However, unlike native Z80 C compilers such as ‘Small-C’, ‘BDS C’ or ‘HiTec-C’, whose language syntax can take a bit of getting used to, it is a modern optimising ‘C’ compiler that is capable of producing really rather compact code.
I’m getting a bit ahead of myself here, but one of the things I really liked about this compiler is that you can compile programs using both ‘sdcc’ and ‘gcc’ without having to define separate platform specific sections in the code. Most of the time the differences between the two are not really noticeable, but you do need to remember that there are going to be some. For example, integers are only 16-bit values unless you define them as long, and this includes any intermediate results, which can lead to some unexpected behaviour.
Since I’d not used ‘sdcc’ before the first thing I tried was to compile the ubiquitous ‘hello world’ program.
    #include <stdio.h>

    void main()
    {
       int i_count;
       for (i_count = 0; i_count <= 9; i_count++)
       {
          puts("Hello World !!");
       }
    }

I mean how hard can that be?

    $ sdcc -mz80 sdc-hello.c
    ?ASlink-Warning-Undefined Global '_putchar' referenced by module 'puts'
    $

Well that was not really what I expected… Time to do a bit more research (which is probably what you are doing reading this)!

It turns out that since embedded systems generally don’t have an operating system, the libraries provided with the compiler don’t support any. In fact, as the user you are expected to write all the platform specific code yourself, and since without an operating system you don’t have a file system this also means that you only get ‘puts()’ and ‘printf()’, and even these routines depend on the user providing their own ‘putchar()’ function in a custom C runtime!
I will admit that discovering that there was no builtin support for CP/M (or MSX) was a bit of a disappointment but it seems that there are ways around this.
Normally producing the C runtime would mean writing some assembler code to interact with whatever serial devices were available on the embedded platform you were using but since we are going to be cross compiling for a CP/M system we can make use of the BDOS and BIOS functions. Also since the code produced by ‘sdcc’ is also Z80 specific I have added in a little bit of code to check that the program is running on a Z80 processor as CP/M could run on both the 8080 and Z80.
The C runtime also needs to set the program address, and I found that for all but the most trivial programs to work it needed to set aside some storage for the stack. After a bit of experimentation I came up with the following.

                    .module crt0
                    .globl  _main
    ;
    BIOS$CON_Input  .equ    0x09              ; Console input.
    BIOS$CON_Output .equ    0x0C              ; Console output.
    ;
    BDOS$Print_Str  .equ    0x09              ; Print a string to CON:.
    ;
    CPM$Base        .equ    0x0000            ; Base of CP/M OS.
    CPM$BDOS        .equ    0x0005
    CPM$Transfer    .equ    0x0100            ; Program load address.
    ;
                    .area   _HEADER (ABS)
                    .org    CPM$Transfer
    ;
                    jp      Start             ; Overwriting the initial code with successive pointers
                                              ; to each command line parameter allows up to 34 arguments
                                              ; to be passed to the main program.
    Err_msg:        .str    "Z80 processor required!"
                    .db     00,13,10,'$'
    ;
    Start:          ld      a,#0x7f           ; Load with largest positive signed value.
                    inc     a                 ; Incrementing should result in an overflow.
                    jp      pe,Init           ; Z80 processor sets the parity flag to signify overflow (8080 doesn't).
                    ld      de,#Err_msg       ; Display error message.
                    ld      c,#BDOS$Print_Str ; Print string.
                    jp      CPM$BDOS          ; Jump to BDOS (when BDOS returns program will exit). Note - Can't use
                                              ; puts() as that routine itself uses Z80 opcodes...
    Init:           ld      (Stack),sp        ; Save the stack pointer.
                    ld      sp,#Stack
                    call    _main             ; Call main().
                    ld      sp,(Stack)        ; Restore original stack pointer.
                    ret
    ;
    BIOS:           ld      hl,(CPM$Base+1)   ; Get address of warmstart routine and use hi byte as BIOS address.
                    ld      l,c               ; Put the code (actually the offset into the jump table) into the LSB of the address.
                    ld      c,e               ; Put byte or word arguments into BC where the BIOS expects them.
                    ld      b,d
                    jp      (hl)              ; Jump directly to BIOS function...
    ;
    _getchar::      ld      c,#BIOS$CON_Input ; Get character using a direct BIOS call.
                    call    BIOS
                    ld      e,a               ; Return value in DE (it is an int not a char).
                    xor     a
                    ld      d,a
                    ret
    ;
    _putchar::      ex      de,hl
                    ld      a,e
                    sub     a,#0x0a
                    or      a,d
                    jr      nz,NoLF
                    push    de
                    ld      de,#0x000d        ; Print a carriage return before any linefeed.
                    ld      c,#BIOS$CON_Output
                    call    BIOS
                    pop     de
    NoLF:           ld      c,#BIOS$CON_Output
                    call    BIOS
                    ld      de,#0x0000        ; Return 0.
                    ret
    ;
                    .area   _CODE
                    .area   _DATA
                    .ds     256
    Stack:          .dw     0
    ;

This code defines some constants and then sets the program origin to the address at which CP/M programs are loaded, before jumping to the start of the code that checks if the processor is a Z80. The check takes advantage of the fact that the Z80 sets the parity flag when an overflow occurs, whereas on the 8080 the flag only ever reflects the parity of the result. If the processor doesn’t behave like a Z80 the program simply prints an error message using a CP/M BDOS routine. Otherwise we save the stack pointer and set it to the top of the area in memory that we have set aside for the stack, before calling ‘_main’, which is the main() function defined in our C program.
To compile our program we first need to use the assembler to assemble the C runtime into a relocatable object file.
    $ sdasz80 -o sdc-crt0.s
    $

This only needs to be done again if the runtime is updated. Note that the relocatable format used is NOT the same as that used by Microsoft MACRO-80, FORTRAN-80, etc.
We can now compile and link our program using ‘sdcc’. However, unlike ‘gcc’ the compiler does not produce an executable program but instead produces an Intel Hex file suitable for uploading to a development board, so an additional step is required to turn this into a binary file that can be executed by CP/M.
    $ sdcc -mz80 --no-std-crt0 --data-loc 0 sdc-crt0.rel sdc-hello.c
    $

| Option | Description |
|---|---|
| -mz80 | Tells the compiler to produce code for the Z80 processor. |
| --no-std-crt0 | Do not use the standard ‘C’ runtime. |
| --data-loc 0 | Stores any data immediately after the code (otherwise it is stored starting at address 0x8000, which makes the binary files very large). |
Converting the hex file is analogous to the way that the original CP/M ‘LOAD’ command was used to process the hex files produced by the CP/M assembler and turn them into a binary file that could be executed. Unfortunately the original ‘LOAD’ command can’t handle the hex files produced by the compiler, as they are not contiguous and can have out of order records, so we need to use the following command instead.

    $ objcopy sdc-hello.ihx -Iihex -Obinary sdc-hello.com
    $

We can now transfer this binary file to our CP/M system and run it. Note that since CP/M file names can only be 8 characters long it will need to be renamed.

    A>hello
    Hello World !!
    Hello World !!
    Hello World !!
    Hello World !!
    Hello World !!
    Hello World !!
    Hello World !!
    Hello World !!
    Hello World !!
    Hello World !!
    A>

Feeling rather pleased with myself, I decided it was time to try something a bit more ambitious!

    #include <stdio.h>

    int main(int argc, char* argv[])
    {
       for (int i_count = 0; i_count < argc; i_count++)
       {
          printf("'%s'\n",argv[i_count]);
       }
       return 0;
    }
I compiled it and copied the binary executable file to my CP/M system as before.
    $ sdcc -mz80 --no-std-crt0 --data-loc 0 sdc-crt0.rel sdc-args.c
    $ objcopy sdc-args.ihx -Iihex -Obinary sdc-args.com
    $

However, although it compiled without any errors, when I ran it the result was a little disappointing.

    A>args one two three
    A>

What is wrong this time?

The problem in this case is that C expects the command line arguments to be passed to ‘main()’ as parameters, and as the custom C runtime we just wrote doesn’t do that the program doesn’t print anything. To pass the command line arguments to ‘main()’ we need to update the C runtime to do this ourselves. This means getting to grips with the way ‘sdcc’ passes parameters to functions, and actually writing some Z80 assembler to parse the command line.
The tables below show how ‘sdcc’ passes the first two parameters to a function and how values are returned. Any additional parameters are passed on the stack.

In (parameters):

| Data Type | Registers |
|---|---|
| char | A_reg |
| char, char | A_reg, L_reg |
| char, int | A_reg, DE_reg |
| char, long | A_reg, stack |
| int | HL_reg |
| int, char | HL_reg, stack? |
| int, int | HL_reg, DE_reg |
| int, long | HL_reg, stack |
| long | HL_reg + DE_reg |
| long, char | HL_reg + DE_reg, stack |
| long, int | HL_reg + DE_reg, stack |
| long, long | HL_reg + DE_reg, stack |

Out (return values):

| Data Type | Registers |
|---|---|
| char | A_reg |
| int | DE_reg |
Note that this parameter passing convention ONLY applies to ‘sdcc’ version 4.1.12 and later! I think that earlier versions pass all arguments on the stack.
Both arguments are integers because, although ‘argv[]’ is an array, it is passed by reference. From this table you can hopefully work out that the value of ‘argc’ needs to be in the HL register and a pointer to ‘argv[]’ needs to be passed in the DE register.
This means that the C runtime code will have to scan the CP/M command line, and then create an array of pointers to each argument. Since the CP/M command line is stored in memory at 0x0080, my plan was simply to scan this buffer saving the address of each argument in an array, and inserting a string termination character (null) where the spaces were between each argument in the buffer.
The only question was where to store the array of pointers, as I didn’t want to use up any more memory than I absolutely had to. I knew I wasn’t going to need a lot of space as I didn’t think I’d ever have more than about 24 separate arguments which would only need 48 bytes. At this point I noticed that there were 27 bytes (more than half the space I needed) just sitting there in the "Z80 processor required." error message text which I wasn’t going to need again, and if I simply overwrote the initial bit of code that checked the CPU type and that error message I’d have plenty of space to store all the pointers I needed. (In fact I ended up with enough space to store the pointers for 34 arguments).
I also cheated a bit by overwriting the character count in location 0x0080 with a null character and using this as ‘argv[0]’, as although on a UNIX system this would normally contain the program name, on CP/M we have no way to retrieve this. This is ANSI compliant, as it is legitimate to leave ‘argv[0]’ blank if the program name is not available from the host environment.
    ;
    ;-- crt0.s
    ;
    ;   Modified runtime for generic CP/M.
    ;
                    .module crt0
                    .globl  _main
    ;
    BIOS$CON_Input  .equ    0x09              ; Console input.
    BIOS$CON_Output .equ    0x0C              ; Console output.
    ;
    BDOS$Print_Str  .equ    0x09              ; Print a string to CON:.
    ;
    CPM$Base        .equ    0x0000            ; Base of CP/M OS.
    CPM$BDOS        .equ    0x0005
    CPM$Buff        .equ    0x0080            ; Start of CP/M file buffer 80..FFH.
    CPM$Transfer    .equ    0x0100            ; Program load address.
    ;
    Max_Args        .equ    35
    ;
                    .area   _HEADER (ABS)
                    .org    CPM$Transfer
    ;
                    jp      Start             ; Overwriting the initial code with successive pointers
                                              ; to each command line parameter allows up to 34 arguments
    Err_msg:        .str    "Z80 processor required."
                    .db     00,13,10,'$'
    ;
    Start:          ld      a,#0x7f           ; Load with largest positive signed value.
                    inc     a                 ; Incrementing should result in an overflow.
                    jp      pe,Parse          ; Z80 processor sets the parity flag to signify overflow (8080 doesn't).
                    ld      de,#Err_msg       ; Display error message.
                    ld      c,#BDOS$Print_Str ; Print string.
                    jp      BDOS              ; Jump to BDOS (when BDOS returns program will exit).
    ;
    Parse:          ld      hl,#CPM$Buff      ; Pointer to command line.
                    ld      bc,(#CPM$Buff)    ; Length of command line.
                    xor     a
                    ld      b,a
                    add     hl,bc
                    inc     hl                ; Offset to end of command line.
                    ld      (hl),#0x00        ; Terminate command line with a null.
                    ld      bc,#0             ; Number of arguments.
                    ld      hl,#0x80          ; Pointer to command line.
                    ld      de,#CPM$Transfer  ; Pointer to ARGV.
                    ld      (hl),a            ; Add delimiter.
                    jr      Save
    Next:           inc     hl                ; Move to next char in buffer
                    ld      a,(hl)
                    or      a                 ; Set flags
                    jr      z,Done
                    cp      #' '
                    jr      nz,Next
                    xor     a
                    ld      (hl),a            ; Terminate parameter.
    Loop:           inc     hl                ; Move to next char in buffer
                    ld      a,(hl)
                    or      a                 ; Set flags
                    jr      z,Done
                    cp      #' '
                    jr      z,Loop
    Save:           ex      de,hl             ; Store address of string in argv[].
                    ld      (hl),e
                    inc     hl                ; Increment pointer to argv[]
                    ld      (hl),d
                    inc     hl                ; Increment pointer to argv[]
                    ex      de,hl
                    inc     bc                ; Increment number of arguments.
                    ld      a,c
                    cp      #35
                    jr      z,Done            ; Stop parsing after 34 arguments!
                    jr      Next
    Done:           ld      (Stack),sp        ; Save the stack pointer.
                    ld      sp,#Stack
                    xor     a
                    ld      h,a               ; Number of arguments.
                    ld      l,c
                    ld      de,#CPM$Transfer  ; Pointer to argv[].
                    call    _main             ; Call the C main routine
                    ld      sp,(Stack)        ; Restore original stack pointer
                    ret
    ;
    BIOS:           ld      hl,(CPM$Base+1)   ; Get address of warmstart routine and use hi byte as BIOS address.
                    ld      l,c               ; Put the code (actually the offset into the jump table) into the LSB of the address.
                    ld      c,e               ; Put byte or word arguments into BC where the BIOS expects them.
                    ld      b,d
                    jp      (hl)              ; Jump directly to BIOS function...
    ;
    BDOS:           call    #CPM$BDOS         ; Call BDOS
                    ex      de,hl             ; Return result in DE
                    ret
    ;
    _getchar::      ld      c,#BIOS$CON_Input ; Get character using a direct BIOS call.
                    call    BIOS
                    ld      e,a               ; Return value in DE (it is an int not a char).
                    xor     a
                    ld      d,a
                    ret
    ;
    _putchar::      ex      de,hl
                    ld      a,e
                    sub     a,#0x0a
                    or      a,d
                    jr      nz,NoLF
                    push    de
                    ld      de,#0x000d        ; Print a carriage return before any linefeed.
                    ld      c,#BIOS$CON_Output
                    call    BIOS
                    pop     de
    NoLF:           ld      c,#BIOS$CON_Output
                    call    BIOS
                    ld      de,#0x0000        ; Return 0.
                    ret
    ;
                    .area   _CODE
                    .area   _DATA
                    .ds     256
    Stack:          .dw     0
    ;
Would it work? Well obviously the code above does work, but I won’t pretend it was my first attempt! However, it isn’t quite finished yet, as it does need a little more work to allow the use of global variables.
    A>args one two three /4 /5 /6
    ''
    'one'
    'two'
    'three'
    '/4'
    '/5'
    '/6'
    A>

So we have a modern C compiler that can cross-compile programs for a CP/M system but unfortunately lacks the ability to access the file system, which is not ideal. However, there are third party libraries available, and even without them it does allow me to write CP/M programs in a high level language using a modern IDE that would have taken me much longer to produce otherwise.

As a bit of a challenge, and because I found that someone had solved the same problem using assembler, I decided to see if I could display the calendar for a month using C. If I’d tried writing this program in assembler it would have taken quite some time, even though I do already have a library of functions to do integer arithmetic and print integers. However, using ‘C’ it only took about 40 minutes.

    /*
     * sdc-calendar.c
     *
     * Displays the calendar for a month.
     *
     */

    #include <stdio.h>
    #include <stdlib.h>

    int i_isLeapYear(int i_year)
    /*
     * Returns true if the year is a leap year for years > 1752.
     *
     */
    {
       return(i_year % 4 == 0 && i_year % 100 != 0 || i_year % 400 == 0);
    }

    int i_weekday(int i_day, int i_month, int i_year)
    /*
     * Returns the day of the week (Sun = 0, Mon = 1, etc) for a given date
     * for 1 <= i_month <= 12, i_year > 1752.
     *
     * https://en.wikipedia.org/wiki/Determination_of_the_day_of_the_week#Sakamoto's_methods
     *
     */
    {
       int i_lookup[] = {0, 3, 2, 5, 0, 3, 5, 1, 4, 6, 2, 4}; /* Don't use static (sdcc bug?) */
       if (i_month < 3 )
       {
          i_year -= 1;
       }
       return (i_year + i_year / 4 - i_year / 100 + i_year / 400 + i_lookup[i_month - 1] + i_day) % 7;
    }

    void v_print_calendar(int i_month, int i_year)
    {
       const char* s_month[] = {
          " January", " February", " March", " April", " May", " June",
          " July", " August", " September", " October", " November", " December" };

       int i_length[] = {31, 28 + i_isLeapYear(i_year), 31, 30, 31, 30, 31, 31, 30, 31, 30, 31};

       /* Doesn't have to be declared before any statements !! */
       int i_start = i_weekday(1, i_month, i_year);

       printf("%s %4d\n",s_month[i_month - 1], i_year); /* Print heading (month and year) and days of the week. */
       printf("Su Mo Tu We Th Fr Sa\n");

       for (int i_count = 0; i_count < i_start; i_count++) /* Indent the first line if necessary */
       {
          printf("   "); /* Three spaces, the width of one day column. */
       }

       for (int i_day = 1; i_day <= i_length[i_month - 1]; i_day++)
       {
          printf("%2d ", i_day);
          if (++i_start > 6) /* End of week, print a newline. */
          {
             i_start = 0;
             printf("\n");
          }
       }
       if (i_start) /* print a new line after the last day unless it is the end of a week. */
       {
          printf("\n");
       }
    }

    int main(int argc, char* argv[])
    {
       if (argc == 3)
       {
          int i_month = atoi(argv[1]);
          int i_year = atoi(argv[2]);
          if (i_month > 0 && i_month < 13 && i_year > 1752)
             v_print_calendar(i_month, i_year);
          else
             printf("Year or month out of range.\n");
       }
       else
          printf("Usage: cal month year\n");
       return 0;
    }

I have to admit I was quite pleased with the result. The executable file is only 4.625K, so it is only just over twice the size of the hand crafted version, which I don’t think is too bad.

    A>cal 9 2023
     September 2023
    Su Mo Tu We Th Fr Sa
                    1  2
     3  4  5  6  7  8  9
    10 11 12 13 14 15 16
    17 18 19 20 21 22 23
    24 25 26 27 28 29 30
    A>

Looking for a bit more of a challenge I decided to see if I could simply compile the following program which displays a Julia set using ASCII symbols. I expected that this would give the Z80 processor a bit more of a workout, and I wasn’t wrong!

    /*
     * gcc-julia.c
     *
     * Displays an ASCII representation of a Julia set.
     *
     */

    #include <stdio.h> /* fprintf(), etc */

    int main()
    {
       const char* c_ascii[] = {" ", ".", ":", "-", "=", "+", "*", "#", "%", "$", "@"};
       const float xa = -1.5;  /* Left edge */
       const float xb = 1.5;   /* Right edge */
       const float ya = -1.0;  /* Top edge */
       const float yb = 1.0;   /* Bottom edge */
       const float xd = 0.025; /* X step size */
       const float yd = 0.05;  /* Y step size */
       const int m = 100;      /* Iterations */

       float zr, zi, temp;
       float r = 2.0;          /* Radius */
       float x, y;
       float cr = -.79;
       float ci = 0.15;
       int i;

       for (y = ya; y < yb; y = y + yd)
       {
          for (x = xa; x < xb; x = x + xd)
          {
             zr = x;
             zi = y;
             i = 0;
             while ((zr *zr + zi * zi < r*r) && i < m)
             {
                temp = zr*zr - zi*zi;
                zi = 2 * zr * zi + ci;
                zr = temp + cr;
                i++;
             }
             printf("%s", c_ascii[i/10]);
          }
          printf("%s\n", c_ascii[0]);
       }
       return 0;
    }
Note that I didn’t have to make any changes to the original code.
After compiling the program I transferred it to my CP/M system, and ran it as before.
All that floating point arithmetic certainly kept the processor busy but everything worked exactly as expected. It was just a bit slower than on a modern system! (On a real 4 MHz system this program will take about 21 minutes to complete.)
A>julia .:. =@=::#: @@=-@%@ ..::::#. :.+=--+@--%. =+-..#+@@::::@-....--@:.@..@@ .@*::**=@=::::..:=@@@*@@-:@-# .:-:::*@@@-:::::--@@%@**=+::%+-: -@+ @@-=@@@==*@**=*@:$#*@@%@@@@@@@*@@#:-. @-.@@@+:-. - .$-=+@$**@@====*@@@@====@@@-#=@@@%@:. .-@+:%@@:::.. @@:@. .@. @#:=*@$@@@@**+@%@@@@#-#*-::::=::-*@@#+. .*%@$#@$=+:%%@@=-- @-$-..@@@*:@....+ ..-+#@@@@@%@@@@@*$#@=:::@::::::::::-+%@.....%@@@@#*##@@=*@@@#. ..:@+::::*@%@=@#%*+::@.. ...:%@$@@@@%**%==+=*$::::::::::::::::--+@....@@$@#*@+@@:=::::+-. +:@= .::@@#-:%*@=*@@-=@@=*@@@.......@=@+@*@@*+@=====-::::::::::::@+=#---@=....=@=@@=-*+:::::::-@.=@@*#@. -+@*@@=+@@@*%::::::::@+*$@......@=*@+@@@@@@--@=@::::..:::-@@==++*--#@....::@@@@@::::::@=@=+.:@@:::-.%@ .:..+. .=#@@@@@+#@::::::::::::--+@-.......:*==*#@@@#@@::::......:-*=-=+$===@::::::::-#%#@+:...@--::::##@$.@@- $@-$=@+=@*...-+@@+@==%::::::::-#@@----@=......:::=%@@@@@@@%=:::......=@----@@#-::::::::@==@+@@+-...*@=+@=$-@@ -@@.$@##::::--@...:+@#%#-::::::::@===$+=-=*-:......::::@@#@@@#*==*:.......-@+--::::::::::::@#+@@@@@#=. .+..:. @%.-:::@@:.+=@=@::::::@@@@@::....@#--*++==@@-:::..::::@=@--@@@@@@+@*=@......@$*+@::::::::%*@@@+=@@*@+- .@#*@@=.@-:::::::+*-=@@=@=....=@---#=@@::::::::::::-=====@+*@@*@+@=@.......@@@*=@@=-@@*=@*%:-#@@::. =@:+ .-+::::=:@@+@*#@$@@....@+--::::::::::::::::$*=+==%**%@@@@$@@:... ..#::+*%#@=@%@*::::+@:.. .#@@@*=@@##*#@@@@%.....@%+-::::::::::@:::=@#$*@@@@@%@@@@@#+-.. +....@:**@@..-$-@ --=@@%%:+=$@#$@%*. .+#@@*-::=::::-*#-#@@@@%@+**@@@@$@*=:#@ .@. .@:@@ ..:::@@%:+@-. .:@%@@@=#-@@@====@@@@*====@@**$@+=-@. - .-:+@@$.-@ .-:#@@*@@@@@@@%@@*#$:@*=**@*==@@@=-@# +@- :-+%::+=**@%@@--:::::-@@@*:::-:. #-@:-@@*@@@=:..::::=@=**::*@. @@..@.:@--....-@::::@@+#..-+= .%--@+--=+.: .#::::.. @%@-=@@ :#::=@= .:. A>
My younger self would probably have thought it was well worth the wait!
The resulting executable is a little larger than before but is still only 6.25K which doesn’t seem unreasonable.
If you do have a fast system you can try adjusting the values for cr and ci to see the effect they have on the pattern, even very small changes can make quite a dramatic difference.
After that, what about some serious number crunching? The following program will print out the first 7660 digits of Pi. This is something I wouldn’t even have thought possible back when I was using CP/M, and it may not have been, as the algorithm is over a decade newer¹ than my CP/M machine was!
The only change I had to make to the original code was to reduce the value of N so the resulting program would fit in memory.
    /*
     * sdc-spigot.c
     *
     * A spigot algorithm for the digits of Pi, Stanley Rabinowitz and
     * Stan Wagon, Amer.Math.Monthly, March 1995, 195-203 with bug fixes by
     * C. Haenel.
     *
     * See https://stackoverflow.com/questions/4084571
     *
     */

    #include <stdio.h>

    #define N 7660                 /* Max 7660 if CCP starts at 0xDC00 */
    #define LEN (10L * N) / 3 + 1  /* Chain length. */

    unsigned j, predigit, nines, a[LEN];
    long x, q, k, len, i;

    void main()
    {
       for(j=N; j; )
       {
          q = 0;
          k = LEN+LEN-1;
          for(i=LEN; i; --i)
          {
             x = (j == N ? 20 : 10L*a[i-1]) + q*i;
             q = x / k;
             a[i-1] = (unsigned)(x-q*k);
             k -= 2;
          }
          k = x % 10;
          if (k==9)
             ++nines;
          else
          {
             if (j)
             {
                --j;
                printf("%ld", predigit+x/10);
             }
             for(; nines; --nines)
             {
                if (j) --j, printf(x >= 10 ? "0" : "9");
             }
             predigit = (unsigned)k;
          }
       }
       printf("\n");
    }

I have tested this on ‘simh’ and verified that the output is correct, but if you do decide to try this on a real Z80 running at 4 MHz you will need to be patient, as it will take approximately 248 hours to finish!

¹ I’m sure there were earlier implementations in FORTRAN 66 but I doubt I would have had the patience to wait 10 days for the program to finish.