by layer8 on 2/8/26, 12:06 AM
If this implementation had existed in the 1980s, the C standard would have a rule that different tokens hashing to the same 16-bit value invoke undefined behavior, and optimizing compilers in the 2000s would simply optimize such tokens away to a no-op. ;)
by mati365 on 2/7/26, 8:56 PM
Oh, it looks like my X86-16 boot sector C compiler that I made recently [1]. Writing boot sector games has a nostalgic magic to it, when programming was actually fun and showed off your skills. It's a shame that the AI era has terribly devalued these projects.
[1] https://github.com/Mati365/ts-c-compiler
by xorvoid on 2/7/26, 8:12 PM
I may be the author.. enjoy! It was an absolute blast making this!
by riedel on 2/7/26, 8:00 PM
by wzbtoolbox on 2/8/26, 10:33 AM
This is the kind of project that reminds you how far removed modern development is from the actual machine. We pile abstractions on abstractions until "Hello World" needs 200MB of node_modules, and then someone fits a C compiler in 512 bytes.
Not saying we should all write boot sector code, but reading through projects like this is genuinely humbling. Great educational resource too.
by mojuba on 2/7/26, 8:56 PM
Compare that to the C compiler in 100,000 lines written by Claude in two weeks for $20,000 (I think it was posted on HN just yesterday).
by sanufar on 2/7/26, 8:09 PM
The way hashing is used for tokens and for making a pseudo symbol table is such an elegant idea.
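A minimal sketch of that idea, under stated assumptions: `tok_hash`, `slots`, `var_store`, and `var_load` are hypothetical names, not taken from the project, and the one-slot-per-hash-value layout is only one way the hash could double as a symbol table. No identifier strings are stored; the 16-bit hash is the table index.

```c
#include <stdint.h>

/* Assumption for illustration: one 16-bit slot per possible hash value,
   so a variable's "address" is simply its hash. */
static uint16_t slots[65536];

/* Hypothetical atoi-style hash: every byte up to a space or the end of
   the string updates a 16-bit accumulator that wraps modulo 65536. */
static uint16_t tok_hash(const char *s)
{
    uint16_t h = 0;
    while (*s != '\0' && *s != ' ') {
        h = (uint16_t)(h * 10u + (uint16_t)((unsigned char)*s - '0'));
        s++;
    }
    return h;
}

/* Store a value in the slot selected by the identifier's hash. */
static void var_store(const char *name, uint16_t value)
{
    slots[tok_hash(name)] = value;
}

/* Load the value from the slot selected by the identifier's hash. */
static uint16_t var_load(const char *name)
{
    return slots[tok_hash(name)];
}
```

The trade-off in this sketch is that two names with the same hash silently share one slot, since nothing detects collisions.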
by shikaan on 2/8/26, 9:08 AM
Such a great read! Reminds me of the bootsector OS I made some time ago [1].
Maybe it's time to equip it with a C compiler...
[1]: https://github.com/shikaan/osle
by fooker on 2/8/26, 2:34 AM
by kreelman on 2/8/26, 5:17 AM
There seems to be a good amount of interest in a boot sector compiler!!
If you're running on Linux, adjust the qemu call to use alsa rather than coreaudio.
I opened a pull request for this on GitHub. If the author is happy enough with my verbose shell scripting style :-) it might get included.
by zahlman on 2/8/26, 9:43 AM
> Big Insight #2 is that atoi() behaves as a (bad) hash function on ordinary text. It consumes characters and updates a 16-bit integer.
I could have sworn I remembered atoi() being defined to return 0 for invalid input (i.e. text not representing an integer in base ten).
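For what it's worth, the quote and the standard behavior can both hold if the loop in question is atoi-like but skips the digit check. A minimal sketch, assuming that is what is meant (`atoi_hash` is a hypothetical name, not taken from the project's source):

```c
#include <stdint.h>

/* Hypothetical helper: the atoi update rule with no validation, so every
   byte up to the next space or the end of the string feeds the 16-bit
   accumulator. Standard atoi() would stop at the first non-digit. */
static uint16_t atoi_hash(const char *s)
{
    uint16_t h = 0;
    while (*s != '\0' && *s != ' ') {
        /* h = 10*h + (c - '0'), wrapping modulo 65536 */
        h = (uint16_t)(h * 10u + (uint16_t)((unsigned char)*s - '0'));
        s++;
    }
    return h;
}
```

With this sketch, `atoi_hash("123")` is still 123, while `atoi_hash("int")` comes out as 6388 where the standard `atoi("int")` returns 0. It is a bad hash in the sense that collisions are easy to find: `"a0"` and `"490"` both give 490.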
by drob518 on 2/8/26, 8:04 PM
Brilliant! I love the stealing of Forth ideas to power this. Forth’s minimalism is highly underrated.
by alittlebee on 2/8/26, 7:33 PM
This is really beautiful (I feel like this sort of project is outsider art), thank you for sharing.
by hgs3 on 2/8/26, 6:17 PM
Great read. It would be neat to see a mini operating system in under 1 KB of code.
by userbinator on 2/8/26, 4:08 AM
A C subset, to be precise; microcomputer C compilers that could actually compile real C were in the tens of KB range.
by DeathArrow on 2/8/26, 7:46 AM
For me it's not interesting because it fits in 512 bytes; it's interesting because it's very simple. I think it would be a great introduction to learning about compilers.
by SeanSullivan86 on 2/7/26, 9:22 PM
Why is it called a C Compiler if it's a subset of C?
by wbsun on 2/8/26, 5:28 AM
Nice, now you can dd it to your boot sector and ... Wait, it is 2026, there are 1000 ways of booting and memory mapping on so-called unified ARM architecture @,@
by NooneAtAll3 on 2/7/26, 8:47 PM
> I wrote a fairly straight-forward and minimalist lexer and it took >150 lines of C code
was it supposed to be "<150"?
by EGreg on 2/7/26, 11:34 PM
by gonzus on 2/7/26, 9:40 PM
Lacking support for structs, I think this is too minimalistic to be called "a C compiler".
by kayo_20211030 on 2/7/26, 10:02 PM
Nice. Very K&R-ish. Not a bad thing.