This is a challenge to create a personal computing system which is:
- Simple. The source code must be a relatively small size, following the assumption that more code roughly equals more complexity.
- Useful. A human can plug in their display and keyboard and start using the system to do useful things.
- Complete. The system can be used to create and run new programs. It doesn't depend on other systems to improve it or make it more useful.
Apologies for the somewhat free-form nature of this post. Much of it comes directly from my personal notes, which weren't exactly written for public consumption. If there is interest I will get around to polishing up this content.
Entries
None yet!
If you believe you have a viable entry to this challenge, email me.
- Do NOT email me attempting to submit other people's projects. You should instead let them know to get in touch with me if they are so inclined.
- Do NOT email me your project without including at least a preliminary self-evaluation against the challenge's rubric. If you haven't yet counted the lines of code in your full system (including toolchain), please do so before emailing me. I will only spend my time doing a thorough evaluation if you spend some of your time proving your system has a shot. You don't have to be committed to your initial numbers.
Beware
A serious attempt at this challenge will steal months of your life away.
You will learn a lot and become a more skilled programmer, but it will cost you a significant amount of time and energy.
Challenge criteria
The following criteria must be met for the system to be considered compliant.
These criteria are chosen deliberately to exclude many existing systems or approaches. They are chosen such that if followed, I believe a truly unique, interesting, and powerful system will result. You can build whatever you want, but if you try to conform to these it will be easier to compare with other people's systems making this attempt.
Use less than 100,000 lines of any code total
- The count includes the build system which creates the image. The compiler, linker, interpreter, etc. all count!
- The count excludes fundamental utilities like file opening during build (i.e., functions provided by the operating system you initially built the system on).
- Bonus points however if you do not use the operating system (e.g., your entire system builds from a single file or could be practically hand-assembled).
- This count excludes authoring tools like text editors the system implementor used to create and edit the code. It similarly excludes tools the implementor uses to debug the system while it is being written.
- The count excludes anything related to getting the image onto the boot media (e.g. copying to USB, `dd`, etc.) so long as it is a generic operation.
- This count excludes anything assumed to be part of the machine's firmware, e.g. BIOS. If it comes with the hardware right out of the box, it does not count towards the lines of code.
- Eventually, this challenge should be extended to include the hardware, but we should have tried implementing Simple Useful Systems before we start trying to make Simple Useful Hardware. Once we have Simple Useful Hardware, it should become easier to make Simple Useful Systems.
- A line of code is a line of human-editable source code. Utilities like `cloc` can be used. Comments do not count. Documentation does not count either. Lines of code generated by other code during build do not count. Massively long and/or dense lines (e.g. regex style one symbol = one operation) count for more than one line (e.g., they then need to count per-token). Do not try to game this.
- How to tell if you are gaming it: if you have another representation you use to write the submitted representation, you are gaming it. Your other representation is the true source you are writing the system in. For example, you write the system in C, then submit disassembly after a C compiler processes the C. The true source is C and counts; the disassembly actually does not count because it was generated by the compiler. Remember, however, that the C compiler's source and the toolchain that built it also count!
- FPGAs blur the lines a bit between hardware and software. Your FPGA code counts, and the FPGA synthesis toolchain also counts towards the lines-of-code metric.
All of these exclusions of code should not be read as to minimize the benefit of eliminating these other things, e.g. not requiring any firmware on the machine to run. The less complexity there is anywhere in the entire stack, the better.
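As a rough illustration of the counting rule above (comments and blank lines do not count), here is a minimal sketch in C. The function name `count_code_lines` is hypothetical, it only understands C-style comments, and it does not attempt the per-token weighting for dense lines. A tool like `cloc` remains the more practical choice; this only shows that the basic rule is small enough to check by hand.

```c
#include <stddef.h>

/* Hypothetical helper: count lines that contain at least one character of
   code, ignoring blank lines, // comments, and block comments.
   Limitation: comment markers inside string literals are not recognized. */
size_t count_code_lines(const char *src)
{
    size_t count = 0;
    int in_block_comment = 0;
    int line_has_code = 0;

    for (size_t i = 0; src[i] != '\0'; i++) {
        char c = src[i];
        char next = src[i + 1];

        if (c == '\n') {
            /* End of line: it counts only if code was seen on it. */
            count += line_has_code;
            line_has_code = 0;
        } else if (in_block_comment) {
            if (c == '*' && next == '/') {
                in_block_comment = 0;
                i++; /* skip the '/' */
            }
        } else if (c == '/' && next == '*') {
            in_block_comment = 1;
            i++; /* skip the '*' */
        } else if (c == '/' && next == '/') {
            /* Skip to end of line, leaving the '\n' for the next iteration. */
            while (src[i + 1] != '\0' && src[i + 1] != '\n')
                i++;
        } else if (c != ' ' && c != '\t' && c != '\r') {
            line_has_code = 1;
        }
    }

    /* A final line without a trailing newline still counts. */
    return count + line_has_code;
}
```

A line that mixes code and a trailing comment still counts once, which matches the intent that only human-written code is measured.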
Bootstrapping
If I use GCC to build Tiny C Compiler, then use Tiny C Compiler to build itself, do I need to count GCC?
I say no, you don't need to count GCC then, but you get bonus points if you can build the thing using, e.g., a "bring your own interpreter" approach, which in theory could be done by hand or something crazy.
The bring-your-own-interpreter is actually a nice idea because it removes the requirement for a specific tool, which maybe is a way of getting around the "eventually you need to count some giant dependency" problem. The interpreter, however, would still count against the lines of code!
There could be said to be a "seed executable", which is allowed to bootstrap the system so long as A) the system can build the seed executable from source and B) the executable is under a specified size (e.g. 5 MiB), but I would be pretty skeptical of this and would need to be convinced it was fair and in the spirit of the challenge.
The system must run on bare metal
- This specifically excludes WebAssembly, QEMU, virtual machines, etc. It must be the only thing running on the machine (not counting management engines which the programmer has no choice over; BIOS is also fine to enter the system through).
- If you can't poke the thing running your system, it does not count.
The system does not depend at all on a network connection
This requirement is to exclude any tomfoolery around actually doing the computation on a server somewhere else, and just having the system be a terminal or dumb client.
By network, I mean any means to connect one machine to another. Wired or wireless, bluetooth, etc. all count under this specification.
The system should display its output to a screen connected to the machine through its standard video ports
- This is written specifically to make e.g. headless systems count as "not useful", because they are dependent on another system (e.g. TTY) to be useful.
- This criterion can be met by utilizing the hardware's existing interfaces to the display. The display protocol need not be implemented from scratch if the hardware already handles communicating with the display. If the hardware provides you e.g. a framebuffer, you may use it to satisfy this requirement.
The system can receive keyboard input via USB
- This is written specifically to make e.g. headless systems count as "not useful", because they are dependent on another system (e.g. TTY) to be useful.
- Virtually all modern keyboards are USB. This interface must be supported.
- Keyboards need not be universally supported. If an off-the-shelf keyboard the system implementor has on hand works, this requirement is satisfied.
- Multiple keyboard layouts need not be supported.
- Purely touch-based interfaces can pass only if they provide an on-screen keyboard for input. This is stretching the Useful criterion, however.
The user can create and run new programs within the system
- The user must be able to save their code and resume working on it even after system shut-down.
- Other storage media can be required from the user in order to meet this request, but the user should still be able to write and run new programs without having to insert any storage media (they just won't be able to save the program in that case).
- The program creation must be contained within the system, i.e., no external utility may be used. For example, in a standard program creation flow, that means the system has a text editor, compiler, and loader. If your language works differently, that's fine, so long as it is "usable" to create new programs.
- The method of creation should be at least somewhat practical. Anything more impractical than assembly language will not be considered useful. A macro assembler would be acceptable. Interpreters are acceptable, though ideally the language is compiled.
Could I make a simple game, or a spreadsheet application, or a program to track my budget, or a paint application in your system? If not, you are stretching the Useful criterion.
The system has reasonable facilities for coping with errors and debugging newly created programs
By error, I mean the execution of an instruction which causes the program to no longer be able to be run. I do not mean logical errors.
This requirement is specifically to make the creation and running of new programs requirement more rigorous, because it is unreasonable to expect someone to write programs successfully if they don't have any recourse when they make mistakes.
- If the user causes a runtime error in their program, they should be told what the error is, and preferably, where in their program the error was encountered.
- Exception is made for errors which are difficult to detect even with existing systems.
- The system will try to display portions of memory which might be relevant to debugging the program. Ideally, the user can inspect memory interactively.
- The system is allowed to crash only if the program or change being made by the user would reasonably crash a modern system if it had errors. For example, a paint program should never crash the system, but modifications to the core kernel or operating system parts or other fundamental hardware interfaces may crash.
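To make the "display portions of memory" point concrete, here is a minimal sketch of a hex dump row formatter in C. Everything in it is an assumption for illustration: the name `format_dump_row`, the 8-bytes-per-row layout, and the 32-bit address column. It writes into a caller-provided fixed-size buffer so it needs no allocator and can feed whatever text output the system already has.

```c
#include <stddef.h>
#include <stdint.h>

/* Illustrative layout: 8 address digits, ": ", 8 bytes as "xx ", "|ascii|", NUL. */
enum { DUMP_BYTES_PER_ROW = 8, DUMP_ROW_CHARS = 8 + 2 + 3 * 8 + 1 + 8 + 1 + 1 };

static char hex_digit(unsigned value)
{
    return "0123456789abcdef"[value & 0xF];
}

/* Format up to DUMP_BYTES_PER_ROW bytes as one text row.
   out must hold at least DUMP_ROW_CHARS chars. Returns the string length. */
size_t format_dump_row(char *out, uint32_t address, const uint8_t *bytes, size_t count)
{
    size_t n = 0;
    if (count > DUMP_BYTES_PER_ROW)
        count = DUMP_BYTES_PER_ROW;

    /* Address column, most significant nibble first. */
    for (int shift = 28; shift >= 0; shift -= 4)
        out[n++] = hex_digit(address >> shift);
    out[n++] = ':';
    out[n++] = ' ';

    /* Hex column; missing bytes are padded so the ASCII column lines up. */
    for (size_t i = 0; i < DUMP_BYTES_PER_ROW; i++) {
        if (i < count) {
            out[n++] = hex_digit(bytes[i] >> 4);
            out[n++] = hex_digit(bytes[i]);
        } else {
            out[n++] = ' ';
            out[n++] = ' ';
        }
        out[n++] = ' ';
    }

    /* ASCII column; non-printable bytes are shown as '.'. */
    out[n++] = '|';
    for (size_t i = 0; i < count; i++)
        out[n++] = (bytes[i] >= 0x20 && bytes[i] < 0x7F) ? (char)bytes[i] : '.';
    out[n++] = '|';
    out[n] = '\0';
    return n;
}
```

An error handler could call this in a loop around the faulting address, and an interactive inspector could reuse the same function with an address typed by the user.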
The user's programs can save data that can be loaded even after the system has been rebooted
To put it simply, the system must be able to remember data for the user even after it has been turned off and back on.
Other storage media can be required from the user in order to meet this request.
The system targets modern hardware
This challenge is about making a simple system on modern hardware.
While I respect efforts to make new things for old hardware (e.g. retro computing, making things for 6502/Z80, etc.), I want to emphasize looking towards the future of computing, which includes the present and future of computing hardware.
Your system can (and practically, should) target a specific piece of hardware. For example, you could target the Raspberry Pi 5, or a specific ESP32 system-on-a-chip board. The hardware should be modern (still in production) and publicly available (not just in your garage, and not locked down like a modern game console).
If you build your own hardware, that's cool, but maybe save it for a future "Simple Useful Hardware" challenge. Do let everyone know you've done this though; the more attempts at new hardware, the better.
Bonus points
Bonus points are granted for creating systems that approach practical usability more than just the bare minimum specified previously.
- Bonus for bootstrapping the system, i.e., the system can be used to author, build, and "flash" better versions of itself.
- Bonus on this bonus if the system can do this in-place, i.e., without a separate boot media which must be physically switched to in order to try the new version.
- Bonus on this bonus if the system can "update" itself without needing to be restarted. Acceptable exclusion includes if the lowest levels of boot are not changeable without restart, e.g. the basic processor configuration/processor mode/setting registers. "Operating system"-level components like memory mapping or scheduling would need to be updatable without restart in order for this bonus to be met.
- Bonus if the system can debug another running instance of the system, i.e., act as a hardware debugger for diagnosing issues at the most fundamental level of the system software (e.g. diagnose errors in the system's boot sequence). It is expected that this would require having two separate pieces of hardware, one being the debugger and the other the debuggee, with both running the system, though the debuggee would be running a potentially different version. A hardware chip such as a JTAG interface, so long as it does not require any new firmware out of the box in order to function, is acceptable as part of this setup.
- Bonus for escaping text-only graphical display. This would be a requirement except for the fact that ASCII art can go a long way, and pure text-only systems can be quite useful.
- Bonus for supporting reading and writing runtime removable media. The litmus test is whether the user can insert some piece of media, move data onto that media, remove the media, and then give to a friend who could then copy the data to their system (of the same system type used to write the data to the media).
- Bonus for being able to share data across a network with another system of the same type. Note that this network could be over e.g. Bluetooth, custom-built radio, or even audio, not just Ethernet. The main idea is "can I share data without having to plug something in, or if I did plug something in, it was normal that it was plugged in to the other system (e.g., both systems have ethernet ports and were connected to the router)". There is no requirement for this setup to follow existing protocols, only that it can talk to another system of its same type. Note that this communication cannot be facilitated by a middle-man which is not of the same system type, unless that middle man is a standard "utility" like an ethernet router.
- Bonus on this bonus if there is absolutely no other hardware involved. Ethernet routers would invalidate this bonus. This would necessitate wireless communication, because connecting a wire directly should be a mostly trivial implementation, and is therefore not interesting enough to merit a bonus.
Rationale for challenge
- Not intended to be a "code golf" challenge. Don't let the size constraint be your driving inspiration. It is intended to inspire innovation, create more branches in the OS tree, and help more people learn low level.
- Important to see the things that are left out: hardware acceleration, broad device support, etc. are deliberate because one can have a useful system without needing all these things. Game consoles are a good example of this. The "personal" computer can be a device with limitations, so long as it helps the human fulfill their desires.
- I believe that hardware is important, and supporting multiple platforms can be limiting in terms of how effectively you can utilize the hardware. With a sufficiently small base (and sufficiently comprehensible hardware), it becomes more reasonable to have many different systems for many different pieces of hardware. The idea of writing an application to work on many different pieces of hardware does have its benefits, but the drawbacks come in the form of incomplete hardware support, which ultimately gets noticed by the users. If, for example, I know the Raspberry Pi has GPIO, then even my pixel paint program could benefit from integrating that because the user could e.g. home roll their own pressure sensor or some other creative thing I never even thought of. If instead I am limited to making my application for the lowest common denominator of hardware, then either A) I cannot implement GPIO support in my app, B) it is unreasonably hard to implement GPIO, or C) I must limit my app to only that hardware. Unique features per hardware device depend on it being inexpensive to develop said features, which I believe reinforces the necessity of fighting complexity, not only in APIs but in workflow: iteration, debugging, distribution, etc.
- It is supposed to help people realize how much complexity we have built up. It is supposed to point out how languages, operating systems, and hardware have gotten unreasonably complicated. It is supposed to inspire a healthy forest fire burn down of technology.
- It is supposed to show us how our workflows and attitudes around technology can be changed. We can iterate faster, debug more intuitively, distribute more fairly/reliably/etc., and can feel more confident about making new things. We can feel good about our new systems.
- It is not supposed to slam other systems, but it is supposed to show another viable approach.
- It is not supposed to be an advertisement for specific languages, libraries, etc. It is, however, supposed to act as another gauge to judge tools by: is the tool minimally complex while still being useful, or is it huge and scary?
- It is supposed to be an honest attempt at declaring independence from old complex technology, with the hopeful vision that a fresh start will allow for faster and more sustainable innovation. By "sustainable", I mean, we can continue building new things without feeling like there is an accumulation of complexity that needs to be sweated against before new things can be built.
- The goal might be "a system or culture of systems that can be learned in the span of a small amount of time (say, a month), then modified and used to create things. The learner should be left with confidence that their only limitations are that of the underlying hardware and fundamental computer science principles, not that of boundless incomprehensible complexity, useless cruft, and knowledge of dull trivia."
- It is not supposed to be a prescription for what is necessary or useful to everyone, nor what a "complete system" should be. It is supposed to be a bare baseline, one that could plausibly become a viable platform for innovation and human realization. Now, we don't have baselines, we only have giant galaxies of code.
Technical recommendations
The following are some of my thoughts and speculation on how a system like this could evolve, and maybe recommendations for creators on how to make their system more appealing to me (and by extension, others with similar values).
- Possibly recommend that one's system baseline be kept separate or tagged in a way such that one can continue growing without losing or overwriting the comprehensible base.
- Ideally, a system can be kept small forever, but practically, especially if multiple authors are involved, growth past intended size is inevitable. This is valuable if complexity and usefulness have a linear relationship, because that means the system is getting more useful without becoming disproportionately complex. It is my belief things have become disproportionately complex in modern systems and applications.
- I am hesitant to propose any sort of breaking up of things, e.g. having the "OS" be small and the applications grow without measure, because that narrows the possible solutions that having a mono-repo approach offers. See e.g. Casey Muratori's "The only unbreakable law" on Conway's law. While a useful base and user-customized plug-in/application lists have advantages, there are enough definite complexities and disadvantages that it might be worth thinking of other ways of approaching the problem. The problem might be stated simply, "Not everything you find useful is useful to everyone else, and the same goes for everyone else to you. But, some things other people find useful you would appreciate too, and it might be good if you could share those things and possibly build the useful thing together. You don't want everything everyone has ever made because there would be so much you don't want."
- We should be thinking about how we distribute software, because there is a lot of room for improvement. Binaries put a huge burden on maintainers, are expensive to host and distribute, cause trust issues, favor secrets and centralization, and let build systems and code complexity spiral out of control. Code takes a long time to build and can be larger than the final product (the former is not an inherent issue; it can be solved). Code is difficult to read and the issue of trust is still present even when you can read the code (and obfuscation will always exist).
- One cannot think about software distribution without considering security, but security should not destroy any possibility of a better solution. Examples: stronger permissions models, but still satisfying and flexible inter-process communications. Comprehensible sandboxing schemes. Actually using the trust ideas from e.g. GPG. I should be able to email my brother executables, because we have a trusted communication channel. I shouldn't lose that ability just because sometimes it is exploited.
- Similarly, the whole concept of updates should be reexamined. It's clearly an annoyingly complex thing to be bombarded with constant updates, which inevitably cause regressions and so on. We should think more deliberately about what it means to the user to update, and how it affects them. Think about how to convey the significance of an update. If e.g. the app has a new language I don't understand, then don't even ask to update because I don't care. What if users could actually post their desired update "tags", then when the developers reach that tag, they can indicate such and the users can then decide to update? (Note me naturally assuming a dichotomy between users and developers; in my ideal there is no such separation; everyone helps make near everything.) For example, my pixel app has users who desire a "palette management" system, so I A) see what people want and B) can easily notify those people once I have the changes necessary to create that system. Don't make me care about security if I don't care about security.
- The goal is a less divided system where users and original software authors work much more closely together, e.g. by letting users more actively report bugs (with callstacks, recorded macros of seconds of app state before, etc.; all of course reviewable by the user for their privacy), leave comments right in the code on their machine that can at their option be made public (and the public at their option receive annotations of the code), and of course easily modify things and share their modifications as desired. This kind of dynamic changes what updates mean and even how they come about.
- Another side thing is of course funding. How can a system and/or culture be created to make one's financial success be possible through writing good software and distributing it with high ethical standards? How does this dynamic change when many people are contributing to the same base? Should many people contribute to the same base?
- Recommend culture of creating device drivers which are easier to share.
- How? Maybe a no `#include` headers rule: you can forward declare all endpoints the integrator must provide. This could reduce the number of assumptions platform drivers make, though admittedly it will lower the quality of the drivers due to no longer having tight integration.
- Really think hard about what state there is, and what operations need to occur. It's all just bits and instructions. Stop making layers over that fact.
- Always reference your references in the code so one can gain the background knowledge you used to create the thing. Make it clear which references are public or not. If you reverse engineer a thing, document it, don't just dump a binary blob.
- Do not use advanced language constructs. C-style types in structures and C-style functions are enough. Save the fancy stuff for the user space. Worried about "memory safety"? Static fixed-sized buffers. Done.
- Think of your driver more as a teaching tool. It is important that it works, i.e., it's not just pseudocode, but actually runs to drive the hardware, because the smallest details like exactly how to tickle the bits can be the trickiest to figure out from traditional documentation. The goal should be a simple thing that works, that can then be referenced and possibly even adapted to other environments.
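The driver recommendations above can be sketched as follows. This is only one possible reading of the "no `#include`" idea, not a prescribed layout. The device, the `exampleuart_*` names, the `platform_read32` endpoint, and the register offsets are all made up for illustration. The driver half forward declares the endpoint it needs and keeps its state in a static fixed-size buffer. The integrator half is a fake device, included only so the sketch is complete enough to build and run on a host.

```c
/* ---- Driver side: hypothetical "exampleuart" driver, no #include lines ---- */

/* Endpoint the integrator must provide (name is illustrative). */
unsigned int platform_read32(unsigned long address);

/* Register layout is invented for this sketch; a real driver would cite
   the datasheet section each value came from. */
enum {
    EXAMPLEUART_DATA = 0x00,
    EXAMPLEUART_STATUS = 0x04,
    EXAMPLEUART_STATUS_RX_READY = 1u << 0,
};

/* Static fixed-size receive buffer: no allocator, no unbounded growth. */
enum { EXAMPLEUART_RX_CAPACITY = 64 };
static unsigned char rx_buffer[EXAMPLEUART_RX_CAPACITY];
static unsigned int rx_count;

/* Drain pending bytes from the device into the buffer.
   Stops when the device is empty or the buffer is full.
   Returns the number of bytes stored so far. */
unsigned int exampleuart_poll(unsigned long base)
{
    while (rx_count < EXAMPLEUART_RX_CAPACITY &&
           (platform_read32(base + EXAMPLEUART_STATUS) & EXAMPLEUART_STATUS_RX_READY)) {
        rx_buffer[rx_count++] = (unsigned char)platform_read32(base + EXAMPLEUART_DATA);
    }
    return rx_count;
}

/* Read-only view of the bytes received so far. */
const unsigned char *exampleuart_bytes(void)
{
    return rx_buffer;
}

/* ---- Integrator side: a fake device standing in for real hardware ---- */

static const char *fake_input = "ok";

unsigned int platform_read32(unsigned long address)
{
    /* The status register reports "ready" while fake input remains. */
    if ((address & 0xFF) == EXAMPLEUART_STATUS)
        return *fake_input ? EXAMPLEUART_STATUS_RX_READY : 0;
    /* Reading the data register consumes one byte of fake input. */
    if ((address & 0xFF) == EXAMPLEUART_DATA && *fake_input)
        return (unsigned char)*fake_input++;
    return 0;
}
```

On real hardware the integrator would replace the fake `platform_read32` with a volatile memory-mapped read. The driver half would not need to change, which is the kind of sharing the recommendation is after.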
Thanks and inspiration
- Project Oberon, Lisp Machines, Emacs, plan9, glamorous toolkit, Pharo
- Jonathan Blow, Casey Muratori, Handmade, John Carmack. Anyone writing good simple code and sharing their knowledge on how to do so.
For my attempt (Tiny C Compiler-based toolchain targeting Raspberry Pi 4) specifically:
- Rene Stange for Circle (and its contributors)
- Everyone writing tutorials and documentation on Raspberry Pi bare metal
Talks, projects, inspiration:
- Casey Muratori: The Thirty Million Line Problem
- Jonathan Blow: Preventing the collapse of civilization
- Alan Kay/Viewpoints Research Institute: A sense of what Kay is trying to do comes from this quote, from the abstract of a seminar at Intel Research Labs, Berkeley: "The conglomeration of commercial and most open source software consumes in the neighborhood of several hundreds of millions of lines of code these days. We wonder: how small could be an understandable practical 'Model T' design that covers this functionality? 1M lines of code? 200K LOC? 100K LOC? 20K LOC?"
- Rob Pike: Systems Software Research is Irrelevant (aka utah2000 or utah2k)