Understanding is a Responsibility

I was watching old educational videos, and this one on self-reliance struck a chord with me.

It proposes four rules for becoming self-reliant:

Assume responsibility
Be informed
Know where you're going
Make your own decisions

These come from Self-Reliance by Ralph Waldo Emerson.

I immediately recognized that many of the problems I had encountered or caused in my career as a game software engineer stemmed from a lack of information and responsibility. Each of these rules could warrant its own article, so I will focus on one idea: It is your responsibility to understand things.

Understand the system before you try to change it

Think surgery, not sledgehammer. Is what you are doing guesswork, or engineering?

Read the code before you start changing it! If you think there is simply too much code to read, or the source isn't available, read the documentation! Is the documentation bad? Can you use a sampling profiler to see what's going on? Do some research. Be deft with your time and approach, not hasty and clumsy.

My article on debugging linker errors is an example of how understanding more about the code and my tools would have saved me a couple hours of painful debugging. Those fifteen minutes of research would have more than paid for themselves.

John Carmack recommends the exercise of stepping through your game, line by line, so you know what is actually going on. It may seem crazy given the several-million-line programs you may have to work on, but spread over the course of weeks, studying here and there (and looking up documentation when available), you could make it through and learn a whole lot. Does it not seem crazy to be making changes to a large and complicated machine without having at least a quick scan of its fundamental operation?

Performance problems often arise from ignorance

Programmers make their living by writing instructions that physical machines must be able to understand. Those machines deserve the respect of being understood, especially because that understanding can lead to massive payoffs.

Hardware (still) matters

Like it or not, physical hardware still matters (and it will always matter, unless we get machines which can execute code infinitely fast). Mike Acton mentions this point when he talks about data-oriented design.
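To make the data-oriented design point concrete, here is a minimal sketch of the kind of layout decision it is concerned with. The types and function names (`ParticleAoS`, `ParticlesSoA`, `totalHealthAoS`, `totalHealthSoA`) are hypothetical, invented for illustration, and do not come from Mike Acton's material. Both functions compute the same sum, but they ask the memory system for very different amounts of data.

```cpp
#include <vector>

// Hypothetical game data laid out as an "array of structures" (AoS).
// A loop that only needs `health` still pulls position and velocity
// through the cache along with it.
struct ParticleAoS {
    float position[3];
    float velocity[3];
    float health;
};

// The same data as a "structure of arrays" (SoA): each field is stored
// contiguously, so a loop over one field touches only that field.
struct ParticlesSoA {
    std::vector<float> positionX, positionY, positionZ;
    std::vector<float> velocityX, velocityY, velocityZ;
    std::vector<float> health;
};

float totalHealthAoS(const std::vector<ParticleAoS>& particles) {
    float total = 0.0f;
    for (const ParticleAoS& p : particles) {
        // Assuming 4-byte floats: 4 useful bytes per 28-byte element.
        total += p.health;
    }
    return total;
}

float totalHealthSoA(const ParticlesSoA& particles) {
    float total = 0.0f;
    for (float h : particles.health) {
        // Every byte fetched here is a byte the loop actually uses.
        total += h;
    }
    return total;
}
```

Whether the second layout is actually faster depends on the access patterns of the whole program and on the target machine, so it is worth measuring with a profiler rather than assuming.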

The ideas most relevant to this article may have been expressed in Mike Acton's HandmadeCon talk. If this article's ideas resonate with you, watch that talk! (An aside: He presents the idea of reading the processor manual. A colleague I greatly respect for his hardware knowledge and corresponding software skill recommended Agner's Microarchitecture Performance handbook as a more practical manual for learning about microarchitecture optimization, rather than e.g. the all-inclusive Intel Architectures Software Developer Manuals, but if you can get through those, more power to you).

A useful way to approach learning about hardware may also be from the low-level software perspective. For example, Clang cross-compilation discusses most of the parts necessary to have in place to build an executable for another platform. It also shows that you can compile things for a specific architecture to get performance improvements (e.g. by using the right floating point settings, SSE, etc.). These compilation settings give you an idea of the range of hardware you may encounter, and how two machines can differ.
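One small way to see those compilation settings from inside a program is to check the feature macros the compiler predefines. The sketch below assumes GCC or Clang, which define macros such as `__SSE2__` and `__AVX2__` according to flags like `-msse4.2`, `-mavx2`, or `-march=native`. The function name `compiledSimdLevel` is made up for this example.

```cpp
// Reports the widest SIMD instruction set this translation unit was
// compiled for. The macros are predefined by GCC and Clang based on the
// target flags; other compilers may spell them differently.
const char* compiledSimdLevel() {
#if defined(__AVX2__)
    return "AVX2";
#elif defined(__AVX__)
    return "AVX";
#elif defined(__SSE4_2__)
    return "SSE4.2";
#elif defined(__SSE2__)
    return "SSE2";
#elif defined(__ARM_NEON)
    return "NEON";
#else
    return "baseline (no SIMD macro detected)";
#endif
}
```

Building the same file with different target flags and printing the result is a quick way to see how much the "same" program can differ between two machines. Note that this reports what the binary was compiled for, not what the CPU it runs on supports.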

Garbage collection is ignorance

This is a rant. You should skip it if you are tired of such arguments.

Blissfully ignorant engineers claim that things like garbage collection are better in every way than manual memory management. This is simply not true. Fundamentally, computers do not understand what the lifetimes will be for many values. This results in expensive garbage collection passes that must happen at arbitrary or explicit times, completely stalling the program while the collector works out which values ended up dying. Leaving it up to the machine, which cannot know what the right thing to do is or when to do it, is eschewing responsibility for managing memory, at a detrimental cost to performance and memory usage.
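As one illustration of what "knowing the lifetimes" can look like in practice, here is a minimal sketch of a per-frame arena, a technique commonly used in games. It is only one of many manual strategies, and the class `FrameArena` and its member names are hypothetical, written for this example. The assumption is that everything allocated during a frame can be thrown away together when the frame ends, so no collector ever has to discover what died.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical per-frame arena. Suitable only for objects that do not need
// their destructors run, because reset() simply forgets them.
class FrameArena {
public:
    explicit FrameArena(std::size_t capacityBytes)
        : buffer_(capacityBytes), used_(0) {}

    // Returns `size` bytes aligned to `alignment` (which must be a power of
    // two), or nullptr if the arena does not have enough room left.
    void* allocate(std::size_t size,
                   std::size_t alignment = alignof(std::max_align_t)) {
        assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
        const auto base = reinterpret_cast<std::uintptr_t>(buffer_.data());
        // Round the next free address up to the requested alignment.
        const std::uintptr_t aligned =
            (base + used_ + alignment - 1) & ~(std::uintptr_t(alignment) - 1);
        const std::size_t offset = static_cast<std::size_t>(aligned - base);
        if (offset > buffer_.size() || size > buffer_.size() - offset) {
            return nullptr;
        }
        used_ = offset + size;
        return buffer_.data() + offset;
    }

    // The programmer knows the lifetime: everything from this frame is dead,
    // so "freeing" it all is a single assignment.
    void reset() { used_ = 0; }

    std::size_t bytesUsed() const { return used_; }

private:
    std::vector<std::uint8_t> buffer_;
    std::size_t used_;
};
```

A game loop would call `allocate` for scratch data during the frame and `reset` once at the end. The cost of reclaiming memory is constant and happens at a moment the programmer chose, which is exactly the control a collector takes away.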

Car talk

An analogy can be made to automatic versus manual transmissions. Anyone who knows anything about auto racing will know that manual transmissions are essential for it. It takes time for a car to shift gears, and being in the optimal gear to get maximum engine torque (or to engine brake, if slowing) is important. An automatic transmission cannot predict when the driver needs to shift up or down. It can only try to guess at what their intentions are and keep the engine in its operating range. Manual transmissions allow the driver to shift into and out of turns at the right moments to get exactly the behavior they desire.

Garbage collection is to automatic transmissions what manual memory management is to manual transmissions. Garbage collection works to get your program from point A to point B (run successfully), but there are going to be suboptimal shifts and lessened performance (e.g. a long garbage collection pause, at the wrong time, and not using memory optimally). To take the analogy a bit further, you can use an automatic to get somewhere, but if you and everyone you know were using a manual instead, everyone could get there faster (of course, assuming everyone could drive as if it were a race track, which is where the analogy breaks down). Wouldn't you want your processes to run like race cars, optimally and without pause, rather than like a bunch of commuter cars on a crowded highway?

Useless waste

If everyone made their programs N% faster, that would make a huge difference. It seems that the popular belief is that "It's fine if your program is N% slower, because computers are fast". However, we are just needlessly filling up the space we are given. This observation is known as Wirth's law, and has been lamented by many others (e.g. "What Intel giveth, Microsoft taketh away").

Ask yourself: do modern programs really need this much more power to run? I think the only exception to that rule would be games, due to how ever-increasing graphical fidelity, framerate, resolution, and game complexity are important factors when making new cutting-edge games.

Abstraction is still important, but dangerous

I have been finding more and more that the more abstracted things are, the harder it is to figure out what's actually going on. Importantly, I find myself still having to figure such things out. Performance is often a driver ("why does this code take as long as it does?"), but understanding and using higher-level APIs is as well (e.g. in the case of C++ template magic, "what code is actually running here, and why isn't it working for this type").

Jonathan Blow goes so far as to claim that building too high on other abstractions could lead to the collapse of technology as we know it. Casey Muratori covers similar themes in his video called "The 30 million line problem".

Languages as abstraction layers

What about when the software industry transitioned from assembly to C? Developers back then thought C had an unacceptable performance impact. Should we just be writing assembly? Well, no, but why not? I forget where I read it, but the argument was made that the jump to C's level of abstraction was such a step up in productivity, robustness, and effectiveness when compared to assembly that the performance impact was worth it.

How many layers of abstraction are appropriate? In a perfect world, we could program with declarative languages where we just tell the computer what we want and let it figure out the details. The reality is that hardware is limited, and it's easy to exhaust it by spending its power on abstractions which don't pay off.

I believe the gap between C and languages such as Python or JavaScript is smaller than high-level language advocates would like to think. I think with a healthy ecosystem and sound access methods to that ecosystem, programming in C could be similarly productive. This is assuming the position that the modern web wouldn't be possible nor as productive without a high-level language, which I think is a ridiculous assumption. It also assumes the programmer is competent (I'm not going to care about beginners, though I think learning C first is a great way to have a strong foundation for them).

As a result of writing in a lower-level language, programs would be more robust due to things like type safety, take up fewer resources by managing memory optimally, and run faster due to reduced levels of abstraction.

Once we get to the point where the higher level of abstraction pays off in every way, by all means we should move to it. The current language offerings and modern hardware are not there yet, and I'll wager they won't be there soon.

Getting better

If you are a professional software developer, you should realize that it's your livelihood. It's what you are using to feed and sustain yourself. You should respect that fact and deliberately make an effort to become better at it.

Things I'm ignorant about

A good exercise to find weak points in your understanding is listing the things which you know only as "black boxes", even if you know how to use those things. For example, here are some of the things I should learn more about:

Unicode: Language is incredibly complicated. Unicode implementers are heroic in doing what they do. For a taste of how complex text is, check out the Unicode Collation Algorithm spec, which basically defines how strings should be sorted. Joel Spolsky has a good post on foundational Unicode knowledge every software developer should have
Operating system: I should be more familiar with foundational operating system components, such as process schedulers or virtual memory systems
Hardware: As I mentioned before, every programmer should be more familiar with how their code actually gets executed, including basic familiarity with modern microprocessor architecture and things like buses, interrupts, and RAM. I should know more about these as well
Assembly: Similar to Hardware, I should know more about the lowest level of software, assembly
Device drivers: When I encounter problems getting e.g. a game controller to work on my machine, I encounter the total gap in my knowledge about what to do to debug those problems. I want to know more about how the pipe from hardware devices to software inputs works
GPUs: I am embarrassed by how little I know about how modern GPUs function. I think I will start with Life of a triangle and go from there
C++: There are some really amazing things you can do with C++ (templates and otherwise) that do make your code more robust and simpler, but they can be hard to understand if you do not take the time to learn how the parts work. For example, lvalues and rvalues still confuse me, but they are critical to writing copy-free, high-performance code in C++. Recently, I have become skeptical of the value of these higher-level things in C++, because they are making it harder to write code which is easy to read, has predictable performance characteristics, and is easy to debug
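
Picking one item from the list above, the lvalue/rvalue distinction becomes less of a black box once the copies and moves are made visible. The sketch below uses a hypothetical `Tracked` type and `storeBoth` function, both invented for this example, that count how often each constructor runs. An lvalue is, roughly, something with a name that persists; `std::move` does not move anything itself, it only casts its argument to an rvalue so that the move overload gets selected.

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical type that counts copies and moves so they can be observed.
struct Tracked {
    static int copies;
    static int moves;
    std::string payload;

    explicit Tracked(std::string p) : payload(std::move(p)) {}
    Tracked(const Tracked& other) : payload(other.payload) { ++copies; }
    Tracked(Tracked&& other) noexcept : payload(std::move(other.payload)) {
        ++moves;
    }
};
int Tracked::copies = 0;
int Tracked::moves = 0;

std::vector<Tracked> storeBoth() {
    std::vector<Tracked> out;
    out.reserve(2);  // avoid reallocation so only our own copies/moves count

    Tracked named("level-data");  // `named` is an lvalue: it has a name

    // An lvalue binds to push_back(const T&), which copies the string.
    out.push_back(named);

    // std::move casts `named` to an rvalue, selecting push_back(T&&), which
    // moves the string's buffer over instead of duplicating it. `named` is
    // left in a valid but unspecified state and should not be read again.
    out.push_back(std::move(named));

    return out;
}
```

After one call to `storeBoth`, `Tracked::copies` and `Tracked::moves` are each 1. Instrumenting a type like this is a cheap way to check what the compiler is really doing instead of guessing.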

Curiosity fosters a pleasurable path to understanding

If you are generally curious, gaining usable understanding of unfamiliar subjects can be fun and rewarding. Not all information will turn out to be immediately useful, but the pride in knowing things and the rewards from Aha! moments keep you going until you stumble on something that is.

I donate each month to Wikipedia because I think my excursions through the site legitimately make my life better. It is very rewarding to feel like I know more about how the world works, be it nuclear power, microprocessor architecture, or general electrical infrastructure. Even surface-level knowledge of things is much better than completely ignoring them.

I found Connections to be very compelling in its approach to explaining how we got from the dark ages to modern technology. If you know the history of some technology, it can help your understanding by letting you retrace the steps and decisions that were made.

Stop to think

This article is just as much for me as it is for you. I need to consistently fight the urge to dive in and bash at a system in ignorance rather than taking my time to develop a sound understanding of what is going on.

If you are getting frustrated with a problem or system, stop yourself and ask, "Is this frustration stemming from a lack of understanding?" If the answer is yes, you likely should stop trying to make forward progress and instead try to gain more understanding of the system/situation.

It may be hard to swallow your pride and admit you have been deliberately avoiding learning things. Once you realize you are doing this, and if you try to counter it, you may find great rewards from respecting the thing and studying it deliberately.