Debugging is the process of correcting the mismatches between a formal system and reality, through acquisition and application of intimate knowledge of a system's implementation, intimate knowledge of the relevant part of reality and a reflective self-awareness surrounding both. Software running on a computer is a typical formal system, and that's what I will focus on. Similar principles apply to debugging electronics, mechanical systems and abstract mathematics.
Bugs are mistakes. Bugs are oversights. Bugs are fundamental misunderstandings. Bugs are Black Swans. Bugs are software. We can re-frame feature development in terms of the bug class of missing functionality. Therefore, the entirety of software development can be viewed as a process of debugging.
*A bug, but not an issue*
Every principle in software engineering must bow to making it work. Ergo, debugging is the work of software development.
Reality ain't no formal system
At least one program has no bugs: the program of zero length. It has no features. It’s the same in every language. It takes no time to execute and requires no memory. It has no dependencies. It requires no one to maintain it. But most programs have bugs. (Not all non-empty programs have bugs. I’ve seen some pretty robust “Hello, world” implementations.)
Bugs are pervasive because useful programs are formal models of parts of reality, and no formal model can fully encompass reality’s confounding infinite complexity. The software system that you're working on must take one more step toward reality: you have to fix a bug. Here's how:
- Build knowledge of the system (like reading the code)
- Build knowledge of the self (like bending your mind around the code)
- Build knowledge of reality (often by talking to people)
Somewhere in the interplay of these three cords of knowledge lie the hiding bugs. Once found, the self can cause the system to accord better with reality.
Finding bugs
Programming is a paragon of rational thought, at least until a bug appears. I have often had the feeling that a bug "defied the laws of physics"; even though the computer system really is rational, debugging feels like an encounter with the irrational. In truth, to debug is to encounter one's own irrationality. Often the discovery of a bug's root cause is a sudden experience in which the irrationality collapses to a point: “Oh! Here’s the problem. The program transgresses reality in this one part.”
The substance of debugging is identifying where the bug is. In the difficult cases, the bug is not where you expect, so you spend hours poring over the wrong piece of code. Bugs hide, but they cannot run. Therefore, you must approach a bug in your own code as if you are seeing an undergraduate's code for the first time. Do not assume. Be free of bias. Assuming competence is wrong. Assuming incompetence is wrong. Be open to anything. Sometimes you can spend all afternoon chasing the bug, only to find that the code is correct and the bug was in your understanding. That bug was in you. In most cases, the code is incorrect, but I consider that the locus of every bug extends to the debugger's mind.
To find the elusive bug, get outside of yourself. Stop and do something else for a while. Relax. Describe the problem to someone else, even a toy monkey. Give up and go home. Take a warm shower. Read a journal paper. Meditate. Take a nap. Debugging is only a conscious process in the worst case.
The Hamming distance between success and failure is small. When you catch an elusive bug, sometimes—even usually—the fix is blindingly obvious, banally trivial and terribly short. Tiny bugs have a kind of post-hoc invisibility to them. Countless times I find that, an hour after fixing an elusive bug, I have forgotten what the root cause was! Isolate the exact code change that fixes the bug, put that change—and only that change—into a single commit, and write a clear commit message: “Blindingly obvious, banally trivial and terribly short fix for a highly elusive bug.” Then rehearse what went wrong, mentally or in writing: the presenting problem was X and I spent a long time looking at file Y, but actually the root cause was a minor typo in file Z which was obvious in hindsight.
One-liner bugs cause millions of dollars of damage.
*Hiding bug*
A systematic approach to debugging?
Programming has every appearance of being a fully logical and systematic process. Could debugging be approached systematically, too? In “The Debugging Mindset,” Devon H. O'Dell has some great ideas on debugging systematically. He writes:
Through continued learning, malleable views of problems, and effective use of tools, you can become successful in debugging. Still, some insist that debugging is more of an art than a science. I think we can dispatch this idea entirely. It is clear that debugging requires learning, and the scientific method is specifically designed to yield new knowledge. The method, summarized: (1) Develop a general theory of the problem. (2) Ask questions leading to a hypothesis. (3) Form a hypothesis. (4) Gather and test data against the hypothesis. (5) Repeat.
And:
Forming a sound hypothesis is important for other reasons as well. Mental models can be used to intuit the causes of some bugs, but for the more difficult problems, relying on the mental model to describe the problem is exactly the wrong thing to do: the mental model is incorrect, which is why the bug happened in the first place. Throwing away the mental model is crucial to forming a sound hypothesis.
Approaching debugging from a systematic point of view is necessary, using tools is necessary and discarding bad mental models is necessary. Systematic debugging is necessary: if you can’t be systematic, then you probably can’t do software engineering at any level of competence. But I would still contend that debugging has artistic qualities.
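O'Dell's five-step loop can be made concrete. Here is a minimal sketch in Python, using a hypothetical buggy `average` function (both the function and the empty-list hypothesis are illustrative, not from O'Dell's article): we theorize that the function misbehaves, hypothesize that the empty list is the trigger, and test that hypothesis directly before writing the fix.

```python
def average(xs):
    # Hypothetical buggy function: no guard for empty input.
    return sum(xs) / len(xs)

# (1) General theory: average() misbehaves on some inputs.
# (2) Question: which inputs? (3) Hypothesis: the empty list.
# (4) Gather data: run the suspect input and check the hypothesis.
try:
    average([])
    hypothesis_confirmed = False
except ZeroDivisionError:
    hypothesis_confirmed = True

# (5) The confirmed hypothesis points directly at the fix: guard the empty case.
def average_fixed(xs):
    return sum(xs) / len(xs) if xs else 0.0

print(hypothesis_confirmed)   # True: the hypothesis held
print(average_fixed([]))      # 0.0
print(average_fixed([2, 4]))  # 3.0
```

The point is not the toy bug but the discipline: the failing run either confirms or falsifies the hypothesis, and a falsified hypothesis sends you back to step (2) rather than into random code changes.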
Consider this quote from the book “Free Play,” by improvisational violinist Stephen Nachmanovitch (p. 73):
In practice, work is play, intrinsically rewarding. It is that feeling of our inner child wanting to play for just five minutes more.
This compulsive side of practice is especially easy to experience in the new art of computer programming. The program we write is itself a responsive activity, which talks back to us in real time. We get into a loop of conversations with the program, writing and rewriting it, testing it, fixing it, testing and fixing again until we get it right, and then we find more to fix. The same applies to practicing an instrument or painting or writing. When we’re really doing well and working at our peak, we show many of the signs of addiction, except that it’s a life-giving rather than a life-stealing addiction.
To create, we need both technique and freedom from technique. To this end we practice until our skills become unconscious.
Once you’ve fully internalized a systematic approach to debugging, you might find that sometimes it doesn’t work! To overcome certain problems with systems, you have to enter the world of the meta-systematic, as David Chapman expounds here. For example, the senior software engineer must be ready to fix the meta-bugs, the problems in the way that an organization approaches software engineering. But even before that point, we have to face the non-systematicity that's intrinsic to anything involving people.
*Beautiful bug, not to be fixed*
Software is about people
We write software for our customers! Customers, even if non-paying, are people. The larger context of any software system includes people. Often, we write software in teams larger than me, myself and I. Much of what constitutes "reality" for a software system is squishy human wetware.
The bug may be truly unfindable inside the source code of the program—even if you can get outside yourself enough to see the program for what it really is. You may not know enough about the users, the team, the organization. You may be blissfully unaware that your program is riddled with agonizing problems that hurt your users every day. Fixing bugs is a social process that involves talking to people.
The 10x engineer (if they exist) doesn't type ten times faster. But wise engineers develop a deep understanding of what people need and focus their effort on the most impactful bug fixes (which may be missing features). A lot of effort can be spent on coding / debugging that doesn't make much difference to the customer. It might be uncomfortable to leave the realm of the formal system, walk away from your desk and listen to people mumble their contradictory desires. But that's the riverhead of software value.
A ladder of debugging techniques
Bugs are so pervasive that any artificial limitation on debugging will eventually be a hindrance. Therefore, I accept debugging in all its forms, contingent on those forms producing desirable results (assuming we're not doing anything unethical along the way, of course).
Instead of insisting that debugging should be systematic, or even more high-mindedly claiming that debugging should sometimes be meta-systematic, let's consider that all tools are ours to use in appropriate contexts. Here's a ladder classification of these techniques:
- Asystematic debugging techniques. For example, commenting out lines of code that cause compiler errors / exceptions / issues.
- Systematic debugging techniques. What O'Dell talks about: hypothesis forming & checking.
- Meta-systematic debugging. What Chapman talks about.
Each rung of the ladder goes meta to the previous rung. The higher rungs are more difficult to accomplish, but they add to the previous rung rather than replace it.
There's a time and a place for asystematic debugging! When you're tired and frustrated and you just gotta do something, heck yes, comment out the line of code that throws the exception and run it again! Copy that answer from Stack Overflow. Ask the LLM to write your code! Hopefully, by the time you have completed a degree of formal training, you have other tools ready too, because asystematic debugging won't reliably converge.
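The "comment out the line that throws" move has a close cousin: wrap the failing line and swallow the exception just to keep the program running. A hedged sketch in Python (the `records` data is made up for illustration), which also shows why this rung of the ladder doesn't converge — the crash goes away, but data silently disappears:

```python
records = ["3", "7", "oops", "12"]  # one record is malformed

totals = []
for r in records:
    try:
        totals.append(int(r))
    except ValueError:
        # Asystematic move: swallow the failing case so the program keeps
        # running. The crash is gone, but "oops" was silently dropped --
        # the underlying bug (bad data, or a missing validation step)
        # is still there, now harder to see.
        pass

print(totals)  # [3, 7, 12]
```

Sometimes that's exactly what you need at 6pm on a Friday; the systematic follow-up is to ask where "oops" came from.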
And after you've been programming for a decade, hopefully you've developed some kind of "taste" for software and an appreciation for the humanistic aspects. Hopefully, you've put a smile on someone's face enough times to know that that's what software's for.
Debugging infinity
The treadmill of debugging goes ever on. Fixing one bug causes or reveals another. We chase reality with our formal system of software, but we will never get there. I find the eternal parade of security bugs particularly bothersome, but they're really the tip of a deep, monstrous iceberg. We humans will always live with bugs. Bugs in software, bugs in law, bugs in society, bugs in our minds.
But we keep on running, invigorated by the progress seen so far, joyful at the prospect of fixing one more bug. Make it work! Align the formal system just a little bit closer to reality, using every trick available, by learning about the system, yourself and reality, for the benefit of people.
*Not a bug, a feature*
Appendix: an incomplete bestiary of debugging heuristics
- Heuristic: you aren't looking at the bug! Unfortunately, you can only look in one place at a time, so this heuristic encourages you to look somewhere else.
- Put a breakpoint at the start of `main()` and step through your whole program. Unfortunately, this approach takes a long time, but eventually you'll be looking at the bug. Usually, when I try this, I get impatient and end up letting the program execute past the bug. Maybe the realistic version is a kind of random search through the program: guess where the bug is, try to stop shortly before it, then step into it. Making multiple unsuccessful attempts to step into the bug is OK. It’s the one successful attempt that matters.
- Heuristic: software comes in layers (lasagna code). The bottom layer is physics, then the silicon fab process, then mask-level design, then gate-level design, then processor architecture, then assembler code, then a 'high level' language (e.g. C) [via a compiler], then sometimes 'higher level' languages (e.g. Python) [via an interpreter], then libraries, then application-specific abstractions, then application logic. Each layer is an abstraction with a certain amount of leakiness. If you're working in the application logic layer, you will probably also need intimate knowledge of the application-specific abstractions. If you're working in an interpreted language, you will probably need some knowledge of the workings of the interpreter. If you're working in a compiled language, you will probably need some knowledge of the compiler. If you’re programming in Objective-C, read the source code of the Objective-C run-time.
- Bugs in the configuration.
- Bugs in the build process.
- Heuristic: put `abort()` at the start of `main()`. If it doesn’t crash, then your program isn’t actually running. This really helps if you're running entirely the wrong program.
- Reduce the program down to the minimal form that still reproduces the bug. Create a new branch and start removing code. Ruthlessly remove features and functions until you have the shortest possible program that still has the bug. Unfortunately, you still have to fix the bug, but now there’s less code to search through and puzzle over.
- Human-executed genetic programming.
- Oblique Strategies: Prompts for Programmers
- Debugging Zen, by Ben Ramsey
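The "reduce to the minimal form that still reproduces the bug" heuristic can even be automated when the bug is triggered by an input. A sketch of greedy input reduction in Python, in the spirit of delta debugging; the `triggers_bug` predicate is a made-up stand-in for "run the program and check whether it crashes":

```python
def triggers_bug(text):
    # Hypothetical stand-in for running the real program on `text`:
    # here the "bug" fires whenever the input contains "boom".
    return "boom" in text

def minimize(text):
    """Greedily delete chunks of the input while the bug still reproduces."""
    chunk = len(text) // 2
    while chunk >= 1:
        i = 0
        while i < len(text):
            candidate = text[:i] + text[i + chunk:]
            if triggers_bug(candidate):
                text = candidate  # deletion kept the bug: keep the smaller input
            else:
                i += chunk        # deletion lost the bug: keep that chunk, move on
        chunk //= 2
    return text

print(minimize("lots of irrelevant input ... boom ... more noise"))  # prints "boom"
```

Substitute a real reproduction script for `triggers_bug` and this loop does the ruthless removal for you, leaving a minimal failing input to puzzle over.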