10 Oct 2019

This blog has been updated infrequently over the six years that I have intended to write, and the nearly twenty years that I have owned this domain. Part of this probably comes from trying to bite off more than I can chew in my personal life, which is often eclipsed by obligations when consulting or working. For the past six years, much of my work has been in supporting startups, and as anyone knows, bootstrapping a startup can take all of the time and energy that one has.

A theme that I keep coming back to in my work is that of safety and reliability. Specifically, how do I ensure that the software I write is safe, reliable, and relatively secure? Often, the answer to this question in the industry comes down to using this language or this platform. Personally, I’ve never been a fan of silver bullets. One change I have made in my professional and personal life is to try to drill down and simplify the software that I write. I don’t believe that the language, platform, framework, or operating system makes software safe or reliable. The only thing that can do this is the hard work of making software reliable. Instead of pushing some newfangled language like Rust or Idris, I’d like to build a case for a simple and humble language: C.

Immediately, the language bigots will jump up and harangue about how C is a foot cannon that will kill your dog just by accidentally forgetting a semicolon. I think that this sort of response comes from a combination of fear and an acceptance of the status quo. While it is true that there is a lot of terrible C in the wild, this is not because C itself is at fault. Languages are tools. A good crafts person never blames his/her tools. That being said, using C alone is, in my humble opinion, unwise. To paraphrase the old breakfast cereal commercials, C is part of a complete software engineering breakfast. To write effective and reliable C, we need to fortify this language with static analysis and formal methods. Furthermore, we need to provide C with a style that allows us to maximize the utility of these tools. Be assured that when I talk about engineering software using C, I’m advocating for the modern standard and a modern style; I’m not advocating that we use 40 year old software development techniques.

This does raise the question though: why use C? There are several salient reasons that come to mind. The first is ubiquity. C is practically everywhere. It is even more of a lingua franca than C++, because it can be embedded in places where the C++ runtime can be problematic, such as its near universal use in FFIs (Foreign Function Interfaces). Crossing the boundary between C++ and an FFI can lead to some interesting re-entry problems, especially when this is coupled with stack unwinding and exceptions. While these features can be disabled, it comes at the cost of some of the features that C++ provides. However, as much as I am a big fan of C++, there is another reason why C makes for a better language here, at least for our purposes: cognitive load. A lot of general-purpose languages have features that insinuate themselves into software and that can lead to an ever expanding list of design considerations. When these design considerations are part of a given application’s concerns, this is one thing. When these design considerations are built into the language and can significantly impact performance or reliability, it becomes too much. C++ is a great language. But, it comes with a learning curve and a higher cognitive load. Furthermore, it is a bit of a pain to parse. Here is one last area where C shines: parsing it is nearly academic. Anyone with a few months to kill can write a good C compiler from scratch. Furthermore, building a DSL that outputs C is trivial. If this is mixed with formal methods, it is possible to write a DSL that can be proven to always generate good C. This last point is for another topic entirely, but certainly, there is some appeal in being able to write tools that can parse C or even generate glue code from statements or declarations in C.

Could other languages be substituted and provide similar capabilities? Sure. Rust is slowly becoming a language that might be a contender in this area. With the sort of work I do, there is a heavy emphasis on both security and system / firmware. This is an area where both C/C++ can shine, but for reasons explained in the last paragraph, I think that C is the better choice.

Tooling

A tool that I have been spending a lot of time working with lately is CBMC. CBMC is a bounded model checker for C. It is being actively developed, and it isn’t perfect, but it can be used effectively to detect a wide range of error conditions, from stack / heap overruns, to divide-by-zero errors, to fence post errors, etc. Crucial to our use, it also supports arbitrary assertions and memory leak detection. This latter point is useful because it allows us to tie external resources to heap-allocated objects so that we can get an error from the model checker if it finds an execution path that does not properly release these resources. More on this in a moment.

Abstract interpretation and model checking is not enough. Of course, I have also evaluated unit test frameworks. I used to be a big fan of Google Test, but lately, I’ve decided to go with a more minimal framework that I have written myself called minunit. Unfortunately, this name is not very unique, and so there are a lot of similar unit testing frameworks with the same name. None of them have a trademark on the name, and as far as I’m concerned, I like mine more. But, I’m certainly biased.

Resource-Oriented Development Style

This leads me to the problem of style. To make the best use of the language and the tooling, we need to adopt a style that works well with each. For this, I have developed what I call “Resource-Oriented Development Style”. The purpose of this style is to ensure that we think in terms of resources. I consider a resource to be distinct from the concept of an object, because objects tend to have a more rigid structure. At its root, a resource is anything that meets two requirements: it is a pointer that can be explicitly cast to a resource handle using an inline function, and it can be released back to the system by calling resource_release. The particulars about how the explicit casting or the releasing of the resource works can be largely encapsulated. Any variations to this can be laid out in the documentation for this resource and in its shadow implementations (more on shadow implementations below). All that the user needs to worry about is that it follows a simple set of rules for acquiring and then releasing a resource. As long as these rules are followed, neither resources nor memory will be leaked. Furthermore, by using this resource pattern, the model checker can help us to prevent resource leaks.

One other feature is required for model checking Resource-Oriented Development. This is the shadow implementation. One problem with model checks in particular is that they are prone to combinatorial explosions. We want to ensure that when we run model checks, we can shadow anything except for the function under instrumentation. This allows us to shift problematic issues such as looping or heavy lifting away from the instrumented software. Instead, the shadow implementation does a bunch of coin flips so that it can simulate any possible execution path or error condition of the real software. Resource-Oriented Development couples nicely with the concept of shadowing.

To better optimize for low-memory environments, I have also created the concept of an allocator. This allocator exists so that a reasonable allocation strategy can be used in firmware without relying on malloc or free. However, the model checker works better with an allocator that uses these functions. Hence, by using the allocator, we can encapsulate the real allocation strategy so that the same software can be compiled for firmware or model checked.

Conclusion

This post is a lot of hand-waving. I’ll try to show some concrete examples soon. What I would like to do is build up a simple Reliable C Portable Runtime in a series of articles, and then use this portable runtime to write some examples. On the back-end of this work, I want to revisit previous ideas, such as the Kinetis Deep Dive, to show how we can use this Reliable C Development process to build real-world firmware. The proof, so to speak, is in the code.