Memory Safety and Type Safety
Much of the software we rely on today is written in C (or runs on top of a VM that is itself written in C). A whole class of bugs and remotely exploitable vulnerabilities can be traced to programmer mistakes in manual memory management, which the C programming language imposes.
Let's explore how to achieve memory safety and type safety within a C program, and hopefully build some primitives that, at a small runtime cost, can offer those nice guarantees to client application code.
What is Memory Safety?
A program exhibits memory-unsafe behavior when it:
- Allows reading from a memory region outside of intended buffer boundaries.
- Allows writing to a memory region outside of intended buffer boundaries.
- Allows reads of uninitialized variables, which may expose whatever data was previously stored at that memory location.
- Allows dangling pointers, which may point to an uninitialized or deallocated memory region.
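As a rough illustration of the first two items, here is a minimal sketch of a bounds-checked buffer view. The names `span_t`, `span_make`, `span_get` and `span_set` are hypothetical, made up for this example, and are not part of any standard library. The idea is simply to carry the length alongside the pointer and refuse any access that falls outside it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical "fat pointer": a base address paired with its length. */
typedef struct {
    uint8_t *data;
    size_t   len;
} span_t;

/* Build a span over an existing buffer. */
static inline span_t span_make(uint8_t *data, size_t len)
{
    assert(data != NULL || len == 0); /* a NULL base is only valid when empty */
    span_t s = { data, len };
    return s;
}

/* Read one byte; returns false instead of reading out of bounds. */
static inline bool span_get(span_t s, size_t i, uint8_t *out)
{
    if (out == NULL || i >= s.len)
        return false;
    *out = s.data[i];
    return true;
}

/* Write one byte; returns false instead of writing out of bounds. */
static inline bool span_set(span_t s, size_t i, uint8_t value)
{
    if (i >= s.len)
        return false;
    s.data[i] = value;
    return true;
}
```

With a 4-byte buffer wrapped by `span_make(buf, sizeof buf)`, `span_set(s, 3, 42)` succeeds while `span_set(s, 4, 1)` returns false and leaves memory untouched. The cost is one comparison per access, which is the kind of small runtime cost mentioned earlier.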
Memory unsafety is also related to the management of memory resources:
- Stack exhaustion, caused by running out of stack space through deep recursive calls.
- Heap exhaustion, caused by running out of available heap space.
- Double free, invalid free or mismatched free, all of which may corrupt the heap.
- Unwanted aliasing, which occurs when the same memory region is allocated twice for unrelated purposes.
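The double-free item in particular can be softened with a very small helper. The sketch below assumes a hypothetical macro named `SAFE_FREE` (not a standard facility) that frees a block and then clears the pointer that owned it. It relies on the fact that `free(NULL)` is defined to do nothing. The `demo_double_release` function is only there to show the intended usage.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: release the block and clear the owning pointer,
 * so a repeated release degrades to free(NULL), which is a no-op. */
#define SAFE_FREE(p) do { free(p); (p) = NULL; } while (0)

/* Usage sketch: the second SAFE_FREE is harmless because buf is NULL. */
static inline void demo_double_release(void)
{
    /* calloc zero-fills, so no stale data is visible through buf. */
    int *buf = calloc(16, sizeof *buf);
    if (buf == NULL)
        return; /* heap exhaustion is reported by calloc, not assumed away */

    buf[0] = 1;
    SAFE_FREE(buf);
    assert(buf == NULL);
    SAFE_FREE(buf); /* would be a double free with a plain free(buf) */
}
```

Note the limit of this approach: it only clears the one pointer variable it is given. Any other copy of the same address still dangles, so this is a mitigation rather than a guarantee.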
What is Type Safety?
Type-unsafe behavior occurs when the underlying representation of a value of one data type is interpreted as a value of another data type.
Vijay Saraswat provides the following definition: “A language is type-safe if the only operations that can be performed on data in the language are those sanctioned by the type of the data.”
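One way to approximate that definition in C is a tagged union whose accessors check the tag before handing out the payload. This is only a sketch under assumed names (`value_t`, `value_from_int`, `value_get_int` and friends are hypothetical), and nothing stops a caller from reaching into the union directly. Still, it shows what "operations sanctioned by the type" can look like when the check happens at run time.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical tagged value: the tag records which union member is live. */
typedef enum { VAL_INT, VAL_DOUBLE } val_tag;

typedef struct {
    val_tag tag;
    union { int i; double d; } as;
} value_t;

static inline value_t value_from_int(int i)
{
    value_t v = { .tag = VAL_INT };
    v.as.i = i;
    return v;
}

static inline value_t value_from_double(double d)
{
    value_t v = { .tag = VAL_DOUBLE };
    v.as.d = d;
    return v;
}

/* Succeeds only when the stored value really is an int. */
static inline bool value_get_int(const value_t *v, int *out)
{
    assert(v != NULL && out != NULL);
    if (v->tag != VAL_INT)
        return false;
    *out = v->as.i;
    return true;
}

/* Succeeds only when the stored value really is a double. */
static inline bool value_get_double(const value_t *v, double *out)
{
    assert(v != NULL && out != NULL);
    if (v->tag != VAL_DOUBLE)
        return false;
    *out = v->as.d;
    return true;
}
```

A value built with `value_from_double(1.5)` can be read back through `value_get_double`, while `value_get_int` on the same value returns false rather than reinterpreting the bytes of the double as an integer.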