Thoughts on Software Correctness

2023-05-15

Correct code

1. DOES what you intend for it to do; AND

2. Does NOT do what you don't intend for it to do.

Although these two statements may seem equivalent, they are two very different concerns when it comes to software systems.

Customers -- and as a consequence, the business -- mostly focus on the first half. A customer will ask themselves: What does this software do for me? What features does it have? How can it deliver value to me?

Why software correctness is mission-critical

The second half, despite not getting the spotlight, is arguably at least as important as the first. Let me emphasize why:

There is NO value in an instant messaging app that allows users to chat one-to-one, but also allows an attacker to read any private conversations.

There is NO value in a banking app that allows a customer to transfer funds out of their account, but also allows attackers to transfer funds out of any customer's account.

There is NO value in a document or image editor that works most of the time, but then freezes and loses all your work once a week.

While a huge amount of resources is dedicated to adding more features to any current piece of software, ensuring reliability and correctness is often an afterthought.

Why modern software often has bugs

Modern software development has settled on shipping bugs directly to customers, letting them be affected, then fixing the holes as they get noticed.

While this has proven to deliver more value than the alternative (such as attempting to ship something “perfect”, but getting into the market later than the competition -- or failing to ship anything), the costs have also been high.

By not taking the time to fully understand all the ramifications of a piece of code while trying to get something done, a software developer may unknowingly introduce a serious security vulnerability that later gets exploited.

Doesn't cryptography solve everything?

Note that cryptography is no silver bullet here. Even when properly employed, it only secures the communication channel: it ensures that your information travels from your device (assumed to be trusted) to the service's server (also assumed to be trusted) without being a) read by third parties or b) tampered with midway.

What it does NOT do is:

Guarantee that your device has not been compromised.
Guarantee that the server (or any software running on it) has not been compromised.
Guarantee that the server will not be compromised in the future.
Guarantee that the server is responsibly storing sensitive or private information.
Guarantee that the server will not involuntarily leak private information or perform an unintended operation due to incorrect software.
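
To make that last item concrete, here is a minimal Python sketch of the banking example from earlier. All names in it (Account, AuthorizationError, transfer) are hypothetical and made up for illustration; they do not come from any real banking system. The request could arrive over a perfectly encrypted channel, yet whether funds can be moved out of someone else's account depends only on the ownership check in the code.

```python
from __future__ import annotations

from dataclasses import dataclass


class AuthorizationError(Exception):
    """Raised when a user acts on an account they do not own (hypothetical)."""


@dataclass
class Account:
    owner: str
    balance: int  # in cents, to avoid floating-point rounding


def transfer(accounts: dict[str, Account], user: str,
             source_id: str, target_id: str, amount: int) -> None:
    """Move `amount` cents from source to target on behalf of `user`."""
    source = accounts[source_id]
    target = accounts[target_id]
    # Second half of correctness: refuse everything that is not intended.
    if source.owner != user:
        raise AuthorizationError("user does not own the source account")
    if amount <= 0:
        raise ValueError("amount must be positive")
    if amount > source.balance:
        raise ValueError("insufficient funds")
    # First half of correctness: do what is intended.
    source.balance -= amount
    target.balance += amount
```

Removing the three checks would leave a function that still "works" for honest customers, which is why this class of bug tends to go unnoticed until someone exploits it.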

Who can address these issues?

The responsibility of preventing each of those scenarios falls to a few different parties:

Platform makers must design secure-by-default systems, defaulting to the “least access required” security model, properly warning and informing the user of anything that deviates from what one might reasonably expect.

Software supply-chain providers must ensure the distributed software is signed and has not been tampered with, and must quickly make updates available as vulnerabilities are disclosed and fixed.
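
On the consuming side, the simplest form of tamper detection is comparing a downloaded artifact against a digest the provider published through a separate, trusted channel. The sketch below is a simplification under that assumption: it checks a SHA-256 digest only, which is weaker than verifying a real public-key signature, and the function name matches_published_digest is hypothetical.

```python
import hashlib
import hmac


def matches_published_digest(payload: bytes, expected_hex: str) -> bool:
    """Return True if the SHA-256 of `payload` equals the published hex digest."""
    actual_hex = hashlib.sha256(payload).hexdigest()
    # hexdigest() is lowercase, so normalize the published value before comparing.
    return hmac.compare_digest(actual_hex, expected_hex.lower())
```

A digest check like this only helps if the expected value itself comes from somewhere the attacker could not also modify.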

But even on secure platforms, with properly supplied and up-to-date dependencies, consequences undesirable to customers and to the business may occur. Software will do as it is told, so the final and most critical line of defense is writing correct software.

What can I, as a software developer, do to address these issues?

Software makers must:

Ensure their programs do NOT misbehave when presented with malicious data payloads (which brings us back to point 2 at the start of this article).

Reasonably secure customer data, and design information silos that minimize the repercussions of data leaks, assuming the system will at some point be compromised.

NOT blindly trust authentication schemes or encryption, but aim to fully understand the extent to which they protect what they are trying to protect, grasping the limitations and the non-goals of such approaches.
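
As an illustration of the first item, here is a minimal Python sketch of validating an untrusted payload before acting on it. The payload shape (a JSON object with an integer "amount" field), the function name parse_amount and the MAX_AMOUNT_CENTS limit are all assumptions made up for this example. The idea is to accept only the narrow set of inputs the program is meant to handle and to reject everything else explicitly.

```python
import json

MAX_AMOUNT_CENTS = 1_000_000  # hypothetical business limit


def parse_amount(raw: bytes) -> int:
    """Return the validated amount from an untrusted JSON payload, or raise ValueError."""
    try:
        data = json.loads(raw)
    except (json.JSONDecodeError, UnicodeDecodeError) as exc:
        raise ValueError("payload is not valid JSON") from exc
    if not isinstance(data, dict):
        raise ValueError("payload must be a JSON object")
    amount = data.get("amount")
    # bool is a subclass of int in Python, so reject it explicitly.
    if not isinstance(amount, int) or isinstance(amount, bool):
        raise ValueError("amount must be an integer")
    if not 0 < amount <= MAX_AMOUNT_CENTS:
        raise ValueError("amount out of range")
    return amount
```

This sketch does not cover every hostile input (for example, it does not limit payload size or nesting depth), so a real service would likely need further checks around it.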

So have you considered what your software does for the full range of data it might receive?