What a horrible mess. Privacy-destroying bugs like Cloudbleed disturb me greatly, since they highlight just how much leverage simple programming errors have in the modern era. They can expose huge amounts of data and impact very large numbers of people.

This incident started at Cloudflare, probably with just one individual, who made a one-line programming error in their use of an HTML parser library. Then, because of increasing centralisation on the web, thousands of online services that use Cloudflare were affected by it. In turn, tens of millions of people (think Uber) who use those services had their privacy breached. That one-liner reached out and touched a frightening percentage of internet users.

Without taking steps to remedy and prevent these kinds of errors, we are heading towards a privacy apocalypse, where all private data becomes public.

So, how do we prevent errors like this from occurring in the future?

Well, it's very difficult.

Even the use of better tools, techniques and languages is no panacea. It's like airplanes: you can build the safest one in the world, and somebody will still find a way to fly it into the ground.

I think part of the answer is more rigorous handling and investigation of these types of events, along with a reclassification and reframing. Instead of thinking of them as simple bugs, we should treat them as international safety and security accidents and respond accordingly.

If these large scale data breaches continue to occur, and my bet is that they will, a regulatory response will become necessary. As the tech industry matures, we will see the need for analogues of the FAA and NTSB. This means organisations like NIST, CERT and various Cyber Security Centres, but with investigative, regulatory and punitive powers.

In the meantime, we need to rely on the private sector to police itself. I applaud Google for running Project Zero, and we need to see much more work along these lines.

So, what practical things can companies, teams and individual programmers working on network services do to prevent these breaches?

Well, here is a list of things you can do, in roughly descending order of cost:

  • Consider safer languages and runtimes
  • Make your systems as deterministic as possible and probe for non-determinism regularly
  • Employ full-time hackers like Tavis Ormandy
  • Invest in static analysis
  • Fuzz your systems (a minimal harness is sketched after this list)
  • Use better tooling like Valgrind and ASAN
  • Run a small percentage of your services with asserts, debugging and tracing enabled, and look for anomalies
  • Log and capture statistics from your services, and regularly look for anomalies
  • Randomly inspect wire captures between modules in your systems and look for canaries
  • Add more asserts, like a lot more
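
To make the fuzzing and assert items a little more concrete, here is a minimal sketch in C (the language the affected parser was generated into). It is not Cloudflare's code: the count_attrs function and its input format are invented for illustration. The LLVMFuzzerTestOneInput entry point and the -fsanitize=fuzzer,address build flags are standard Clang/libFuzzer features, so ASan will turn any out-of-bounds read the fuzzer finds into an immediate crash.

    /*
     * Hypothetical sketch: a tiny attribute scanner with explicit bounds
     * checks and asserts, wrapped in a libFuzzer harness. The scanner is
     * made up for illustration; the point is the `p < end` comparison
     * that a Cloudbleed-style `p == end` test can be stepped over, plus
     * a fuzz entry point that ASan can drive.
     *
     * Build (with clang): clang -g -fsanitize=fuzzer,address sketch.c
     */
    #include <assert.h>
    #include <stddef.h>
    #include <stdint.h>

    /* Count '=' signs that appear before the closing '>' of a tag.
     * Purely an illustrative stand-in for a real HTML attribute parser. */
    static size_t count_attrs(const uint8_t *buf, size_t len)
    {
        const uint8_t *p   = buf;
        const uint8_t *end = buf + len;
        size_t attrs = 0;

        /* Use `p < end`, never `p == end`: if any branch ever advances
         * `p` by more than one byte, an equality test can be skipped
         * past and the loop will happily read beyond the buffer. */
        while (p < end && *p != '>') {
            if (*p == '=')
                attrs++;
            p++;
            /* Cheap invariant check; in a debug/canary fleet this fires
             * long before an overrun becomes a data leak. */
            assert(p <= end);
        }
        return attrs;
    }

    /* libFuzzer entry point; ASan flags any out-of-bounds access. */
    int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size)
    {
        (void)count_attrs(data, size);
        return 0;
    }

Even on a mature codebase, harnesses like this are cheap to add around any function that takes a buffer and a length, which is why fuzzing sits in the middle of the cost ordering rather than at the expensive end.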

Obviously, the cost of some of these things will vary, depending on how far along your project is when you start doing them. Nevertheless, the list should provide a good starting point for thinking about how to build your software with security in mind.

If you are a smaller company without a lot of resources, do as many of the cheaper things on the list as your budget allows. If you are a larger, multi-billion-dollar company, you should probably be doing all of them.

Furthermore, if you are a large multi-billion-dollar company that suffers a catastrophic breach, and it can be shown that you weren't doing these things, at what point does it look less like an accident and more like negligence?

Update: Discuss on lobste.rs