We have stuffed up – again.
This continues the data security theme that I started in Personal data safe at rest, but this time the focus is on data that is in memory or on a wire. There is a lot to cover here, so in this article, I’m going to start with the most important security principle of all.
It’s not patching your software, using encryption, training your staff, monitoring, or … It’s not even doing all of these. Every one of them is important, but every one of them is imperfect. If they involve a human, then very imperfect. The most important principle is defence-in-depth. This means having at least two strong defences against any attack, with monitoring to detect when either is breached.
There are people out there whose day job is to steal your data and subvert the purpose of your system. In the last decade of rising world tensions, we have seen attacks on education institutions, weather bureaus, ambulance services, fire brigades, real estate agents, charities, sporting clubs, and so on – hardly “high-value targets”. The days of thinking, “I’m just a <insert unimportant business>, no one will want to hack me”, are truly over. A decade ago, this discussion would have been directed at architects of banks, government, major retailers, and public infrastructure providers. There is now overwhelming evidence that this is no longer enough. If you think you are safe, read the report on the ANU breach. I want to congratulate ANU for releasing this report so that we on the good side know what we are up against – it’s rare for organisations to show that courage.
This change means the tools and techniques formerly only applied to super-serious developments now have to be wielded by all architects. As a profession, we need to take a different approach. We don’t all have the time or skill to do threat analyses and design defences. It’s not feasible to turn all software architects into security experts – however, I firmly believe that every person with that title needs to be able to design secure systems. Teaching our security experts how to design software systems is also not feasible.
So in this series of articles, I will outline around 50 blueprints that software architects can use to build more secure systems. Together they will give multiple defences against known attacks. You can apply them without fully understanding why you need them, or exactly how they work. Few GPs understand exactly how ibuprofen works, and none prescribe it by starting with the premise of prostaglandin over-production. But they all (I hope) effectively use that drug without knowing that. That’s where we need to be.
Defence-in-depth vs soft-centre
Feudal Japan was a dangerous place and the lingering reminders are the castles. This one, Matsumoto-jo, is an icon of Japan. Many tourists ascend to the sixth floor and imagine the princess safe there as the battle rages around. However, that is not the castle – it’s merely the inner sanctum, the rest has long gone.
The original castle plan shows three moats, four walls and nine gates. It does not show the few thousand armed blokes who manned them, the nine smaller castles in the surrounding hills, or the ninja out spying on the rivals.
Muromachi period Japanese engineers really got the defence-in-depth thing:
- An attacker has to breach multiple barriers.
- There are multiple types of barriers.
- Each barrier is actively monitored and defended.
- An attacker has to breach multiple narrow gates to get the prize.
Interestingly, Matsumoto castle fell without a battle in 1872, when Japan decided to drop feudal government. They didn’t see that one coming, did they?
In the 1930s, the French War Minister, André Maginot, built an immensely strong defence wall on France’s 200km German border. It comprised forty-five grand forts, ninety-seven small forts, and 5000 blockhouses, all backed by heavy weapons and 500,000 men. To no avail – in 1940, attacking aircraft flew over the Maginot Line, tanks drove around the northern end, and Paris fell one month later. The Line itself was never seriously breached, and when the armistice was signed, the French Army ordered their soldiers to abandon their posts.
This defence is called “Hard-shell, soft-centre”, or just “soft-centre”. It involves:
- expending excessive effort on an outer defence,
- under-estimating the attacker’s ability to breach or bypass that defence,
- giving leaders a (false) sense of security, so that they under-resource other defences,
- suffering disaster when the outer defence is breached.
Let’s move from throwing sharp things to the attacks and defences for software systems – particularly software accessed over the Internet.
Defence-in-depth IT example
What is a defence in an IT system? A database cannot put stuff on the Internet by itself; that needs code. So an attacker must do one of the following:
- trick your code into thinking they are a valid user, and then your code will give them the data,
- trick your code into executing their code,
- find a back door to access the database.
Attackers are clever people attracted to crime by jingoism, greed, or poverty, who can practise for months or years to get their code into your system. One of the simplest tricks is an “Injection Attack”, where they disguise their code as data and then trick you into executing that data. It usually involves interpreted languages, such as shell, SQL, PHP, and HTML, where there is a lower barrier between data and code. Compiled languages are much harder to attack, but it can be done: C/C++ buffer overflows and JVM class loader attacks are examples. For this example, I will use SQL Injection, which after 20 years is still OWASP’s 3rd top vulnerability (see what I mean about being ashamed 😳). A proposed defence could be:
- A policy to only use prepared statements. ⇒ That’s a defence without defenders, so it hardly counts.
- Code review emphasising prepared statements. ⇒ That’s weak defenders – it’s hard to eyeball some SQL injections. However, at this stage, you actually do have a defence.
- A static code scan tool (like PHP Progpilot) in your CI chain. ⇒ You now have one strong defence.
There are now three forces in play, policy, review, and a tool. So is that defence in depth? No – it’s all just static code analysis, based on one understanding of an attack, just like additional weapons in a Maginot fort or deepening a Matsumoto moat. Stop and think now about how this can be breached. Have you thought of at least two ways yet? To get defence-in-depth, you need to add a second (or third) defence, such as:
- A WAF, such as AWS WAF or Azure WAF, that detects attack strings.
- Narrow API with input validation – do you really need to support Smith’+OR+1=1==’ as a name? (in your zeal, don’t prejudice O’Brien, Berners-Lee, or 习近平)
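To make the layering concrete, here is a minimal Python sketch of two independent defences from the lists above: a prepared statement and narrow input validation. The `users` table, the functions `make_demo_db`, `validate_name` and `find_user`, and the name pattern are all hypothetical, chosen only for illustration. The pattern in particular is an assumption, and real-world names will need a more generous rule.

```python
import re
import sqlite3
import unicodedata

# Hypothetical rule: letters from any script, optionally joined by a single
# space, apostrophe or hyphen. Digits, '=', '+' and ';' never match.
NAME_RE = re.compile(r"[^\W\d_]+(?:[ '’\-][^\W\d_]+)*")
MAX_NAME_LEN = 100


def validate_name(raw: str) -> str:
    """Defence 2: narrow input validation. Raises ValueError on anything odd."""
    name = unicodedata.normalize("NFC", raw.strip())
    if len(name) > MAX_NAME_LEN or NAME_RE.fullmatch(name) is None:
        raise ValueError("name rejected by input validation")
    return name


def find_user(conn: sqlite3.Connection, raw_name: str) -> list:
    """Look up a user with both defences in place."""
    name = validate_name(raw_name)
    # Defence 1: a prepared (parameterised) statement. The value is bound as
    # data and is never spliced into the SQL text.
    cursor = conn.execute("SELECT id, name FROM users WHERE name = ?", (name,))
    return cursor.fetchall()


def make_demo_db() -> sqlite3.Connection:
    """Build a throwaway in-memory database for experimenting."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany(
        "INSERT INTO users (name) VALUES (?)", [("O'Brien",), ("Smith",)]
    )
    return conn
```

Either defence on its own stops a string like `Smith' OR 1=1 --`: validation rejects it, and the bound parameter treats it as a (non-matching) name rather than as SQL. That is the property to look for: remove one, and the other still holds.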
A basic test for defence-in-depth is ‘If this defence is absent, can my system now be breached?’. If the answer is yes, you are relying on a single defence. Now let’s move on to how to build defence-in-depth.
To cloud or not to cloud
There really is a fork in the road here. Cloud makes it much easier to build secure systems, as your cloud vendor has already instantiated many security patterns, provides robust tooling, and is very committed to security in general. The biggest AWS customer is Netflix, spending $300M annually for a $5B profit. However, that is small compared to AWS’s $20B profit. So the cloud vendor has more skin in the security game than any of their customers.
Why does this matter? Cloud vendors give you, as a software architect, a solid infrastructure with a lot of high-quality tooling that helps you to build secure software. I think it’s fair to say that no on-prem environment can match that, and most would not even come close. As time progresses that gap is getting wider – the clouds are getting better, and the on-prem vendors are losing market share and investment money. By on-prem, I mean in a Tier 3+ data centre – in your office/school/factory/club, security is impossible. To make this clear – anything you store on your business premises can be stolen.
So this is the first blueprint:
C1 – Build on a major public cloud. If you use AWS or Azure, you have excellent services, tooling, easy access to IT skills, and online support. The next tier clouds, Google, Alibaba and IBM, look solid enough, but the support pool is much shallower. The cloud offerings from the top vendors are similar, and my blueprints often rely on you using one. When I refer to a tool, I will include AWS and Azure links.
A defence-in-depth framework
This is the overall model for defence-in-depth. There are two layers of defence between the Internet and your business code, and only your business code has access to the precious data. Looking at each as a blueprint item:
D1. Internet access. Use a configuration of your cloud vendor’s front-end tech stack to build your Internet interface. There is none of your code here. This means that your code never has Layer 7 access to the Internet, either incoming or outgoing, for any protocol – HTTPS, email, SMS … This defence:
- supports TLS 1.2+ (only) – the cloud stack will fully manage the TLS protocol and your private key (this is hard to get right, and clouds are good at it),
- mitigates DoS attacks, protocol-level attacks and some message content attacks,
- is your first layer of flow control, where you can defend against things like excessive login attempts,
- is monitored so that you can check for valid and invalid usage patterns.
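The flow-control item above is normally a configuration setting in the cloud front-end, not code you write. Purely to illustrate the logic such a rule applies, here is a small Python sketch of a sliding-window limit on login attempts. The class `LoginRateLimiter`, its parameters, and the choice of client key are hypothetical and are not any vendor’s implementation.

```python
from collections import defaultdict, deque


class LoginRateLimiter:
    """Allow at most max_attempts per client within a sliding time window."""

    def __init__(self, max_attempts: int = 5, window_seconds: float = 300.0):
        self.max_attempts = max_attempts
        self.window_seconds = window_seconds
        self._attempts = defaultdict(deque)  # client key -> attempt timestamps

    def allow(self, client_key: str, now: float) -> bool:
        """Return True if this attempt may proceed; `now` is in seconds."""
        attempts = self._attempts[client_key]
        # Forget attempts that have aged out of the window.
        while attempts and now - attempts[0] >= self.window_seconds:
            attempts.popleft()
        if len(attempts) >= self.max_attempts:
            return False  # over the limit: block, and do not record it
        attempts.append(now)
        return True
```

In use, the caller would pass something like `time.monotonic()` as `now` and an IP address or account name as `client_key`. Every blocked attempt is also a monitoring signal worth logging.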
D2. Front-end. The Internet Access defence will give your code messages that probably won’t have protocol-level attacks, but could have message-level attacks – attacks like SQL Injection, intentionally malformed JSON, infected images, attack cookies and such. This blueprint item requires you to build a distinct front-end function that:
- implements an API for your UIs and your customers (if that is something you offer),
- defends against message-level attacks,
- protects personal data for the rest of its journey through your system.
Get your expert developers to build this, and apply strong quality gating (e.g. git branch protection), to make sure it is built well and maintained well. Don’t be tempted to just build this into your business code – the place where you have DB passwords, encryption keys and all your developers. It needs to be isolated, and it’s not amenable to rapid release.
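As an illustration of a message-level defence, here is a minimal Python sketch of strict parsing for one JSON message. The message shape (`name`, `quantity`), the limits, and the names `parse_order` and `RejectedMessage` are all assumptions made for the example. A real front-end would more likely use a schema-validation library.

```python
import json

MAX_BODY_BYTES = 4096  # hypothetical size limit
ALLOWED_FIELDS = {"name": str, "quantity": int}  # hypothetical message shape


class RejectedMessage(ValueError):
    """Raised for any message that is not exactly what the API expects."""


def _no_duplicate_keys(pairs):
    # Duplicate keys are a classic way to confuse two parsers in a chain.
    keys = [key for key, _ in pairs]
    if len(keys) != len(set(keys)):
        raise RejectedMessage("duplicate key")
    return dict(pairs)


def parse_order(body: bytes) -> dict:
    """Parse and validate one message, rejecting anything unexpected."""
    if len(body) > MAX_BODY_BYTES:
        raise RejectedMessage("body too large")
    try:
        message = json.loads(
            body.decode("utf-8"), object_pairs_hook=_no_duplicate_keys
        )
    except (UnicodeDecodeError, json.JSONDecodeError, RecursionError) as exc:
        raise RejectedMessage("not valid UTF-8 JSON") from exc
    if not isinstance(message, dict):
        raise RejectedMessage("top level must be an object")
    if set(message) != set(ALLOWED_FIELDS):
        raise RejectedMessage("unexpected or missing fields")
    for field, expected_type in ALLOWED_FIELDS.items():
        value = message[field]
        # bool is a subclass of int in Python, so exclude it explicitly.
        if isinstance(value, bool) or not isinstance(value, expected_type):
            raise RejectedMessage(f"wrong type for {field}")
    if not 1 <= len(message["name"]) <= 100:
        raise RejectedMessage("name length out of range")
    if not 1 <= message["quantity"] <= 1000:
        raise RejectedMessage("quantity out of range")
    return message
```

The design choice is to accept a narrow, known-good shape and reject everything else, instead of trying to recognise bad input.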
D3. API design. HTTPS APIs can be hugely complex, with lots of options and choices. Some of those are very difficult to secure. API defence is designing fully functional APIs that you can secure with reasonable effort.
D4. Business code. This is where your business functions are executed. It could use messaging, microservice, monolith, … whatever you think works for your business problem. On the whole, this is a safer development environment where all your development teams can work, and (if it suits your business) you can do a rapid-release CI/CD pipeline. However, there are rules – the most important is “Don’t invent anything – use the tool kit”. As for accessing the data itself, I covered that in Personal data safe at rest.
D5. Back-end. Generally speaking, outbound connections are less risky than inbound. However, there are risks, and you need a defence against them. Outgoing can be HTTPS (e.g. webhook callbacks, image fetching, business services) or human – email, SMS, or push notifications. The risks for outbound connections are exfiltration of data, annoying the destinations, and Server Side Request Forgery. The back-end defence is where you put those protections. Like the front-end defences, this is the territory for your experts.
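One of those protections, a check against Server Side Request Forgery, can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the allowlist entry `hooks.example.com`, the function names, and the injectable `resolver` parameter are hypothetical, and a production defence would also have to deal with redirects and with DNS answers changing between the check and the connection.

```python
import ipaddress
import socket
from urllib.parse import urlsplit

ALLOWED_HOSTS = {"hooks.example.com"}  # hypothetical allowlist of destinations


def is_public_address(address: str) -> bool:
    """True only for addresses outside private, loopback and similar ranges."""
    ip = ipaddress.ip_address(address)
    # Unwrap IPv4-mapped IPv6 (e.g. ::ffff:127.0.0.1) before checking.
    if ip.version == 6 and ip.ipv4_mapped is not None:
        ip = ip.ipv4_mapped
    return not (
        ip.is_private or ip.is_loopback or ip.is_link_local
        or ip.is_multicast or ip.is_reserved or ip.is_unspecified
    )


def check_outbound_url(url: str, resolver=socket.getaddrinfo) -> list:
    """Return the vetted IP addresses for url, or raise ValueError."""
    parts = urlsplit(url)
    if parts.scheme != "https":
        raise ValueError("only https is allowed outbound")
    if parts.username or parts.password:
        raise ValueError("credentials in URL are not allowed")
    host = parts.hostname
    if host is None or host not in ALLOWED_HOSTS:
        raise ValueError("destination host is not on the allowlist")
    port = parts.port or 443
    # Each getaddrinfo result ends with a sockaddr whose first item is the IP.
    results = resolver(host, port, type=socket.SOCK_STREAM)
    addresses = {info[4][0] for info in results}
    if not addresses or not all(is_public_address(a) for a in addresses):
        raise ValueError("destination resolves to a non-public address")
    return sorted(addresses)
```

The intent is that the caller connects to one of the returned addresses and does not resolve the name a second time, so the address that was checked is the address that is used.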
D6. Tool kit. As a modern architect, you typically delegate the business coding and design to the development teams. That’s how you get fast time-to-market and can scale out development. However, for the system to be reliable, secure and performant, you will want to enforce a series of policies about how code is developed. For example, you might require connections to external systems to use a Nygard Circuit Breaker, or support your policy for DB connection management, system configuration, or how to MAC or encrypt data.
You can (and should) write policy documents about each of these, but your primary control is through the toolkit. Each policy should have accompanying code that makes it simple for a dev team to implement that policy. This means that the easiest way for a team to solve a problem is your way. Unlike business code, architectural policies are cross-cutting and need consistent solutions as they are often hard to get right. That includes all your security policies.
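To show what “accompanying code” for a policy might look like, here is a minimal Python sketch of a circuit breaker in the spirit of Nygard’s pattern. The class names, thresholds and the injectable `clock` are assumptions for illustration. It is single-threaded and deliberately small, so treat it as a starting point and not a finished toolkit component.

```python
import time


class CircuitOpenError(RuntimeError):
    """Raised when a call is skipped because the circuit is open."""


class CircuitBreaker:
    """Stop calling a failing dependency until a cool-off period has passed."""

    def __init__(self, failure_threshold=3, reset_timeout=30.0,
                 clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self._clock = clock
        self._failures = 0
        self._opened_at = None  # None means the circuit is closed

    def call(self, func, *args, **kwargs):
        """Run func through the breaker and return its result."""
        if self._opened_at is not None:
            if self._clock() - self._opened_at < self.reset_timeout:
                raise CircuitOpenError("circuit open; call skipped")
            # Cool-off elapsed: half-open, so let one trial call through.
        try:
            result = func(*args, **kwargs)
        except Exception:
            self._failures += 1
            half_open = self._opened_at is not None
            if half_open or self._failures >= self.failure_threshold:
                self._opened_at = self._clock()  # open, or re-open
            raise
        # Success closes the circuit and clears the failure count.
        self._failures = 0
        self._opened_at = None
        return result
```

A dev team would then write something like `breaker.call(fetch_rates, currency)` (both names hypothetical) instead of calling the external system directly, which makes following the policy the path of least resistance.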
This code, along with Front-end and Back-end, should be built by your best developers and subject to the most rigorous review and quality gating processes. It’s probably not amenable to rapid release.
D7. No user access to data in DB. This is well covered in Personal data safe at rest. This diagram shows what your carefully crafted defence-in-depth looks like when you allow human access to data. If you look at the recent attacks, essentially all of them have a staff member tricked into giving access. The way people work – emails, social media, attention lapses, spreadsheets – means that a data path involving people is very vulnerable. It bypasses all your strong defences, has essentially zero egress control, and allows malware to be planted. ‘No data access’ is most critical for the human DBAs and operations staff – as their credentials (in the hands of the attacker) are more powerful, but the humans who own those credentials are just as fallible as anyone. Don’t be lulled into “it’s only READ access” – READ is the most dangerous access in the Personal Data war. I don’t know how to secure this mess, and I don’t think anyone else does either – so don’t build it.
In conclusion
Now we have a framework for the defence of your precious data. In future articles, I will give blueprints for each of these defences. If you follow this structure, you will have a robust system. That does not mean it cannot be breached – any system can be breached with enough effort. Currently, the state of software defence is woeful. 13,000 WordPress sites get breached every day, but that’s at the low end. At the other end, Heartland (one of the largest US payment processors) lost more than 100M credit card numbers, the US Department of Defense lost the plans to the F-35 fighter, and Marriott lost 5 million personal records …
I am sick of seeing a hand-wringing CEO surprised they were attacked – surprised that phishing worked even though they had an anti-phishing policy, their technical experts worked hard, the attack was sophisticated, they had ‘best practice’, and they promise to do better now. Instead of the hand-wringing, I want to see fines and CEO sackings. Their job is to resource this; our part is to build it.