I gave a lightning talk on how SQL Lighthouse uses a bouncy castle to protect user passwords. The video isn’t on YouTube yet!
My next blog post will be on the details of HTTPS, but before we get to that, here’s a quick cryptography primer.
Alice, Bob, Eve, and Mallory
In cryptography discussions, four characters are commonly used: Alice, Bob, Eve, and Mallory.
Alice and Bob want to send messages to each other.
However, Alice and Bob are not alone in the world.
There is also Eve, who is capable of eavesdropping on messages whilst they are in transit. Because Alice and Bob don’t want Eve to be able to read their messages, they use cryptography to encrypt all messages before they are sent, and to decrypt them when they are received. If Eve intercepts a message whilst it is in transit, it will be encrypted, and because Eve doesn’t know how to decrypt it, she will be unable to read it.
And there is also Mallory, who is capable of sending messages with a faked “from” address. Because Alice and Bob don’t want Mallory to be able to impersonate them, they use cryptography to sign all messages before they are sent, and to verify them when they are received. If Mallory sends a message with a faked “from” address, then because Mallory doesn’t know how to sign it, it won’t pass verification, and Alice and Bob will know it’s been faked.
In order to communicate with each other in this way, Alice and Bob need to know how to encrypt, decrypt, sign, and verify messages.
Because Alice and Bob are good software engineers, who believe in separating concerns, they are going to achieve this by using two separate algorithms: an encryption/decryption algorithm and a signature/verification algorithm.
Alice and Bob also believe in code reuse, so they’re not going to write their own algorithms from scratch. Instead, they’re going to choose algorithms that already exist and have been thoroughly peer reviewed.
However, as we’ve seen, Eve mustn’t know how to decrypt messages and Mallory mustn’t know how to sign messages.
Because there are only a small number of suitable algorithms to choose from, it’s possible for Eve and Mallory to guess which algorithms Alice and Bob have chosen, so the secrecy can’t lie in the algorithm itself.
To achieve the required secrecy, the algorithm takes an additional input, a key, which is a shared secret between Alice and Bob; Eve and Mallory must not know it.
One way for Alice and Bob to establish this shared secret is to meet up in person, far away from Eve and Mallory, and pick a random pre-shared key that only Alice and Bob know.
Alice and Bob also need to choose an encryption/decryption algorithm and a signature/verification algorithm.
To keep things simple, we’re going to start with symmetric cryptography, so called because both sides use the same shared secret key, which both Alice and Bob know.
There are many symmetric signature/verification algorithms; one example is called HMAC.
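Using Python’s standard library, the sign/verify step might look like this minimal sketch (the key and messages are made up for illustration):

```python
import hashlib
import hmac

# Hypothetical pre-shared key, known only to Alice and Bob.
KEY = b"alice-and-bob-pre-shared-key"

def sign(message: bytes) -> bytes:
    # Alice tags the message with an HMAC-SHA256 code.
    return hmac.new(KEY, message, hashlib.sha256).digest()

def verify(message: bytes, signature: bytes) -> bool:
    # Bob recomputes the HMAC and compares in constant time.
    return hmac.compare_digest(sign(message), signature)

tag = sign(b"Hi Bob, it's Alice")
assert verify(b"Hi Bob, it's Alice", tag)
# Mallory doesn't know KEY, so a forged tag fails verification.
assert not verify(b"Hi Bob, it's Mallory", tag)
```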
For symmetric encryption/decryption algorithms, there are two general categories: stream ciphers and block ciphers.
Stream ciphers, for example RC4, typically use a cryptographically secure pseudorandom number generator to generate a sequence of random numbers that is as long as the message being sent. To encrypt/decrypt a message, it is then simply XORed with this sequence.
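The XOR idea can be sketched with a toy keystream built by hashing a counter with the key. This construction is illustrative only; real stream ciphers such as RC4 (or, today, ChaCha20) are purpose-built for this job:

```python
import hashlib

def keystream(key: bytes, length: int) -> bytes:
    # Toy keystream: hash the key together with a counter until we have
    # enough bytes. Purely illustrative, not a vetted cipher.
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]

def xor_crypt(key: bytes, data: bytes) -> bytes:
    # Encryption and decryption are the same operation: XOR with the keystream.
    return bytes(a ^ b for a, b in zip(data, keystream(key, len(data))))

ciphertext = xor_crypt(b"shared-secret", b"attack at dawn")
assert xor_crypt(b"shared-secret", ciphertext) == b"attack at dawn"
```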
In contrast, block ciphers can only encrypt/decrypt fixed-length blocks. For example, AES uses 16-byte blocks.
That said, block ciphers can be made capable of encrypting/decrypting variable-length messages by using one of the many modes of operation, for example CBC.
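The chaining that a mode of operation adds can be sketched with a toy 16-byte “block cipher” standing in for AES. The XOR-with-key cipher here is deliberately trivial and insecure; only the CBC structure (pad, chain each block into the next, reverse on decryption) is the point:

```python
import os

BLOCK = 16

def toy_block_encrypt(key: bytes, block: bytes) -> bytes:
    # Stand-in for a real 16-byte block cipher like AES. XOR with the key
    # is NOT secure; it just keeps the sketch dependency-free.
    return bytes(a ^ b for a, b in zip(block, key))

toy_block_decrypt = toy_block_encrypt  # XOR is its own inverse

def pad(data: bytes) -> bytes:
    # PKCS#7-style padding up to a multiple of BLOCK.
    n = BLOCK - len(data) % BLOCK
    return data + bytes([n]) * n

def unpad(data: bytes) -> bytes:
    return data[:-data[-1]]

def cbc_encrypt(key: bytes, iv: bytes, plaintext: bytes) -> bytes:
    data = pad(plaintext)
    prev, out = iv, b""
    for i in range(0, len(data), BLOCK):
        # Each plaintext block is mixed with the previous ciphertext block.
        mixed = bytes(a ^ b for a, b in zip(data[i:i + BLOCK], prev))
        prev = toy_block_encrypt(key, mixed)
        out += prev
    return out

def cbc_decrypt(key: bytes, iv: bytes, ciphertext: bytes) -> bytes:
    prev, out = iv, b""
    for i in range(0, len(ciphertext), BLOCK):
        block = ciphertext[i:i + BLOCK]
        out += bytes(a ^ b for a, b in zip(toy_block_decrypt(key, block), prev))
        prev = block
    return unpad(out)

key, iv = os.urandom(BLOCK), os.urandom(BLOCK)
ct = cbc_encrypt(key, iv, b"a message longer than one block")
assert cbc_decrypt(key, iv, ct) == b"a message longer than one block"
```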
Finally, some modes of operation, for example GCM, combine both encryption/decryption as well as signature/verification, solving both problems in one algorithm.
In the example above, Alice and Bob were signing and verifying their messages in order to detect fake messages from Mallory. Now imagine that Bob receives a message from Alice that has a valid signature, and he wants to legally prove that Alice did in fact sign it.
The difficulty for the legal proof is that because both Alice and Bob know the shared secret, they are both capable of signing the message, and so there is reasonable doubt as to who signed it — for example Bob could have signed it instead of Alice.
The fix is to remove the shared secret using asymmetric encryption, which is so named because it uses a private key and a public key. These two keys are mathematically connected to each other, but deducing one from the other is designed to be infeasible.
To use asymmetric encryption, Alice creates a public/private key pair. She stores the private key securely so that only she has access to it, and she distributes the public key to everyone else: Bob, Eve, and Mallory.
In asymmetric signature/verification algorithms, for example DSA, to sign a message you need to know the private key, whereas to verify the signature you need to know the public key.
Since only Alice knows her private key, and Bob does not, asymmetric signature/verification algorithms remove the reasonable doubt and prove that Alice signed the message. This property is called non-repudiation.
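The private-key-signs, public-key-verifies property can be illustrated with textbook RSA rather than DSA, since the property is the same. The primes below are tiny and the scheme omits hashing and padding, so this is a sketch of the concept, not a usable signature scheme:

```python
# Textbook RSA with tiny primes; real keys are thousands of bits.
p, q = 61, 53
n = p * q                            # 3233, part of the public key
e = 17                               # public exponent
d = pow(e, -1, (p - 1) * (q - 1))    # Alice's private exponent

def sign(message: int) -> int:
    # Only Alice, who knows d, can compute this.
    return pow(message, d, n)

def verify(message: int, signature: int) -> bool:
    # Anyone who knows the public key (n, e) can check it.
    return pow(signature, e, n) == message

s = sign(65)
assert verify(65, s)
assert not verify(66, s)
```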
As we’ve seen, the symmetric encryption algorithms above need shared secrets, essentially keys, and so far we’ve been assuming that Alice and Bob met up in person and picked a random pre-shared key that only they know. Whilst great for friends, this is clearly untenable for HTTPS in the real world. I visit a lot of HTTPS websites, and I don’t want to spend my life on aeroplanes just to meet their sysadmins to establish pre-shared keys.
One option would be to outsource all this travel to a trusted third party, who would establish the pre-shared keys on my behalf. The problem with this is that every pair of individuals needs a different pre-shared key, so it scales as O(n^2), and also, because the trusted third party knows the pre-shared key, it could conspire with Eve to help her read the messages without detection.
We need another solution to perform key exchange and arrive at a shared secret.
One such solution is Diffie-Hellman key exchange. However, it is only designed to protect against Eve, and it is not secure against Mallory.
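A toy Diffie-Hellman exchange, with a tiny prime in place of the 2048-bit-plus groups used in practice, shows how both sides arrive at the same shared secret without ever sending it:

```python
# Public parameters, known to everyone, including Eve and Mallory.
p, g = 23, 5

a = 6    # Alice's private value, never transmitted
b = 15   # Bob's private value, never transmitted

A = pow(g, a, p)   # Alice sends A over the wire
B = pow(g, b, p)   # Bob sends B over the wire

# Each side combines its own private value with the other's public value.
alice_secret = pow(B, a, p)
bob_secret = pow(A, b, p)
assert alice_secret == bob_secret

# Eve sees p, g, A and B, but recovering a or b is the discrete logarithm
# problem. Mallory, however, can run a separate exchange with each side,
# which is why plain Diffie-Hellman is not secure against her.
```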
The solution that’s commonly used instead involves an asymmetric encryption/decryption algorithm, for example RSA.
With these algorithms, to encrypt a message you need to know the public key, whereas to decrypt a message you need to know the private key.
Just like block ciphers, they can only encrypt/decrypt fixed-length messages, for example 245 bytes long. Although you could technically use the block cipher modes of operation to encrypt/decrypt longer messages, this isn’t done in practice, because asymmetric encryption algorithms are considerably slower than symmetric ones.
In practice, symmetric encryption is used for the message itself, which can be very long, for example if it’s a video, whereas asymmetric encryption is only used for the session key, which is tiny. To send a message, Alice:
- picks a random session key,
- symmetrically encrypts the message using that session key,
- and asymmetrically encrypts that session key using Bob’s public key.
To read it, Bob:
- uses his private key to asymmetrically decrypt the session key,
- and uses that to symmetrically decrypt the message.
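The steps above can be sketched end to end by reusing toy building blocks: textbook RSA with tiny primes for the asymmetric part and a hash-based XOR stream for the symmetric part. Both are illustrative only, not secure constructions:

```python
import hashlib
import os

# Bob's toy textbook-RSA key pair (tiny primes, illustrative only).
p, q, e = 61, 53, 17
n = p * q
d = pow(e, -1, (p - 1) * (q - 1))

def xor_stream(key: bytes, data: bytes) -> bytes:
    # Toy symmetric cipher: XOR with a hash-derived keystream.
    stream = b""
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(x ^ s for x, s in zip(data, stream))

# --- Alice ---
session_key = os.urandom(16)                    # pick a random session key
message = b"a long message, e.g. a video"
ciphertext = xor_stream(session_key, message)   # symmetric: the message
encrypted_key = [pow(byte, e, n)                # asymmetric: just the key
                 for byte in session_key]

# --- Bob ---
recovered_key = bytes(pow(c, d, n) for c in encrypted_key)
assert xor_stream(recovered_key, ciphertext) == message
```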
Now when Alice wants to communicate with Bob, she only needs to know Bob’s public key. There is still the problem of distributing a mapping between identities and public keys, but this can be left to a trusted third party. It scales as O(n), and there is no way for a trusted third party to covertly help Eve — although it could distribute an incorrect mapping, this has a high risk of being caught.
Wikipedia is great for this kind of stuff; here are some links to get you started.
Alice and Bob (but not Eve and Mallory) have a shared secret, which is the key to a symmetric encryption algorithm. Perhaps they’re using a stream cipher, e.g. RC4 with its cryptographically secure pseudorandom number generator, or perhaps they’re using a block cipher, e.g. AES, with perhaps CBC or GCM as its mode of operation.
They need to perform key exchange; maybe they:
- met up in person and picked a random pre-shared key
- used Diffie-Hellman key exchange
- used the asymmetric encryption algorithm RSA with its public keys and private keys and a trusted third party to securely send a session key to each other
I gave a talk at DDD EA!
Designing secure systems is hard. Soon, more and more of us will be working on web apps, and software as a service, so knowing about this stuff matters.
Red Gate is in the process of growing from writing purely standalone desktop apps into writing software-as-a-service offerings in the cloud.
As part of that journey we’ve been making mistakes and learning as we go.
This session is a nice introduction to security, with lots of examples from things that we’ve learnt along the way. It’ll cover the basics of thinking like an attacker, things you might expect your framework to do for you automatically but that it actually doesn’t, like CSRF vulnerabilities, and proposed “features” that might make the software easier to use and more awesome, but that also make an attacker’s job much easier.
Okay, so this talk may actually fit in five minutes :)
So, this is a talk about Security 101: Just don’t actually do it.
So, the background for this is that there was a post by Daniel on Yammer which was basically: we’re writing a piece of code in SQL Server Monitor Hosted, and we need to know how to do something in a secure way. There were a whole 12 replies, and people came up with something, and then I found a blog post where Google had basically solved the same problem, only they solved it a different way, because if you do it the way we concluded, it leaves you open to a misinterpretation attack. Because that’s quite complicated to explain, I’m going to pick a different example instead.
So, the point of this talk is to show you that something that seems so trivial and such a good idea actually is not.
So, in this hypothetical world in my example, you’re working for a company that has a couple of products. It’s got a web browser which is used regularly by 45% of people on the internet, and it’s got a web server which is visited by 90% of people on the internet. You should be able to work out which company I’m talking about here; it’s not entirely hard.
Cool, so, the product manager comes to you (you’re one of the developers) and says it has to go faster. We want our web browser and our web server to work awesomely fast together, because the application is people doing internet searches, where showing results very quickly is really really important. And users have very slow internet connections, especially their upload, so we should take account of that in our design.
And what the product manager thinks is the feature we should implement is we’re going to embrace, extend and extinguish the HTTP/HTTPS standard, by adding our own proprietary extension so that when our web browser talks to our web server we’re going to compress all the HTTP headers, and we’re going to do this even when we run HTTP over HTTPS.
Are we all following so far? Cool, awesome.
So, what are you going to say back to your product manager? This is a show of hands. How many of you are going to say yep, that’s a brilliant feature, we should definitely go and implement that? And how many people are going to say nope, that’s a terrible idea; it would introduce a security vulnerability into our web browser? Hands for that? You all know what’s coming, don’t you? :( And the third option, votes for the third one? The third one is that it depends on what our threat model is.
So then we come onto threat models. So with security you always start with a threat model, before you do absolutely anything else, you start with a threat model.
And the quote’s from Wikipedia: attacker-centric threat modelling starts with the attacker, and evaluates what they actually want to achieve and what technical capabilities they have in their bag of tricks to achieve it. Only when you know what the attacker wants to do with your system do you know where it’s worth spending the time to invest. You’ve got limited resources and limited development time, and you should probably be spending those resources defending the bits that the attacker is going to attack; the bits that the attacker is not going to attack probably don’t need as much of your time.
So, now we come onto how the attack works. Let’s say that we have actually compressed the HTTP headers; this is what the attacker is going to do.
So, this is just a bit of background about HTTP headers, for those of you that don’t know what they are. They look a bit like this: the bit in green is constant across every single request; the bit in red is the bit that the web browser must keep secure, the authentication cookie, so if the attacker gets the bit in red he wins; and the bit in blue at the top is the actual page we’re requesting from the web server.
So, now this is how the attacker attacks the compression of HTTP headers. The attacker can change the DOM on his own site: he just inserts a whole bunch of images into the DOM, images that come from the target site he’s attacking. The first time round he inserts an image with the URL DeploymentManagerAuthenticationTicket=0 (it’s going to 404, but that’s not really important). The next time round he inserts one ending =1, and the time after that =2.
And the point is that now we’re compressing the HTTP headers (these are all HTTP headers), the top one is going to compress better than the others, because it’s got a longer repeated string. Because the attacker’s on the public Wi-Fi network, just as you are, he gets to see the length of your request, so he can work out that the first character of the cookie is 0, and he can repeat the same thing to get the next character. The total number of requests he has to make is the number of characters in the cookie times the number of possibilities for each character, which is fairly small.
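The length leak described here can be reproduced in a few lines of Python with zlib standing in for the hypothetical header compression. The cookie name comes from the talk; its value and the hostname are made up for the demo:

```python
import zlib

# Made-up secret cookie value; as in the talk, the first character is 0.
COOKIE = b"DeploymentManagerAuthenticationTicket=0f3a9c1b"

def compressed_request_length(path: bytes) -> int:
    # Length of a compressed request containing both the attacker's
    # chosen path and the victim's secret cookie.
    request = (b"GET " + path + b" HTTP/1.1\r\n"
               b"Host: example.com\r\n"
               b"Cookie: " + COOKIE + b"\r\n\r\n")
    return len(zlib.compress(request, 9))

# The attacker guesses the character after "...Ticket=". A correct guess
# extends the repeated string, so it never compresses worse than a wrong one.
guesses = {c: compressed_request_length(
               b"/img?DeploymentManagerAuthenticationTicket=" + c.encode())
           for c in "0123456789abcdef"}
assert guesses["0"] == min(guesses.values())
```

This is the same principle later exploited for real against TLS-level compression by the CRIME attack.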
So, this is why you can’t implement the feature your product manager says.
And this is the last slide. The takeaway from the talk is that a feature that seems so simple can have a security vulnerability in it that you can’t reason about, so basically just don’t write this kind of code. Use an existing library if you can. If OpenSSL doesn’t have a function to compress the HTTP headers, then there’s a reason it’s not in the underlying library, and you shouldn’t build it on top. And if you can’t use an existing library then you’ve got a big problem, which is beyond the scope of this 5 minute talk :)
Cool, and that’s it. Any questions?