It's Dangerous, a Python cryptographic signing module (python.org)
96 points by zachwill on June 10, 2012 | hide | past | favorite | 57 comments


Use of SHA-1 for signature/MAC purposes in new applications is deprecated and will be disallowed starting next year (source: NIST).

I repeat: do not use SHA-1 in new applications. Do not use this module.


Well then fork it and make it use HMAC with SHA256/512... Additionally, "Do not use this module" is bad advice if it leads to someone making their own MAC implementation, because that's almost always a bad idea. HMAC-SHA1 is still good security for this purpose.

I bet that as soon as standards bodies (like NIST) actively encourage you to use better hash functions (which they will probably do soon) the developer of this module will decide on what is safe enough for this purpose.


You advise to "fork it", but immediately after that you add the notorious "do not write your own" meme. Can you see the inconsistency of the messages?

Part of security is being up-to-date in regards to things like hashes. If authors don't update their own libraries and you need to tweak them manually, how is it different from "writing your own"?


I agree I'm a bit inconsistent, but the library is well written, and exchanging the hashing function is a trivial task. That's not really quite "writing your own". I just mean to discourage people who aren't 100% sure they know what they're doing.

I strongly agree with Armin: HMAC-SHA1 (note, the combo, it's not just the latter) is still good security for most applications you'd consider this library for.


I always thought the advice against "writing your own" spoke of the algorithm itself, not a library around the algorithm.


Nope. Google Keyczar: standard algorithms; still screwed up. You should always use the highest level interface available.


It shouldn't be more than changing those two lines:

https://github.com/mitsuhiko/itsdangerous/blob/59f3bf7877e21...

And the tests, of course.
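For readers following along, the kind of change being discussed really is this small in Python's standard library; a minimal sketch (the `sign` helper and its names are illustrative, not itsdangerous's actual API):

```python
import hashlib
import hmac

def sign(key: bytes, message: bytes) -> bytes:
    # Swapping SHA-1 for SHA-256 is a one-argument change in Python's hmac.
    return hmac.new(key, message, digestmod=hashlib.sha256).digest()

sig = sign(b"secret-key", b"hello")
len(sig)  # SHA-256 produces a 32-byte digest (SHA-1 would give 20)
```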


My point exactly! Armin (the maker/maintainer of the module) will probably consider doing this somewhere in the future.


I just pushed out a release that makes it easier to override the digest in a subclass.


While overrideability is good, wouldn't it be better to also be 'secure by default'? I'd imagine that SHA-2 would be a more sensible default for most users.


> I repeat: do not use SHA-1 in new applications.

Probably, but not necessarily. It's not like the other SHA versions fix the overall problem, however.

> Do not use this module.

Even before the 0.13 release I just pushed out, which makes overriding easier, you could change the hash method with a two-line change. Also, it's not like maintainers can't change things easily if modules are well designed.

And even if the module did not allow changing the hash method easily, SHA-1 with HMAC is still incredibly hard to exploit.


SHA-1 and SHA-2 (SHA256, SHA512) are not the same algorithm. The problem with SHA-2 isn't that it's insecure; it's that it's slow for its current predicted level of security, and that it's MD-strengthened and so requires an HMAC construction to use in applications like MACs. The sentence "It's not like the other sha versions are fixing the overall problem" is wrong.

You would indeed be better off using SHA256.

But not much. It's worth pointing out though that dogmatism around not using SHA-1 is misplaced. There are still no practical attacks on HMAC-MD5, for instance, even though MD5 itself is effectively broken and its use in (for instance) X.509 certificates is insecure.


> SHA256

As it is, that digest is also 14 bytes longer, and I don't feel happy truncating the hash. The bigger SHA-2 versions are even longer. Considering that this thing should go into cookies and URLs, I did not feel very happy with that.

> You would indeed be better off using SHA256.

Probably. I was indeed under the impression that SHA2 is based on the same building blocks as SHA1 but I assume that is not the case. That being said: upgrading is not hard if it even comes to the point where it would be necessary.


Do you really want to be the guy promoting a message verification library who won't use SHA-2 because of the extra 14 bytes of digest length? You made a design mistake here. Just fix it.


Niels Ferguson et al., in Cryptography Engineering (p. 95), suggest that truncating an HMAC-SHA-256 to 128 bits should be safe, given current knowledge in the field.
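That truncation is trivial to express; a sketch (the function name and 16-byte default are illustrative, not from the book or the library):

```python
import hashlib
import hmac

def truncated_mac(key: bytes, message: bytes, length: int = 16) -> bytes:
    # Compute HMAC-SHA-256 and keep only the first `length` bytes.
    # 16 bytes = 128 bits, the truncation discussed above; this is even
    # shorter than SHA-1's full 20-byte digest.
    return hmac.new(key, message, hashlib.sha256).digest()[:length]

tag = truncated_mac(b"secret", b"payload")
len(tag)  # 16
```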


This is code that's mostly going to be used in Python web apps (if at all). I thought about arguing in favor of truncating the hash, but then figured this guy would just say "well, I'm not so sure, so to be on the safe side... [I'll use an inferior hash]"... a better argument is, just eat the extra bytes and stick them on your message.


To clarify very slightly, use SHA-256, SHA-384 or SHA-512 instead.

Sometime soon I believe SHA-3 will be chosen and we'll all be able to move onto that.


For those interested, NIST (National Institute of Standards and Technology) announced a competition to find candidates for the SHA-3 algorithm back in November 2007:

"NIST also plans to host a final SHA-3 Candidate Conference in the spring of 2012 to discuss the public feedback on these candidates, and select the SHA-3 winner later in 2012." [http://csrc.nist.gov/groups/ST/hash/sha-3/Round3/index.html]

The final SHA-3 Candidate Conference was held March 22-23, so they should be picking the winner any day now.


I forked the project and updated the hashing algorithm: https://github.com/andrewconner/itsdangerous


> I forked the project and updated the hashing algorithm: https://github.com/andrewconner/itsdangerous

That is not necessary, I adjusted the API so that you can easily subclass it and change the digestmod.
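The subclass-and-override pattern being described looks roughly like this; note this is a self-contained stand-in, not itsdangerous's actual code, and the attribute name `digest_method` is an assumption about the API:

```python
import hashlib
import hmac

class Signer:
    # Minimal stand-in for the pattern described, not the library's code.
    digest_method = staticmethod(hashlib.sha1)

    def __init__(self, secret_key: bytes):
        self.secret_key = secret_key

    def get_signature(self, value: bytes) -> bytes:
        return hmac.new(self.secret_key, value, self.digest_method).digest()

class SHA256Signer(Signer):
    # Once the base class exposes the digest, swapping it is one line.
    digest_method = staticmethod(hashlib.sha256)

len(SHA256Signer(b"key").get_signature(b"msg"))  # 32 (vs. 20 for SHA-1)
```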


It should default to a SHA-2 hash and you should deprecate the branch that defaulted to SHA-1.


This facility is "built in" to Rails, with MessageEncryptor and MessageVerifier. Before you consider writing your own "apply an HMAC-SHA1 hash to a string" library in Ruby, just take those classes from ActiveSupport instead. They've been reasonably well tested.


Alternatively, if you are not using Rails but are still using Ruby, then you can use OpenSSL::HMAC.


This is a MUCH WORSE IDEA. Don't do this. Even this Python wrapper library does a better job providing HMAC services to users than OpenSSL's HMAC primitive does. Leave OpenSSL:* to the suckers^H^H^H^H^H^H crypto library implementors.


The wrapper is pretty elegant, weighing in at 3 functions; it all depends on how high-level you want to go.

If people are having trouble they can refer to the Rails ActiveSupport source code for help.


Do we really need an entire library to encapsulate http://docs.python.org/library/hmac.html and string.rsplit?


Yes, because Python's "hmac" library doesn't provide a secure "verify" method.


Is == not good enough for you?


Sigh.

http://codahale.com/a-lesson-in-timing-attacks/

Preemptively:

* Yes, a realistic attack.

* Yes, very difficult over the Internet.

* Still difficult but not implausible if the attacker can colocate near your app; i.e., if you deploy anywhere on EC2.

* Measurement bounds are high nanoseconds LAN, tens of usecs WAN.

* HMAC verification, unlike password hash comparisons, is a place where timing actually does matter.

Not to suggest that I occupy the high road when it comes to snarky comments, but consider whether your snarky comment in this case suggests an unearned (and thus dangerous) level of confidence about crypto app security. This problem (HMAC timing) is so well known that it's generated many hundreds of comments over the years on HN.
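For the record, the constant-time comparison this thread keeps alluding to later landed in the standard library as `hmac.compare_digest` (Python 3.3+); a sketch of a verify that avoids the `==` timing leak:

```python
import hashlib
import hmac

def verify(key: bytes, message: bytes, signature: bytes) -> bool:
    expected = hmac.new(key, message, hashlib.sha1).digest()
    # compare_digest runs in time independent of where the inputs differ,
    # unlike ==, which can return at the first mismatching byte.
    return hmac.compare_digest(expected, signature)

good = hmac.new(b"key", b"msg", hashlib.sha1).digest()
verify(b"key", b"msg", good)           # True
verify(b"key", b"msg", b"\x00" * 20)   # False
```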


That's still a lot of code just to import a single 6 line function...


The fact that he knew about that function and you (standing in as "representative Python developer") did not seems to justify the library pretty nicely from what I can tell.

But, obviously, this code does more than Python's hmac library does, beyond just knowing how to properly verify the MAC itself.


I wasn't criticising the security of this library, I was criticising its reason to exist in the form that it does.


Nice save.


Executive summary:

- The signature: SHA1 HMAC of data and, optionally, a timestamp to expire signatures.

- HMAC key derivation: SHA1 of a secret key and a salt.
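Roughly what that summary describes, in a few lines; the concatenation order, separator, and encoding here are assumptions for illustration, not the library's exact wire format:

```python
import base64
import hashlib
import hmac

def derive_key(secret_key: bytes, salt: bytes) -> bytes:
    # Key derivation as summarized: SHA-1 over the salt and secret key.
    # (The library's exact concatenation may differ.)
    return hashlib.sha1(salt + secret_key).digest()

def sign(secret_key: bytes, salt: bytes, data: bytes) -> bytes:
    key = derive_key(secret_key, salt)
    sig = hmac.new(key, data, hashlib.sha1).digest()
    return data + b"." + base64.urlsafe_b64encode(sig).rstrip(b"=")

sign(b"secret", b"itsdangerous", b"value")  # b"value." + 27-char signature
```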


> - HMAC key derivation: SHA1 of a secret key and a salt.

Note to non-cryptographers: do not use low-entropy (password-ish) secrets with this; they can be brute-forced. (Using bcrypt/scrypt/PBKDF2 instead would fix this.)


> Note to non-cryptographers: do not use low-entropy (password-ish) secrets with this; they can be brute-forced. (Using bcrypt/scrypt/PBKDF2 instead would fix this.)

PBKDF2 for this would be possible, but a terrible idea. The reason PBKDF2 works for password verification on web apps is that you can rate limit the login entry point. Doing that for sessions or activation links, which is what you would use itsdangerous for, is an easy way to let other people DDoS your app.

PBKDF2/bcrypt etc. is not the solution for this problem.


Deriving secret keys for signing and encryption systems is the entire reason PBKDF2 was designed. The parent comment doesn't suggest using a PBKDF2-type construction to verify packets, but instead only to generate the secret key used by HMAC-SHA1. That's a one-time operation.

Also, PBKDF2 has nothing whatsoever to do with rate limiting login attempts. The attack PBKDF2 tries to blunt is offline, not online. PBKDF2 and bcrypt remain effective even when the imposed expense of a single hash compare is less than the network overhead of making a login request.
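The one-time derivation described above is a single stdlib call; a sketch with illustrative parameters (the password, salt handling, and iteration count are assumptions, not anyone's recommended configuration):

```python
import hashlib
import os

# One-time, at-startup derivation of an HMAC key from a human-chosen
# password, as described above; run once, not per request.
salt = os.urandom(16)  # persisted alongside the app's configuration
signing_key = hashlib.pbkdf2_hmac(
    "sha256",
    b"correct horse battery staple",  # low-entropy app password
    salt,
    100_000,  # iteration count: slows down offline brute force
)
len(signing_key)  # 32 bytes, usable afterwards as the HMAC key
```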


> The parent comment doesn't suggest using a PBKDF2-type construction to verify packets, but instead only to generate the secret key used by HMAC-SHA1. That's a one-time operation.

I am officially an idiot now. I see where the original comment is going: namely the "salt" part of the library, not the verification of the signature. You can just use a different secret key or derive one yourself.

I sincerely apologize for missing that and the comments that resulted from that.

That said: the secret key is the important part there, not the salt. The salt is just used to alter the secret key so that using the serializer in two different places for different purposes does not produce the same signature. The intended usage of the whole thing is to not use the salt at all but to provide different secret keys. The "salt" part was taken from the original Django implementation with the idea of staying mostly compatible with their format.

Could that be improved by switching to PBKDF2? Probably, but breaking a lengthy secret key is currently infeasible, even with SHA-1.

> Also, PBKDF2 has nothing whatsoever to do with rate limiting login attempts.

That's not what I meant. I meant that PBKDF2 on a server is feasible because you can rate limit the endpoint. I don't want my servers to be on 100% CPU because someone spams the login form.


The salt in PBKDF2 used as a KDF does almost exactly the same thing as the salt in a password hash: it prevents attackers from precomputing a huge dictionary of candidate password-key mappings and employing it on every PBKDF-keyed system the attacker finds. All secure password-handling schemes are randomized.

I don't know what you mean by "lengthy secret key", but if you're talking about a string that a human would recognize as language, that's not a strong key. Strong keys come from /dev/keygeneration --- er, I mean, /dev/random and from schemes like PBKDF2.

The attack on a human-intelligible crypto key occurs offline, and HMAC verifying keys are usually long-lived, so the attacker has both a lot of time and a lot of incentive to go after that key. Do not use human-intelligible strings as HMAC keys.

I've heard a lot of smart people worry about adaptive password hashes becoming a DOS vector for their application, and while stipulating that they are indeed smart people, they haven't thought this problem through very well. Every application can be DOS'd, usually in much more effective ways than by forcing the server to hash lots of passwords. Availability attack vectors in web apps are so common that they aren't even objectives on penetration tests; what would be the point?

Also, HMAC "authenticates" and "verifies" messages; it doesn't "sign" them. "Signature" has a subtly different meaning in crypto.


> I don't know what you mean by "lengthy secret key"

`os.urandom(120)` for instance. That's what Flask and Django document. I can see how this is not obvious from the documentation which puts a human readable string in there and I will update it appropriately.

> Also, HMAC "authenticates" and "verifies" messages; it doesn't "sign" them. "Signature" has a subtly different meaning in crypto.

I agree. The use of the word signing for this however is quite widespread and I adopted the same meaning (and implementation) as with the original Django one for compatibility.


Sure. Just know that developers sticking "c0mp4ny_n4m3" in their source file as an AES or HMAC key is a very common problem. os.urandom is a fine solution for the problem.


Let me expand a bit: use bcrypt etc. to derive the key instead of using a salted hash, then use HMAC to sign the data. You're right that using bcrypt for every request would be really bad for performance, but it's a per-app password, so you can just run bcrypt once, at startup.

(Alternative: drop the SHA1/bcrypt/whatever and just use a really strong secret key. 128 bits of randomness is impossible to brute-force.)


> but it's a per-app password, so you can just run bcrypt once, at startup.

itsdangerous has nothing to do with passwords. It's about signing small messages and these messages are obviously created at runtime.

> nonsense and just use a really strong secret key

That's how you should use itsdangerous: use a strong secret key.


One way to get a good strong secret key for this purpose:

$ python
>>> import os
>>> os.urandom(64)


You shouldn't use urandom for crypto purposes. /dev/random is generated (on most platforms) from cryptographic-strength entropy (often hardware), but can block if it runs out of data. /dev/urandom was created with the guarantee to never block; it draws on /dev/random's pool initially but can start outputting lower-entropy numbers where /dev/random would block.


/dev/random is better than urandom for this sort of thing. On linux:

dd if=/dev/random bs=64 count=1 2>/dev/null | hexdump


If rate limiting login attempts was all it took to protect passwords, then it wouldn't matter what hashing function you used, since the rate limiting would be independent of that.

The point of PBKDF2 and friends is to make it expensive to brute force acquired hashes. It effectively 'rate limits' hash-in-hand brute force attempts. It's designed to protect hashes once they are in hostile environments.

The use cases you described (session keys, activation links) are the perfect example of such environments.


> If rate limiting login attempts was all it took to protect passwords, then it wouldn't matter what hashing function you used, since the rate limiting would be independent of that.

It does not matter what hashing function you use for the login. It matters in case your database leaks.

> The use cases you described (session keys, activation links) are the perfect example of such environments.

They are not. Because you don't want to reverse engineer the contents of the message (the message is there in plain text). That's the reverse of what you want to do with passwords. In case of the password you want to brute force what the password is, in case of a signed message you want to forge a signature.


>It does not matter what hashing function you use for the login. It matters in case your database leaks.

That was my point... I suppose I'm not sure I understood why you brought up login entry point rate limiting?


Rate limiting makes it harder for people to DOS you using the password hash on your login form.


There should be no password at all involved. You're doing key signing, not password authentication. You should have a tool, like OpenSSL or GPG, generating your key pair for you.

And people, STOP USING CRYPTOGRAPHIC PRIMITIVES FROM THINGS THAT ARE NOT OPENSSL.


Aroo? OpenSSL is usually a terrible place to pull crypto primitives from; the string "OpenSSL" in a Python or Ruby file is a decent predictor of crypto bugs. Also, OpenSSL has a relatively poor track record of algorithm-level bugs.

I'd like your sentence more if it read "STOP USING CRYPTOGRAPHIC PRIMITIVES" and then ended with a period.


I took that from cperciva when he did his crypto talk.

Edit: Found quote:

Website security: Use OpenSSL. OpenSSL has a horrible track record for security; but it has the saving grace that because it is so widely used, vendors tend to be very good at making sure that OpenSSL vulnerabilities get fixed promptly. I wish there was a better alternative, but for now at least OpenSSL is the best option available. UPDATE: For added security, terminate SSL connections in restricted environment and pass the raw HTTP over a loopback connection to your web server.

And yes, I'd always recommend using the highest-level possible API. If you have SHA in your code you are probably working at too low a level.

Do you have any recommendations for a better toolkit? Tarsnap uses Colin's own implementations, so it's not a very good resource. There are other problems I've found with cryptlib, etc. (e.g., cryptlib is commercial).


I agree that one thing that is actually worse than directly pulling AES or SHA-2 out of OpenSSL and fucking with it in your code is actually implementing AES or SHA-2 yourself. :)


My inclination for doing this kind of thing would be to use PyOpenSSL or a similar wrapper to do an S/MIME sign/verify on each side. Encryption using AES if necessary. I'd be inclined to do this for a couple reasons:

1) If there's anything my grad crypto class taught me it's that RSA, specifically padding, is the most god-forsaken idea ever created by man and you will never, ever, ever, ever get it right. If the words RSA are in your code you are in deep shit.

2) S/MIME seems to be a simpler system than any certificate system I have seen. X.509 is an unholy mess. In fact, all PKI is just a complicated disaster waiting to happen.

3) Super simple API -- it can even be done on the command line.

Is there something different you'd recommend?

Edit: Actually, I just thought of another option. GPG has a --sign and --verify option. If GPG can be installed on the system it may be worth trying to integrate that.


... when will they ever learn...


> ... when will they ever learn...

When will some people on Hacker News ever learn to start using their brains? PBKDF2 is not what you use to verify signatures. I think the name should already give that away.

//EDIT: and when will I ever learn to read properly. I just realized that the comment was referring to the "salt" part, not the signature.



