Tuesday, January 11, 2011

Now Recruiting: Password Mules!

The above announcement originates from a web forum where users submit password hashes for cracking, and other users reply with the recovered passwords. Recovering your own? Well, why not. Recovering 100 million? A reasonable question would be: where did you get those? It's about time to talk about ethics.
I frequently visit forums and websites that discuss password cracking in its many forms. Many people participate in such forums, whether out of personal interest, for research, or for commercial purposes. I've become increasingly aware of users posting large numbers of password hashes, asking for help to crack them without any explanation of their origins or the purpose of posting and cracking them. NOTHING. There is no information available about the user either, who is happily hiding behind the "anonymity" the Internet provides.

At the #Passwords10 conference, Howard Smith of Oracle challenged me after my talk to a debate on password cracking ethics:
(Howard Smith during his talk at #Passwords10)
I think my presentation, my previous blog posts, as well as my guest blog post for Elcomsoft entitled "Why you should crack your passwords", clearly present my point of view.

Howard's opinion, which is pretty much the opposite of mine (correct me if I'm wrong, Howard), is that there is little point in doing password cracking for research purposes, since we already know that passwords haven't improved much during the last 10-15 years or so. People are still using simple passwords, they are still personally related, and they are still easy to crack. If we still need to do it, we shouldn't have to actually display the found passwords, as this may violate privacy and the entire point of keeping passwords secret. Instead, compare them "automagically" to a predefined set of rules, list those accounts that don't comply, and enforce a new password. Howard also said that one should question the legality of cracking password hashes of unknown origin, found on anonymous blog posts and shady forums on the Internet.

It's hard to disagree.

However; we do know that the bad guys are doing this. In fact, it seems to me as if there is an increasing trend in releasing large hash-only lists onto various web forums, asking other participants - even site owners - to participate freely in cracking those password hashes.

I'm afraid those participating are effectively becoming free password mules, aiding the bad guys in increasing the value of their stolen data. These data are of course obtained through illegal hacking activity against websites, compromising parts of, or entire, user databases. By stealing user names and cracking their associated passwords, the attackers suddenly have data with a monetary value attached to it on the black market.

Now here's a dilemma (back to Howard): as ... I don't know ... password security professionals, can we help save both users and service providers by monitoring such forums, downloading such lists, and trying to identify their origins before the bad guys start selling the valuable data - informing the service provider about what we've found (under closed/responsible disclosure)? Or is that a job for the police or other government agencies?

We've seen cases before where service providers have no clue about being compromised long after their data has been put out on the Internet for sale by criminals. It will happen again. And again. And again.

The Gawker compromise also showed what could be a possible first: Chris Wysopal (CTO at Veracode, Twitter: @WeldPond) pointed out in a tweet that other service providers (LinkedIn and others) used the list of compromised accounts (e-mail addresses) from Gawker to disable any of their own users with the same e-mail addresses. This was just in case those users had broken one of the many laws of passwords: never use the same password across multiple services. What LinkedIn and others did, he saw as a possible new best practice. I fully agree with that!
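As a rough illustration of that practice, a provider could take the leaked list of e-mail addresses and flag its own matching accounts for a forced password reset. This is only a sketch; the names and sample data below are made up, and the real providers presumably ran this against their production user databases:

```python
# Hypothetical sketch of the cross-service check Wysopal described.
# 'leaked_emails' stands in for the addresses exposed in someone else's
# breach; 'our_accounts' for the provider's own user list.
leaked_emails = {"alice@example.com", "bob@example.com"}
our_accounts = ["alice@example.com", "carol@example.com"]

def accounts_to_reset(accounts, leaked):
    """Return our accounts whose e-mail appears in the leaked list.

    Comparison is case-insensitive, as e-mail addresses usually are.
    """
    leaked_lc = {e.lower() for e in leaked}
    return [a for a in accounts if a.lower() in leaked_lc]

for account in accounts_to_reset(our_accounts, leaked_emails):
    print("disable + force reset:", account)
```

The point is that only the overlap gets touched; users unaffected by the breach never notice anything.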

So here we are, with a discussion that has been going on "forever", and it doesn't really have an ending either: ethics. At the same time password mules are unknowingly (or wittingly?) helping criminals increase the value of their stolen data, creating even more damage to providers as well as end-users. What next?

I'll end this blog post by quoting "Barsmonster", or Michail Svarychevski as he's named in real life, when he was asked on his own forum why he quit developing his GPU-based tool for high-performance password cracking:
"The complexity & danger - is due to risks to help someone to violate the laws, especially if you do this for money - you may be liable."

My definition of Password mule:
A person who willingly or unknowingly aids in cracking passwords obtained through illegal or questionable actions, where the purpose is to increase the black-market value of the data obtained.


  1. Well, FRT has a hash cracking section which is probably a portion of the site that makes everyone, including me, quite nervous.

    However, I did encounter an interesting case recently in which the user was very confident about their hashes, wanted them to be tested, and hoped for no results. Indeed I got no hits and the user was pleased.

    Now, when people make long posts about how they got the hashes via SQL injection or so forth we generally ban the user outright.

    We have various restrictions in place regarding quantity and so forth.

  2. I'd say that posting even legally obtained hashes to such open forums would be a rather bad idea in most cases. Exceptions could be where the primary point is to verify that "nobody" can break them, where they are no longer valid for any systems or users (though how do you know?), or other reasons.

  3. I think there are probably two different questions being asked here:

    1) If it is ethical to perform research on passwords obtained from publicly disclosed lists, (which include hash sets of "unknown" origin posted online).

    2) If it is ethical to do the above and then post the cracked hashes publicly.

    For question #2 I fall squarely into the "no" category. A vast majority of people post these hashes online because they want to gain access to accounts/systems that do not belong to them. That's why I don't make available on my blog the cracked hashes or even the original hash sets that I work with. I also respond to e-mail requests that I receive asking for help cracking a neighbor's wireless password with a reminder that I do most of my work to support law enforcement ;)

    I'm not saying this to take a holier than thou stance, and I can certainly see some of the reasoning behind providing hash lists online so people can check to see if they were compromised. I just think that there are better approaches, such as the hashed e-mail lookup used in the most recent Gawker disclosure. Yes these disclosed hash lists are public, and someone with enough skills and time can find them even without additional help, but I still want to make it as hard as possible for someone to obtain these lists and use them for nefarious purposes.
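    A hashed lookup of the kind I mean could work roughly like the sketch below. I don't know the details of the actual Gawker checker, so the hash choice and the names here are my own assumptions; note also that hashed lookups can still be brute-forced for guessable addresses:

```python
import hashlib

def sha256_hex(email):
    """Normalize an address and return its SHA-256 hex digest."""
    return hashlib.sha256(email.strip().lower().encode("utf-8")).hexdigest()

# The operator publishes only hashes of the compromised addresses,
# never the addresses themselves (sample data, not from any real breach).
published_hashes = {sha256_hex(e) for e in
                    ["alice@example.com", "bob@example.com"]}

def was_compromised(email):
    """Let a user check their own address without exposing the full list."""
    return sha256_hex(email) in published_hashes
```

    A user can then call was_compromised() on their own address, while the plaintext list stays private.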

    For question #1 though, I think we do need as a security community to make use of these hash lists to better understand the security provided by human generated passwords. I would expect Howard to agree with this since he used such a list in his presentation ;). The simple fact is that unlike what some people imply, password security isn't completely broken. If it were, the internet as we know it wouldn't work. At the same time, passwords aren't perfectly secure either. The problem is that no-one has a good understanding of where in between these two extremes the security provided by password protected systems actually falls. We need to have a better understanding of what works, what doesn't, what policies help security, and which policies hinder it, and unfortunately most of that will depend on analyzing real user data. Now there are three ways I can think of to obtain that data, and each one has its pluses and minuses. The first is to set up a user survey. The second is to analyze user passwords in a system under your control. Finally there is analyzing disclosed password hashes.

  4. Dealing with disclosed hash lists always seems to be the most controversial option, though it is important to remember there are major issues that also need to be addressed with the other two options as well. Considering I've found my own userid/password hash in one of these lists, I'm very aware of the privacy concerns. Here are a couple of things I do to try and address them:

    1) I never verify if a password is valid for a particular account, (aka try to log in using that password). This should go without saying, but I have seen people post analysis using this, (most notably with the Spanish/Portuguese hotmail list where some researchers logged into the accounts and saw the number of "password reset" messages generated by the hackers).

    2) I do not associate the userids with the hashes in my cracking session/analysis. Instead I assign each hash a designator of the set it belongs to. This is because I don't want to see that someguy@yahoo.com's password is '123456'. Admittedly using that info when targeting salted lists would help a lot, but ignoring that optimization is a price that I'm willing to pay.

    3) I do not perform additional research on individuals that appear in the lists. This was a tricky one, since one big question is how open source data like Facebook posts/tweets/etc. can be used to optimize an attack. That being said, this would require me to ignore point #2 above, and it starts getting into a little too much privacy intrusion for me to be comfortable with.

    I don't claim this eliminates all concerns, but I also feel that ignoring publicly available data would be irresponsible. We might as well try to do some good with an obviously bad situation like a password disclosure.
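    For what it's worth, the anonymisation step in point 2 above can be sketched like this. The sample dump lines and the set designator are invented for illustration, not taken from any real disclosure:

```python
# Made-up dump lines in a common "userid:hash" format.
raw_dump = [
    "someguy@yahoo.com:5f4dcc3b5aa765d61d8327deb882cf99",
    "other@example.com:e10adc3949ba59abbe56e057f20f883e",
]

def anonymise(lines, set_id):
    """Discard userids and tag each hash with the set it came from."""
    out = []
    for line in lines:
        _userid, hash_value = line.split(":", 1)  # userid is thrown away
        out.append((set_id, hash_value))
    return out

anonymised = anonymise(raw_dump, "setA")
```

    After this, the analysis only ever sees "a hash from setA", never whose password it was.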

  5. I think you laid down two challenges:

    1) That my views are the opposite of yours; and
    2) the dilemma (should we monitor forums).

    Of the two I think the dilemma is the more important.

    I do think there is a difficult balance to strike. Do we encourage the criminals by making use of their 'product' and thereby potentially justifying the criminal activity? Or is the knowledge gained of such value that we should accept the risk?

    So, what would we gain, over what we should be assuming?

    Firstly, absence of evidence is not evidence of absence. If you find a bunch of passwords for your site on a forum, you know you're hosed. But if you don't, it doesn't mean someone hasn't compromised your site either.

    You don't gain certainty through searches.

    So maybe rather than spending your time looking for something that may not be there it would be better to spend the time designing and building systems that have defence in depth?

    - why isn't it a priority 1 bug that passwords aren't stored properly salted?
    - why does your mid tier have select on the password table (as opposed to, say, having a stored procedure that can verify a single account at a time)?
    - do you have sufficient audit and logs in place to detect suspicious account usage?

    The list goes on. But my suggestion is that we should spend less time cracking passwords and more time ensuring that systems are designed and implemented to the best of our ability. Or more specifically when we perform our threat analysis we shouldn't assume that users passwords are secure and known only to the end user - in fact we should assume that hostile actors have a large number of end user accounts available to them and design accordingly.
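    As a minimal sketch of what "properly salted" storage in the first bullet could look like, here is PBKDF2 from Python's standard library with a per-account random salt. The iteration count and parameter names are illustrative only, not a recommendation for any particular system:

```python
import hashlib
import hmac
import os

def hash_password(password, iterations=100_000):
    """Derive a salted hash; store salt, iterations and digest together."""
    salt = os.urandom(16)  # unique random salt per account
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"),
                                 salt, iterations)
    return salt, iterations, digest

def verify_password(password, salt, iterations, expected):
    """Re-derive the hash and compare in constant time."""
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"),
                                 salt, iterations)
    return hmac.compare_digest(digest, expected)

salt, iters, stored = hash_password("correct horse")
```

    The unique salt kills precomputed rainbow tables, and the iteration count is exactly the knob that makes bulk cracking not cost effective.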

    Of course it would be naive to assume that all systems are fixable and all legacy products fully supported with developers willing and able to fix these issues. And in those cases we have to make do with what we have.

    So I don't think our positions are all that far apart really - of course there will be times when cracking passwords may well be the best outcome.

    But I think a better outcome would be systems where that was not cost effective (because the salting and algorithm choice made it so) and not necessary (because the design and implementation provided the policy and audit functionality for it not to be necessary).

  6. I fully agree with you that there's a lot of similarities between our views. For most of the points I think we're in violent agreement ;) I fully support the view that we need to focus on designing secure systems. From reading your post though, I'd like to stress that my position is that it's not an either/or proposition on focusing on policy/system design/etc vs studying how people actually create/use passwords in real life. We can do both. The best example of that is when it comes to studying and designing password creation policies. Yes we can set up policies such as blacklists, password guess rate limiting, etc, but without studying human behavior and actual use cases we don't really know how effective those policies are.

    A concern I do share with you is by making use of these disclosed lists, we validate/encourage the people who are stealing these lists in the first place. I don't have a good answer for that. It's my belief that people are going to hack sites regardless, but I fully realize the potential flaws with that argument. Right now it comes down to the fact that I strongly believe that my research is having a positive contribution to helping the good guys. For example the title of my ACM CCS Paper was "Testing Metrics for Password Creation Policies by Attacking Large Datasets of Disclosed Passwords."

    I'll fully agree with you though that there certainly are other ways to study the security of systems besides analyzing user passwords. That's actually one of the main reasons I'm bugging Per, (as well as others), so much right now. I'm trying to collect real world use cases of the effectiveness of different password creation policies. There are tons of examples of these policies failing, but not very many examples of them being successful. This isn't to say that they are not successful; as Per pointed out, "Remember all those who haven't been compromised yet", but I do want to find some cases where we can point to instances where specific password creation policies, (such as requiring an uppercase character, vs just using a blacklist of banned passwords), actually had a verifiable positive impact on protecting a system.

  7. Well, first of all I don't mind getting bugged Matt! :-) At least not on these subjects.

    As @KluZz tweeted about his SSHD logs, and as I've blogged about before, keeping your username secret will also aid in decreasing the risk of your account getting compromised.

    Let's rewind a little bit to the Gawker compromise. I think I've said this before; commentators all over had a laugh at all the obviously bad passwords that were observed once again. Howard is right here; not much new to be seen, if any change at all for the better.

    For those of us with an interest in password research, or security in general, it is easy to laugh at the large and often anonymous group of "stupid users with stupid passwords".

    Sarah Palin got her account hacked because she used a password that was connected to her in some personal way, and the Yahoo password recovery system helped the attacker guess it. Now I don't know much about Sarah Palin, but anyone in such a position must have some working brain cells. Did anyone say she was stupid for using such a simple password? Oh.. and what was the minimum password policy of Yahoo at that time?

    Again; much easier to laugh at all the "stupid" passwords revealed in public disclosures, compared to actually naming all those companies who perhaps should be named for not even being remotely close to good practices for password policies.

    Brian Krebs says that storing passwords in clear-text is a 101 NO-NO. I agree. Would he be interested in listing large online services that actually do store their passwords in cleartext, thereby warning people not to use their services? I don't know.

    I don't want to make security more difficult, on the contrary. Almost everybody uses passwords and PINs today.

    I want fewer passwords in a tradeoff for better passwords. I want service providers to actually implement best practices. That's kind of difficult when Microsoft recommends a minimum of 8 mixalpha characters while Hotmail itself doesn't come close, as an example.

    Any suggestions on how to turn the service providers towards better practices would be highly appreciated. Just remember that any such advice should contain a positive business case, not just the old talk of "but what if...?".


All comments will be moderated, primarily for spam. You are welcome to disagree with my posts of course.