Tom’s Tutorials For Excel: Spreadsheet Passwords — The Facts About Protection
Spreadsheet password protection is a major concern for Excel users, and rightly so. Worksheets can hold confidential information, as well as formulas that must be protected from accidental changes or deletion.
It’s natural for an Excel user to be curious about spreadsheet protection, or to ask just how secure a password-protected spreadsheet really is. When people have objective, unbiased facts, without scare tactics or hyperbole, they can make the best decisions for themselves.
As protection platforms go, Microsoft’s products have inherent weaknesses. In its defense, Microsoft has never claimed to have reliable spreadsheet protection. In Office applications, a password is like the lock on your home’s front door; its primary purpose is to keep your friends out. If someone really wants to get in, they will get in.
Try this: open a new workbook, go to Sheet1 and protect it with the password “test” (without the quotes, lower case just as you see it here). Now unprotect Sheet1, but instead of using the password “test”, use the password “zzyw”. If your version of Excel uses the hashing scheme described below, the sheet unprotects anyway.
Take comfort that Microsoft is like any other company, in that virtually any application is hackable. Here’s some background on Excel spreadsheet password protection:
When someone password protects a sheet in Excel, Excel generates a 16-bit (2-byte) hash, a technical term for a number generated from a string of text, in this case by a function called the MD5 Message Digest Algorithm. The hash holds far less information than the actual password text, so it is unlikely, but not impossible, for a different password to replicate it. Note that “replicated” is not the same as “duplicated”.
When you unprotect a protected sheet, the hash of the password you enter is compared to the stored MD5 hash. Excel allows for up to 255 password characters in its worksheet protection scheme. Since it is a case-sensitive scheme, there are over 90 acceptable characters, which translate into multiple trillions of possible passwords. Since the number of possible passwords is much greater than the number of possible MD5 hash values, some passwords must share the same MD5 hash value.
The MD5 hash is a standard mixing algorithm, executed as follows:
• Take the ASCII values of all characters.
• Shift the first character’s value left 1 bit.
• Shift the second character’s value left 2 bits.
• Continue in the same way for each remaining character, keeping every result within 15 bits by rotating the higher bits back around to the low end.
• XOR those values.
• XOR the count of characters.
• XOR the constant 0xCE4B.
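The steps listed above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Excel’s actual code: the function name `sheet_hash` is made up for this example, the password is assumed to be plain ASCII, and the shift amount for characters past position 15 is assumed to wrap around (position modulo 15).

```python
def sheet_hash(password: str) -> int:
    """Hypothetical sketch of the shift-and-XOR steps listed above."""
    data = password.encode("ascii")  # assumption: ASCII-only passwords
    acc = 0
    for position, value in enumerate(data, start=1):
        # Rotate the character value left by its 1-based position,
        # staying inside 15 bits (assumption: the shift wraps modulo 15).
        shift = position % 15
        rotated = ((value << shift) | (value >> (15 - shift))) & 0x7FFF
        acc ^= rotated          # XOR those values together
    acc ^= len(data)            # XOR the count of characters
    acc ^= 0xCE4B               # XOR the constant
    return acc


if __name__ == "__main__":
    for candidate in ("test", "zzyw"):
        print(candidate, hex(sheet_hash(candidate)))
```

Under these assumptions, `sheet_hash("test")` and `sheet_hash("zzyw")` come out identical (0xCBEB), which is consistent with the “test” and “zzyw” experiment described earlier.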
As you may know, XOR is an acronym for “exclusive or”, a logical operation that returns TRUE when exactly one of its two inputs is TRUE. In the steps above it is applied bit by bit to the shifted character values. Because different pairs of inputs can produce the same XOR result, the final hash does not pin down a single password, which is why more than one password value is possible. The truth table for XOR is:
A    B    XOR Result
T    T    FALSE
T    F    TRUE
F    T    TRUE
F    F    FALSE
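The truth table above can be reproduced with a short Python sketch. The helper name `xor_table` is hypothetical, and the last two checks are simply small illustrations of how XOR works on whole numbers, one bit at a time.

```python
from itertools import product


def xor_table():
    """Return (A, B, A XOR B) rows in the same order as the table above."""
    # For two booleans, "not equal" is exactly exclusive or.
    return [(a, b, a != b) for a, b in product([True, False], repeat=2)]


for a, b, result in xor_table():
    print(a, b, result)

# Bitwise XOR applies the same rule to each pair of bits.
assert 0b1100 ^ 0b1010 == 0b0110

# Different inputs can give the same output, so the inputs
# cannot be recovered from the result alone.
assert 5 ^ 3 == 6 ^ 0 == 6
```

The last check is the important one here: once values have been XORed together, the result alone does not say which inputs produced it.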
By the way, if you wanted to reproduce the actual password, and not just a compatible one, it’s a virtual certainty that it literally could not be accomplished during your lifetime.
There are 94 standard characters (26 of A-Z; 26 of a-z; 10 of 0-9; and 32 special such as #,%,!, and so on). That means, for every character there are 94 possibilities.
To extrapolate using the example of an eight-character password, the number of possible passwords to test is
94 x 94 x 94 x 94 x 94 x 94 x 94 x 94
which equals
6,095,689,385,410,816 (roughly 6.1 quadrillion)
At the hefty pace of 100,000 password attempts per second, it would take about 1,932 years to try every one of them. And that’s just with 8 characters; with the 255-character maximum, the time required grows beyond any practical measure.
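The arithmetic above is easy to check with a few lines of Python. The variable names are just for this example, and a year is assumed to be 365.25 days long.

```python
ALPHABET_SIZE = 94            # standard printable characters
PASSWORD_LENGTH = 8
GUESSES_PER_SECOND = 100_000
SECONDS_PER_YEAR = 365.25 * 24 * 60 * 60   # assumption: 365.25-day year

# Every position can hold any of the 94 characters.
total_passwords = ALPHABET_SIZE ** PASSWORD_LENGTH

# Time needed to try every single password at the stated pace.
years = total_passwords / GUESSES_PER_SECOND / SECONDS_PER_YEAR

print(f"{total_passwords:,} possible passwords")
print(f"about {years:,.0f} years to try them all")
```

Changing `PASSWORD_LENGTH` shows how quickly the numbers grow: each extra character multiplies the total by 94.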
What all this boils down to is this: if you don’t want to expose your Excel spreadsheets to *any* possible password circumvention, don’t share them. That said, the likelihood of someone guessing a compatible password by chance is very slim, though commercially sold password-cracking programs do exist.
One thing is sure, you are in good company: the whole world is in the same boat with this Excel protection issue. As you understand the spreadsheet password protection scheme, you can make your own informed decisions about what and what not to risk putting in your workbook, and how or with whom you share access to your workbooks.
Great post, Tom. Really enjoy your tweets too. Any opinions on a good password recovery tool? In my particular line of work, I’m frequently presented with a locked-up file and no one knows the password (since its author has left the company).
I do not provide assistance with breaking passwords, nor offer opinions on tools that purport to do so.
It should be noted that a “password” (or password hash) that uses only 16 bits creates a possible state space of just 2^16 = 65,536 combinations/numbers. That’s a count you can brute-force even via a *VBA macro written in Excel itself* within a few seconds (!).
So to stick to your own metaphor with the friends and the house: I would not even consider this “attack” vector a “break-in” attempt into your home. It’s more like locking your door with a single thin latch – from the *outside*… without securing it with a padlock ;).
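To make the size of that state space concrete, here is a rough Python sketch (not the VBA macro mentioned above). It hashes every four-letter lowercase password using the shift-and-XOR steps listed in the article and counts how many different hash values come out. The names `sheet_hash` and `count_hash_values` are hypothetical, ASCII input is assumed, and the shift is assumed to wrap modulo 15.

```python
from itertools import product
from string import ascii_lowercase


def sheet_hash(password: str) -> int:
    """Hypothetical sketch of the article's shift-and-XOR steps."""
    data = password.encode("ascii")  # assumption: ASCII-only passwords
    acc = 0
    for position, value in enumerate(data, start=1):
        shift = position % 15        # assumption: shift wraps modulo 15
        acc ^= ((value << shift) | (value >> (15 - shift))) & 0x7FFF
    return acc ^ len(data) ^ 0xCE4B


def count_hash_values(length: int = 4):
    """Return (passwords tried, distinct hash values seen)."""
    seen = set()
    tried = 0
    for letters in product(ascii_lowercase, repeat=length):
        seen.add(sheet_hash("".join(letters)))
        tried += 1
    return tried, len(seen)


tried, distinct = count_hash_values()
print(f"{tried:,} passwords map to only {distinct:,} distinct hash values")
```

There are 456,976 four-letter lowercase passwords but at most 65,536 possible 16-bit values, so many of them are forced to share a hash.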
Some other corrections:
It should also be noted that MD5 does not generate 16-bit hashes:
https://de.wikipedia.org/wiki/Message-Digest_Algorithm_5
It uses 128 bits.
Also don’t mix up workbook protection (where MD5 *is* used) with worksheet protection (where it is *not* – only some bits get shifted around and at the end are XORed together). You can read more details here:
http://chicago.sourceforge.net/devel/docs/excel/encrypt.html
Finally:
If one wants to really protect the data from being accessed at all (by non-authorized persons), maybe it’s best to encrypt the *whole Excel file* with an advanced encryption method, for example by putting it into an archive (format) that supports this feature.
The built-in “protection” methods seem to me more like a CheckBox you can tick somewhere in the file options:
“[ ] Write-Lock on/off” ;).
Hope these clarifications are of some use to people.
MeepLeep, I think perhaps he mixed up 16-bit with 16-byte (128bit). That would make sense…
Hmm… it seems Workbook encryption uses 16byte, but worksheet protection uses 16bit.
Protection is there to stop your friends from opening/changing a sheet.
Encryption is there to “try” and stop your enemies from opening your document.