What are passphrases?

Passphrases are simply very long passwords made up of whole words, which makes them readable and, hopefully, easier to remember.

Since summer 2017, NIST has also recommended allowing long passphrases: user-chosen secrets must be at least 8 characters long, and services should accept at least 64 characters. Windows XP and later versions, for example, allow passwords of up to 127 characters.

Unlike passwords, these phrases don't need to be cryptic. Despite their greater length, they are often considerably easier to memorize. Total length is the hallmark of a passphrase: the longer it is, the more secure it is.

But beware of reusing a well-known proverb or a sentence from a book. Many of these are already stored in so-called rainbow tables and are routinely tried by crackers.

A new passphrase every microsecond

The following line can generate a million new, grammatically correct sentences per second for use as passphrases. The randomness is extremely high, and yet most of them are quite memorable (read below for a detailed analysis):

Passphrases should be easy to remember

The human brain evolved to deal with the real world and has no problem memorizing larger contexts. Mnemonics exploit this fact and help us learn passphrases.

For thousands of years, knowledge and events were passed on orally from one generation to the next, and the medium for that was stories. As Terry Pratchett often remarked, humans are Pan narrans (the narrating ape) rather than Homo sapiens. The brain is a pattern-recognizing machine: it finds patterns (or makes them up) both when observing the environment and in random sentences. Even better: the more unusual these sentences are, the better they can be memorized!

Passphrases must also be random enough!

Ideally you form a complete sentence from random words. Sentences form a mental unit and can be memorized considerably better.
In one of his comics, Randall Munroe illustrated the problems of conventional password patterns and popularized the idea of passphrases:

Passphrases were discussed in detail by Jesper Johansson of Microsoft as early as 2004. A study from Cambridge (MA) showed how (in)secure well-known or self-constructed phrases are. It is essential to choose random words, since the active vocabulary often has fewer than 2000 words and the brain tends to choose "neighboring" words (adjectives and verbs that fit the nouns, e.g. "red sun"), which don't add much randomness.

The generator on this page uses only common words; unusual words could increase entropy, but they also increase the likelihood of input errors. The secret lies more in the combination of truly randomly selected words. A powerful way to increase security even further is to replace individual characters with numbers or symbols, as seen in the comic. Common leetspeak replacements (e.g. a by @ and S by $) are well known to hackers. On their own they add more confusion than security to passwords (they merely increase the size of the character set from 64 to about 75). But when used to modify some words in a phrase, they effectively swap the dictionary for one of completely unknown words, which renders a dictionary attack useless.

But passphrases can be quite long!

Due to their length, phrases are impractical for frequently used passwords. Another disadvantage is the increased likelihood of typing errors (although that rises only in proportion to the length). A different method, suited to music enthusiasts, for generating moderately long but still relatively secure passwords is to collect the first letters of the chorus or, better, of an arbitrary verse from a familiar song. Thus "Oooh Baby I love your way" yields e.g. "OBIlyw".
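The first-letter method can be sketched in a few lines of Python. This is only an illustration: the function name first_letters is made up for this example, and it simply keeps the case of each word's first character.

```python
def first_letters(line: str) -> str:
    """Build a password from the first character of every word in a song line."""
    # split() breaks the line at any run of whitespace and drops empty pieces
    return "".join(word[0] for word in line.split())


print(first_letters("Oooh Baby I love your way"))  # OBIlyw
```

A longer verse gives a longer, and therefore stronger, result.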

Passwords must be unique per web site!

A huge security risk lies with the operators of websites and applications. Web servers are hacked regularly, and e-mail addresses and other customer data are stolen, often including passwords. Hackers routinely use a stock of about half a billion known passwords, in the form of so-called rainbow tables, to attack even hashed password lists.

A compromised password for a single website is already a major nuisance. But if the very same password is used for many if not all websites, it is catastrophic.

With 'classical' passwords that mix lowercase and capital letters, numbers and special characters, users themselves are a considerable risk, because they often just combine these with a common word in simple ways. When numbers are used, they often come in pairs, counting up to at most 99, which yields an entropy of less than seven bits for two characters. Sometimes individual characters are replaced by numbers or special characters to increase randomness, which helps only a little, because these substitutions are well known. Measuring the randomness of such passwords regularly ends in disillusionment, yielding only a few million combinations. That is no great obstacle for hackers when the number of retries is not limited!

And for secure long-term encryption you need even longer passwords, with at least trillions of alternatives.
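A rough calculation illustrates how small this search space is. The sketch below assumes a hypothetical pattern of one common word, a two-digit number and one special character. The vocabulary size of 2000 and the 10 special characters are illustrative assumptions, not measured values.

```python
import math

# Illustrative assumptions, not measurements
COMMON_WORDS = 2000       # roughly the size of an active vocabulary
TWO_DIGIT_NUMBERS = 100   # 00 to 99
SPECIAL_CHARS = 10        # a handful of popular symbols such as ! or $


def entropy_bits(combinations: int) -> float:
    """Entropy in bits of a uniformly random choice among `combinations` options."""
    return math.log2(combinations)


# Independent choices multiply, so the search space is the product of the parts
pattern_space = COMMON_WORDS * TWO_DIGIT_NUMBERS * SPECIAL_CHARS

print(f"two digits alone: {entropy_bits(TWO_DIGIT_NUMBERS):.1f} bits")
print(f"word + number + symbol: {pattern_space:,} combinations, "
      f"{entropy_bits(pattern_space):.1f} bits")
```

Under these assumptions the pattern allows two million combinations, or roughly 21 bits.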

Math again?

We can also estimate the randomness, and thus the security, of passphrases. To obtain realistic values we need several models, each yielding the entropy for a different type of phrase.

One very simple model, for example, assumes that song titles are used as phrases. Although there are millions of songs, this is not a cryptographically significant number. In addition, the popularity of songs falls off exponentially, so an attacker gets a significant number of hits with only a few hundred titles. Moreover, the individual words in these titles are very stereotypical, so many hits can be found simply by permuting the most frequent words!

A quite secure phrasemaker

Computers are much better than humans at generating passphrases with high entropy. Phrasemongers are usually not popular, not even at parties, but we hope this one is better liked:

Every page request generates a unique combination of words that forms a correct German sentence. Although they are nonsense, these sentences are usually much easier to memorize than cryptic passwords. Their only disadvantage is their length, which shouldn't be a problem for people who have learned to type with all ten fingers. Furthermore, you can leave out words or replace them with shorter ones. This reduces security, but the sentences start from a very high value of about 92 bits. At least 50 bits are recommended, a value you already reach with about half of the words.
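How such a generator might work can be sketched in Python. This is not the code behind this page: the tiny word lists and the sentence template are placeholders invented for illustration, and the sketch produces English rather than German sentences. The one essential ingredient is a cryptographically secure random source, here Python's standard secrets module.

```python
import secrets

# Placeholder word lists; a real generator would load thousands of words per class.
ADJECTIVES = ["red", "quiet", "heavy", "early"]
NOUNS = ["sun", "ladder", "teacher", "river"]
PLACES = ["Oslo", "Peru", "Cairo", "Kenya"]
VERBS = ["paints", "follows", "forgets", "visits"]
ADVERBS = ["slowly", "gladly", "rarely", "loudly"]


def make_passphrase() -> str:
    """Fill a fixed sentence template with independently chosen random words."""
    pick = secrets.choice  # draws from the operating system's secure random source
    return (
        f"The {pick(ADJECTIVES)} {pick(NOUNS)} from {pick(PLACES)} "
        f"{pick(VERBS)} the {pick(ADJECTIVES)} {pick(NOUNS)} "
        f"in {pick(PLACES)} {pick(ADVERBS)}"
    )


print(make_passphrase())
```

Each word is drawn independently, so no slot gives away anything about the others. With lists this small the result is of course far too weak for real use.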

By the way, you should stay suspicious and not use these phrases unchanged when your data is at risk. We don't store these phrases, nor can we reproduce them: the cryptographic random number generator won't let us. But even if you trust us, you should definitely switch to https in this URL when requesting a phrase. Otherwise a router or server along the way could pick it up.

We can take no responsibility for the words used in this phrase; they are taken at random from the English Wiktionary:

Words from the following sets are used to generate these phrases:

11770 nouns: about 13 bits
920 adjectives: about 10 bits
4790 cities and countries: about 12 bits
7170 verbs: about 13 bits
1120 adverbs: about 10 bits

This easily adds up to about 92 bits for the generated sample sentences (two each of nouns, adjectives and places, plus one verb and one adverb), but this is only a lower limit on the entropy. When the language or the dictionary is unknown, the hacker's uncertainty is significantly higher, since hundreds of thousands of words need to be tested.
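The estimate can be checked with a short calculation. The sketch below takes the word counts from the list above and assumes that every word is drawn independently and uniformly, so the bits per slot simply add up. The names WORD_COUNTS and TEMPLATE are chosen for this example.

```python
import math

# Word counts per class, as listed above
WORD_COUNTS = {
    "noun": 11770,
    "adjective": 920,
    "place": 4790,
    "verb": 7170,
    "adverb": 1120,
}

# How often each class appears in one generated sentence
TEMPLATE = {"noun": 2, "adjective": 2, "place": 2, "verb": 1, "adverb": 1}


def sentence_entropy_bits() -> float:
    """Sum log2(list size) over every slot of the sentence template."""
    return sum(
        slots * math.log2(WORD_COUNTS[word_class])
        for word_class, slots in TEMPLATE.items()
    )


for word_class, count in WORD_COUNTS.items():
    print(f"{word_class:<10} {math.log2(count):5.1f} bits")
print(f"whole sentence: {sentence_entropy_bits():.1f} bits")
```

With unrounded logarithms the total comes out a little above the rounded figure quoted in the text.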

But even when the dictionaries are unknown, hackers can easily reconstruct them once the source of the phrases is known. They just poll the generator until they have a representative sample of all word classes and thus reduce the entropy to the aforementioned 92 bits.

A more effective strategy is to use additional special characters or leetspeak, because you then create completely new dictionaries.