Character classes.
?? matches "?"
?v matches vowels: "aeiouAEIOU"
?c matches consonants: "bcdfghjklmnpqrstvwxyzBCDFGHJKLMNPQRSTVWXYZ"
?w matches whitespace: space and horizontal tabulation characters
?p matches punctuation: ".,:;'?!`" and the double quote character
?s matches symbols "$%^&*()-_+=|\<>[]{}#@/~"
?l matches lowercase letters [a-z]
?u matches uppercase letters [A-Z]
?d matches digits [0-9]
?a matches letters [a-zA-Z]
?x matches letters and digits [a-zA-Z0-9]
?o matches control characters
?y matches valid characters
?z matches all characters
?b matches characters with 8th bit set (mnemonic "b for binary")
?N where N is 0...9 are user-defined character classes. They match characters
as defined in john.conf, section [UserClasses]
NOTE 2: the rules engine currently has very limited understanding of UTF-8, so
character classes etc. will only work with ASCII characters, even if using
--encoding=utf-8.
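As an illustration of the fixed ASCII classes listed above, here is a minimal
Python sketch. It is not John's implementation: CLASSES and matches() are
hypothetical names, and the classes whose exact contents are not spelled out
above (?o, ?y, ?z, ?b, ?0...?9) are left out.

```python
# Illustrative lookup for the ASCII character classes listed above.
# CLASSES and matches() are hypothetical names, not part of John the Ripper.
import string

VOWELS = "aeiouAEIOU"
CLASSES = {
    "v": set(VOWELS),
    "c": set(string.ascii_letters) - set(VOWELS),
    "w": set(" \t"),
    "p": set(".,:;'?!`\""),
    "s": set("$%^&*()-_+=|\\<>[]{}#@/~"),
    "l": set(string.ascii_lowercase),
    "u": set(string.ascii_uppercase),
    "d": set(string.digits),
    "a": set(string.ascii_letters),
    "x": set(string.ascii_letters + string.digits),
}


def matches(class_letter: str, ch: str) -> bool:
    """Return True if ch belongs to the class written as ?<class_letter>."""
    if class_letter == "?":  # "??" matches a literal question mark
        return ch == "?"
    return ch in CLASSES[class_letter]
```

Because every set holds ASCII characters only, a letter such as "é" matches
none of these classes, which mirrors the limitation in the note above.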
Simple commands.
NOTE: all of these are encoding-aware. E.g., if you do not specify an encoding,
the l command will lowercase A-Z only. If you use --encoding=iso-8859-1, it will
also recognise ÀÁÂÃÄÅÆÇÈÉÊËÌÍÎÏÐÑÒÓÔÕÖØÙÚÛÜÝÞ and lowercase them properly.
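The difference can be sketched in Python. This only emulates the behaviour
described in the note; lower_ascii() and lower_latin1() are hypothetical
helpers, and Python's own str.lower() stands in for John's ISO-8859-1 case
tables.

```python
def lower_ascii(word: str) -> str:
    """Lowercase A-Z only, as when no encoding is specified."""
    return "".join(chr(ord(c) + 32) if "A" <= c <= "Z" else c for c in word)


def lower_latin1(word: str) -> str:
    """Also lowercase ISO-8859-1 uppercase letters such as À or Þ."""
    # Only code points below 256 exist in ISO-8859-1; others pass through.
    return "".join(c.lower() if ord(c) < 256 else c for c in word)
```

With these helpers, lower_ascii("ÀBC") leaves the "À" untouched, while
lower_latin1("ÀBC") lowercases all three letters.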
String commands.
To append a string, specify "z" for the position. To prefix the word
with a string, specify "0" for the position.
Also see the "X" command (extract and insert substring) under "Memory
access commands" below.
Note that square brackets ("[" and "]") are special characters to the
preprocessor: you should escape them with a backslash ("\") if using
these commands.
Memory access commands.
M memorize the word (for use with "Q", "X", or to update "m")
Q query the memory and reject the word unless it has changed
XNMI extract substring NM from memory and insert into current word at I
If "Q" or "X" are used without a preceding "M", they read from the
initial "word". In other words, you may assume an implied "M" at the
start of each rule, and there's no need to ever start a rule with "M"
(that "M" would be a no-op). The only reasonable use for "M" is in the
middle of a rule, after some commands have possibly modified the word.
The intended use for the "Q" command is to help avoid duplicate
candidate passwords that could result from multiple similar rules. For
example, if you have the rule "l" (lowercase) somewhere in your ruleset
and you want to add the rule "lr" (lowercase and reverse), you could
instead write the latter as "lMrQ" in order to avoid producing duplicate
candidate passwords for palindromes.
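The effect of "lMrQ" can be emulated with a few lines of Python. This is a
sketch of the behaviour described above, not John's code; rule_lr_unique() is
a hypothetical name, and None stands for a rejected word.

```python
from typing import Optional


def rule_lr_unique(word: str) -> Optional[str]:
    """Emulate "lMrQ": lowercase, memorize, reverse, reject if unchanged."""
    current = word.lower()   # l: lowercase
    memory = current         # M: memorize the lowercased word
    current = current[::-1]  # r: reverse
    if current == memory:    # Q: reject unless the word has changed
        return None
    return current
```

For "Level", the lowercased word is a palindrome, so the rule rejects it and
only the plain "l" rule produces "level". For "Hello", the result is "olleh".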
The "X" command extracts a substring from memory (or from the initial
word if "M" was never used) starting at position N (in the memorized or
initial word) and going for up to M characters. It inserts the
substring into the current word at position I. The target position may
be "z" for appending the substring, "0" for prefixing the word with it,
or it may be any other valid numeric constant or variable. Some example
uses, assuming that we're at the start of a rule or after an "M", would
be "X011" (duplicate the first character), "Xm1z" (duplicate the last
character), "dX0zz" (triplicate the word), "<4X011X113X215" (duplicate
every character in a short word), ">9x5zX05z" (rotate long words left by
5 characters, same as ">9{{{{{"), ">9vam4Xa50'l" (rotate right by 5
characters, same as ">9}}}}}").
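Some of the examples above can be traced with a small Python emulation. The
sketch makes several assumptions: x_command() is a hypothetical helper,
positions and lengths are passed as plain integers, "z" is represented by the
relevant string length, and "m" is taken to be the last character position of
the memorized word, as the "Xm1z" example implies.

```python
def x_command(word: str, memory: str, n: int, m: int, i: int) -> str:
    """Insert up to m characters of memory, starting at n, into word at i."""
    piece = memory[n:n + m]
    return word[:i] + piece + word[i:]


initial = "abc"  # at the start of a rule, memory holds the initial word

# "X011": duplicate the first character
first_dup = x_command(initial, initial, 0, 1, 1)

# "Xm1z": duplicate the last character ("m" = last position, "z" = append)
last_dup = x_command(initial, initial, len(initial) - 1, 1, len(initial))

# "dX0zz": "d" duplicates the word, then "X0zz" appends the memorized word
doubled = initial + initial
tripled = x_command(doubled, initial, 0, len(initial), len(doubled))

# "<4X011X113X215": duplicate every character of a three-character word
step = x_command(initial, initial, 0, 1, 1)
step = x_command(step, initial, 1, 1, 3)
every_dup = x_command(step, initial, 2, 1, 5)
```

Note that the memory stays "abc" throughout each rule, while the insert
positions refer to the current, already modified word.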
Numeric commands.
vVNM update "l" (length), then subtract M from N and assign to variable V
"l" is set to the current word's length, and its new value is usable by
this same command (if N and/or M is also "l").
V must be one of "a" through "k". N and M may be any valid numeric
constants or initialized variables. It is OK to refer to the same
variable in the same command more than once, even three times. For
example, "va00" and "vaaa" will both set the variable "a" to zero (but
the latter will require "a" to have been previously initialized),
whereas "vil2" will set the variable "i" to the current word's length
minus 2. If "i" is then used as a character position before the word is
modified further, it will refer to the second character from the end.
It is OK for intermediate variable values to become negative, but such
values should not be directly used as positions or lengths. For
example, if we follow our "vil2" somewhere later in the same rule with
"vj02vjij", we'll set "j" to "i" plus 2, or to the word's length as of
the time of processing of the "vil2" command earlier in the rule.
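The arithmetic above can be checked with a short Python sketch. It is only an
emulation under stated assumptions: v_command() and numeric_value() are
hypothetical helpers, only the single-digit constants "0" through "9" are
handled, and variables live in an ordinary dict.

```python
def numeric_value(token: str, variables: dict) -> int:
    """Resolve a digit constant or a previously initialized variable."""
    if token.isdigit():
        return int(token)
    return variables[token]  # raises KeyError if the variable is uninitialized


def v_command(variables: dict, word: str, v: str, n: str, m: str) -> None:
    """Emulate "vVNM": update "l", then assign N minus M to variable V."""
    if v not in "abcdefghijk":
        raise ValueError('V must be one of "a" through "k"')
    variables["l"] = len(word)
    variables[v] = numeric_value(n, variables) - numeric_value(m, variables)


word = "password"
variables = {}
v_command(variables, word, "i", "l", "2")  # vil2: i = length - 2
v_command(variables, word, "j", "0", "2")  # vj02: j = 0 - 2 (negative, fine)
v_command(variables, word, "j", "i", "j")  # vjij: j = i - (-2) = i + 2
```

After these three commands "i" is 6, which indexes the "r" in "password" (the
second character from the end), and "j" is back to the word's length, 8.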
Note that U will accept plain ASCII. It will only reject words that contain
8-bit characters but can't be parsed as UTF-8. It can be used to reject
invalid input words, or to reject invalid output words after applying other
commands.
When defining "single crack" mode rules, extra commands are available
for word pairs support, to control if other commands are applied to the
first, the second, or to both words:
1 first word
2 second word
+ the concatenation of both
If you use some of the above commands in a rule, it will only process
word pairs (e.g., full names from the GECOS field) and reject single
words. A "+" is assumed at the end of any rule that uses some of these
commands, unless you specify it manually. For example, "1l2u" will
convert the first word to lowercase, the second one to uppercase, and
use the concatenation of both. The use for a "+" might be to apply some
more commands: "1l2u+r" will reverse the concatenation of both words,
after applying some commands to them separately.
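The two example rules can be emulated as follows. This is a sketch of the
described behaviour only; the function names are hypothetical, and the word
pair is passed in as two separate strings.

```python
def rule_1l2u(first: str, second: str) -> str:
    """Emulate "1l2u": lowercase the first word, uppercase the second, join."""
    return first.lower() + second.upper()  # the implied "+" concatenates


def rule_1l2u_plus_r(first: str, second: str) -> str:
    """Emulate "1l2u+r": as above, then reverse the concatenation."""
    return rule_1l2u(first, second)[::-1]
```

For the pair "John" and "Smith", the first rule yields "johnSMITH" and the
second yields "HTIMSnhoj".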
The preprocessor is used to combine similar rules into one source line.
For example, if you need to make John try lowercased words with digits
appended, you could write a rule for each digit, 10 rules total. Now
imagine appending two-digit numbers - the configuration file would get
large and ugly.
With the preprocessor you can do these things more easily. Simply write one
source line containing the common part of these rules followed by the
list of characters you would have put into separate rules, in square
brackets (the way you would do in a regexp). The preprocessor will then
generate the rules for you (at John startup for syntax checking, and
once again while cracking, but never keeping all of the expanded rules
in memory). For the examples above, the source lines will be "l$[0-9]"
(lowercase and append a digit) and "l$[0-9]$[0-9]" (lowercase and append
two digits). These source lines will be expanded to 10 and 100 rules,
respectively. By the way, preprocessor commands are processed
right-to-left while character lists are processed left-to-right, which
results in natural ordering of numbers in the above examples and in
other typical cases. Note that arbitrary combinations of character
ranges and character lists are valid. For example, "[aeiou]" will use
vowels, whereas "[aeiou0-9]" will use vowels and digits. If you need to
have John try vowels followed by all other letters, you can use
"[aeioua-z]" - the preprocessor is smart enough not to produce duplicate
rules in such cases (although this behavior may be disabled with the
"\r" magic escape sequence described below).
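The expansion described above can be approximated in Python. This is a
simplified sketch, not John's preprocessor: expand() and parse_list() are
hypothetical names, backslash escapes are not handled, and the output order
(rightmost list varying fastest) is an assumption chosen to give the natural
ordering of numbers mentioned above.

```python
from itertools import product


def parse_list(body: str) -> list:
    """Expand a bracket body such as "aeioua-z" into unique characters."""
    chars = []
    i = 0
    while i < len(body):
        if i + 2 < len(body) and body[i + 1] == "-":
            # A range such as "0-9" or "a-z"
            chars.extend(chr(c) for c in range(ord(body[i]), ord(body[i + 2]) + 1))
            i += 3
        else:
            chars.append(body[i])
            i += 1
    return list(dict.fromkeys(chars))  # drop duplicates, keep first occurrence


def expand(source: str) -> list:
    """Expand one source line into the list of rules it stands for."""
    parts = []
    i = 0
    while i < len(source):
        if source[i] == "[":
            end = source.index("]", i)
            parts.append(parse_list(source[i + 1:end]))
            i = end + 1
        else:
            parts.append([source[i]])
            i += 1
    return ["".join(combo) for combo in product(*parts)]
```

Here expand("l$[0-9]") gives 10 rules and expand("l$[0-9]$[0-9]") gives 100,
starting with "l$0$0" and "l$0$1", while expand("[aeioua-z]") gives the five
vowels followed by the remaining 21 letters, with no duplicates.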
Please refer to the default configuration file for John the Ripper for
many example uses of the features described in here.