Regular Expressions For Regular Folk

Character Escapes

Character escapes act as shorthands for some common character classes.

Digit character — \d

The character escape \d matches digit characters, from 0 to 9. It is equivalent to the character class [0-9].
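As a quick check of that equivalence, the sketch below counts matches for \d and for [0-9] on a few sample strings. It is written in TypeScript using JavaScript regex syntax. The helper name countMatches is made up for illustration. Other engines may differ, as some Unicode-aware engines let \d match non-ASCII digits as well.

```typescript
// Hypothetical helper: count all non-overlapping matches of a global regex.
function countMatches(pattern: RegExp, text: string): number {
  const found = text.match(pattern); // null when nothing matches
  return found === null ? 0 : found.length;
}

const samples: string[] = ["2020", "100/100", "It costs $5.45", "3.14159"];

// Each entry pairs the count for \d with the count for [0-9].
const counts: number[][] = samples.map((s) => [
  countMatches(/\d/g, s),
  countMatches(/[0-9]/g, s),
]);
// counts is [[4, 4], [6, 6], [3, 3], [6, 6]]
```

Both columns agree on every sample, which is what the equivalence predicts for plain ASCII input.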

/\d/g
  • 4 matches: 2020
  • 6 matches: 100/100
  • 3 matches: It costs $5.45
  • 6 matches: 3.14159
/\d\d/g
  • 2 matches: 2020
  • 2 matches: 100/100
  • 1 match: It costs $5.45
  • 2 matches: 3.14159
Note

In 3.14159, 59 is also a pair of digits, but it is not counted: the 5 was already consumed by the match 15, and most engines look for non-overlapping matches from left to right by default.
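To see the left-to-right, non-overlapping behaviour in practice, here is a minimal TypeScript sketch (JavaScript regex syntax) that lists the actual matches of \d\d rather than just counting them. The variable names are illustrative only.

```typescript
// With the g flag, String.prototype.match returns every non-overlapping
// match, scanning from left to right. It returns null when nothing matches,
// so fall back to an empty array.
const pairsInPi: string[] = "3.14159".match(/\d\d/g) || [];
// pairsInPi is ["14", "15"]: once "15" is consumed only "9" is left,
// so "59" is never reported.

const pairsInScore: string[] = "100/100".match(/\d\d/g) || [];
// pairsInScore is ["10", "10"]: the trailing "0" on each side has no partner.
```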

\D is the negation of \d and is equivalent to [^0-9].

/\D/g
  • 0 matches: 2020
  • 1 match: 100/100
  • 11 matches: It costs $5.45
  • 1 match: 3.14159
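
One common use of \D is stripping everything that is not a digit. The TypeScript sketch below (JavaScript regex syntax) does this with String.prototype.replace. The function name digitsOnly is a hypothetical helper, not part of any library.

```typescript
// Hypothetical helper: delete every character matched by \D,
// which leaves only the digits 0-9 behind.
function digitsOnly(input: string): string {
  return input.replace(/\D/g, "");
}

const fromScore: string = digitsOnly("100/100");        // "100100"
const fromPrice: string = digitsOnly("It costs $5.45"); // "545"
const fromYear: string = digitsOnly("2020");            // "2020" (nothing to remove)
```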

Word character — \w

The escape \w matches characters deemed “word characters”. These include:

  • lowercase alphabet — a-z
  • uppercase alphabet — A-Z
  • digits — 0-9
  • underscore — _

It is thus equivalent to the character class [a-zA-Z0-9_].

\W is the negation of \w and is equivalent to [^a-zA-Z0-9_].

/\w/g
  • 6 matches: john_s
  • 7 matches: matej29
  • 6 matches: Ayesha?!
  • 4 matches: 4952
  • 4 matches: LOUD
  • 4 matches: lo-fi
  • 6 matches: get out
  • 6 matches: 21*2 = 42(1)
/\W/g
  • 0 matches: john_s
  • 2 matches: Ayesha?!
  • 0 matches: 4952
  • 0 matches: LOUD
  • 1 match: lo-fi
  • 1 match: get out
  • 3 matches: ;-;
  • 6 matches: 21*2 = 42(1)
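
Because \W matches exactly what \w does not, the two together account for every character in a string. The following TypeScript sketch (JavaScript regex syntax) checks this on one of the samples above. countMatches is an illustrative helper name, and the equivalence with [a-zA-Z0-9_] is assumed to hold for the engine in use, as it does for JavaScript's default mode.

```typescript
// Hypothetical helper: count all non-overlapping matches of a global regex.
function countMatches(pattern: RegExp, text: string): number {
  const found = text.match(pattern); // null when nothing matches
  return found === null ? 0 : found.length;
}

const expression: string = "21*2 = 42(1)";

const wordCount: number = countMatches(/\w/g, expression);             // 6
const classCount: number = countMatches(/[a-zA-Z0-9_]/g, expression);  // 6
const nonWordCount: number = countMatches(/\W/g, expression);          // 6

// Every character is either a word character or a non-word character.
const coversAll: boolean = wordCount + nonWordCount === expression.length; // true
```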

Whitespace character — \s

The escape \s matches whitespace characters. The exact set of characters matched depends on the regex engine, but most include at least:

  • space
  • tab — \t
  • carriage return — \r
  • new line — \n
  • form feed — \f

Many also include vertical tabs (\v). Unicode-aware engines usually match all characters in the separator category.

The technicalities, however, will usually not be important.

\S is the negation of \s and matches any non-whitespace character.

/\s/g
  • 1 match: word word
  • 2 matches: tabs vs spaces
  • 0 matches: snake_case.jpg
/\S/g
  • 8 matches: word word
  • 12 matches: tabs vs spaces
  • 14 matches: snake_case.jpg
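
The whitespace characters listed earlier can be checked one by one. The TypeScript sketch below (JavaScript regex syntax) tests each of them against \s, then counts \s and \S matches in a sample string. JavaScript engines happen to include the vertical tab, but other engines may not. The sample string is written here with a real tab after "tabs", which is an assumption about the original example.

```typescript
// Space, tab, carriage return, new line, form feed, vertical tab.
const candidates: string[] = [" ", "\t", "\r", "\n", "\f", "\v"];

// True when every candidate is matched by \s in this engine.
const allWhitespace: boolean = candidates.every((c) => /\s/.test(c));

// Assumed sample: a tab between "tabs" and "vs", a space before "spaces".
const sample: string = "tabs\tvs spaces";
const whitespaceCount: number = (sample.match(/\s/g) || []).length;    // 2
const nonWhitespaceCount: number = (sample.match(/\S/g) || []).length; // 12
```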

Any character — .

While not a typical character escape, . matches any character[1].

/./g
  • 6 matches: john_s
  • 8 matches: Ayesha?!
  • 4 matches: 4952
  • 4 matches: LOUD
  • 5 matches: lo-fi
  • 7 matches: get out
  • 3 matches: ;-;
  • 12 matches: 21*2 = 42(1)

  1. Except the newline character \n. This can be changed using the “dotAll” flag, if supported by the regex engine in question.
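
The effect of the “dotAll” flag can be sketched in TypeScript (JavaScript regex syntax) as follows. This assumes a runtime that supports the s flag, which JavaScript added in ES2018. The pattern is built with the RegExp constructor, which is equivalent to writing /./gs as a literal. The variable names are illustrative.

```typescript
const twoLines: string = "a\nb";

// Without dotAll, the dot skips the newline between "a" and "b".
const withoutDotAll: string[] = twoLines.match(/./g) || [];
// withoutDotAll is ["a", "b"]

// With dotAll (the s flag), the dot matches the newline as well.
const dotAllPattern: RegExp = new RegExp(".", "gs");
const withDotAll: string[] = twoLines.match(dotAllPattern) || [];
// withDotAll is ["a", "\n", "b"]
```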