A regular expression, or regex, is a pattern that describes a set of strings: the strings that match the pattern.
Basic Regex
Expression | Description | Ref |
---|---|---|
\ | Escape character | |
+, *, ? | Quantifiers | [[#quantifiers---- |
. | Wildcard: matches any single character | [[#wildcard- |
^ | Matches at the start of a line, e.g. ^pattern | |
$ | Matches at the end of a line, e.g. pattern$ | |
[ ] | Matches any one character listed in the brackets; [xy] matches either x or y | |
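As a rough illustration of the anchors, brackets, and escape character in the table above, the sketch below runs grep over a small in-line word list. The words and the variable names are made up for this example and are not taken from any file mentioned in this note.

```shell
# Hypothetical sample data, one word per line (made up for illustration).
words=$(printf '%s\n' 'cat' 'cot' 'cut' 'car' 'bat' 'scatter' 'a.b' 'axb')

# ^ anchors the match to the start of the line.
starts_with_c=$(printf '%s\n' "$words" | grep '^c')

# $ anchors the match to the end of the line.
ends_with_t=$(printf '%s\n' "$words" | grep 't$')

# [ao] matches exactly one character that is either a or o.
a_or_o=$(printf '%s\n' "$words" | grep '^c[ao]t$')

# \. matches a literal dot rather than "any character".
literal_dot=$(printf '%s\n' "$words" | grep 'a\.b')

printf 'starts with c:\n%s\n' "$starts_with_c"
printf 'ends with t:\n%s\n' "$ends_with_t"
printf 'c[ao]t:\n%s\n' "$a_or_o"
printf 'literal dot:\n%s\n' "$literal_dot"
```

Single quotes around each pattern keep the shell from interpreting characters such as `$` and `\` before grep sees them.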
Quantifiers (+ ,*, ?)
Quantifiers allow you to specify how many times a character or group of characters should be matched.
`+`: Matches one or more occurrences.
`*`: Matches zero or more occurrences.
`?`: Matches zero or one occurrence.
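For example, the `?` in the following command makes the preceding `u` optional, so it matches lines containing zero or one `u` at that position. The `-E` option selects extended regular expressions; in grep's default basic syntax an unescaped `?` is a literal question mark.
grep -E "colou?r" colors.txt
In this case grep finds all lines containing either “color” or “colour”, because the character before the `?`, the `u`, is optional.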
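To compare the three quantifiers side by side, here is a minimal sketch that applies each one to the same letter. The sample lines and variable names are hypothetical, and the patterns are anchored with `^` and `$` only so that each result shows exactly which whole lines match.

```shell
# Hypothetical sample lines (made up for this example).
lines=$(printf '%s\n' 'color' 'colour' 'colouur' 'colr')

# u? : zero or one u before the final r.
zero_or_one=$(printf '%s\n' "$lines" | grep -E '^colou?r$')

# u+ : one or more u before the final r.
one_or_more=$(printf '%s\n' "$lines" | grep -E '^colou+r$')

# u* : zero or more u before the final r.
zero_or_more=$(printf '%s\n' "$lines" | grep -E '^colou*r$')

printf 'u?:\n%s\n' "$zero_or_one"
printf 'u+:\n%s\n' "$one_or_more"
printf 'u*:\n%s\n' "$zero_or_more"
```

With this input, `u?` keeps “color” and “colour”, `u+` keeps “colour” and “colouur”, and `u*` keeps all three; “colr” never matches because the second “o” is missing.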
Wildcard (.)
The dot `.` matches any single character. For example, to find lines containing a three-character sequence whose middle letter is “a” and whose first and third characters can be anything:
grep ".a." words.txt
You can use multiple dots to match any character at specific positions. For example, to find four-character sequences where the first and third letters are “a” and the second and fourth can be any character:
grep "a.a." words.txt
This will match words like “alas”.
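One detail worth seeing in practice: grep prints any line that contains a match, so an unanchored dot pattern also hits longer words. The sketch below contrasts anchored and unanchored versions on a made-up word list; the data and variable names are assumptions for illustration only.

```shell
# Hypothetical word list, one word per line (made up for illustration).
words=$(printf '%s\n' 'alas' 'bat' 'banana' 'abc' 'at')

# Unanchored: any line containing <any char>a<any char>.
anywhere=$(printf '%s\n' "$words" | grep '.a.')

# Anchored: the whole line must be exactly three characters with a in the middle.
three_letters=$(printf '%s\n' "$words" | grep '^.a.$')

# Anchored: exactly four characters, with a in the first and third positions.
four_letters=$(printf '%s\n' "$words" | grep '^a.a.$')

printf 'unanchored .a.:\n%s\n' "$anywhere"
printf 'anchored ^.a.$:\n%s\n' "$three_letters"
printf 'anchored ^a.a.$:\n%s\n' "$four_letters"
```

Here the unanchored pattern keeps “alas”, “bat”, and “banana”, while the anchored three-character pattern keeps only “bat” and the anchored four-character pattern keeps only “alas”.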