Regular expressions are patterns used to match character combinations in strings. In MaSH, regular expressions can be used to perform text search and replace operations. See here for more information on search and replace.
Creating a regular expression
To compose a regular expression in MaSH, write the pattern as an unquoted string between two forward slashes. The final slash can be followed by optional modifiers.
Natural
set regex to /pattern/im
Standard
regex = /pattern/im
Patterns
Brackets are used to match one character from a set or range of characters:
Expression | Description |
---|---|
[abc] | Find one character from the options between the brackets |
[^abc] | Find any character NOT between the brackets |
[0-9] | Find one character from the range 0 to 9 |
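As a rough illustration of how bracket expressions behave, the sketch below uses Python's `re` module as a stand-in. It assumes MaSH follows the same conventional bracket semantics; the sample string and variable names are made up for the example.

```python
import re

text = "cab 42"

# [abc] matches one character that is a, b, or c
in_set = re.findall(r"[abc]", text)

# [^abc] matches one character that is NOT a, b, or c (spaces and digits included)
not_in_set = re.findall(r"[^abc]", text)

# [0-9] matches one character in the range 0 to 9
digits = re.findall(r"[0-9]", text)

print(in_set)      # ['c', 'a', 'b']
print(not_in_set)  # [' ', '4', '2']
print(digits)      # ['4', '2']
```

Each bracket expression consumes exactly one character, which is why every result is a list of single-character strings.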
Modifiers
Modifiers can change how a search is performed.
Modifier | Description |
---|---|
i | Performs a case-insensitive search |
m | Performs a multiline search (patterns that search for the beginning or end of a string will match the beginning or end of each line) |
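To make the effect of the two modifiers concrete, here is a small sketch in Python, used only as a stand-in. It assumes MaSH's `i` and `m` behave like Python's `re.IGNORECASE` and `re.MULTILINE` flags; the sample text is hypothetical.

```python
import re

text = "first line\nHello world"

# No modifiers: the search is case-sensitive and ^ only matches
# at the start of the whole string
plain = re.search(r"^hello", text)

# i only: case is ignored, but ^ is still tied to the start of the string
only_i = re.search(r"^hello", text, re.IGNORECASE)

# i and m together: ^ now also matches at the start of each line
i_and_m = re.search(r"^hello", text, re.IGNORECASE | re.MULTILINE)

print(plain is not None)    # False
print(only_i is not None)   # False
print(i_and_m is not None)  # True
```

The pattern only matches once both modifiers are active, because "Hello" is capitalized and sits at the start of the second line rather than the start of the string.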
Metacharacters
Metacharacters are characters with a special meaning:
Metacharacter | Description |
---|---|
\| | Find a match for any one of the patterns separated by \|, as in: cat\|dog\|fish |
. | Find just one instance of any character |
^ | Find a match at the beginning of a string, as in: ^Hello |
$ | Find a match at the end of a string, as in: World$ |
\d | Find a digit |
\s | Find a whitespace character |
\b | Find a match at the beginning of a word like this: \bWORD, or at the end of a word like this: WORD\b |
\uxxxx | Find the Unicode character specified by the hexadecimal number xxxx |
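The metacharacters in the table can be tried out with the following sketch. It is written in Python as a stand-in and assumes MaSH interprets these metacharacters in the conventional way; `found` is a hypothetical helper, not a MaSH function.

```python
import re

def found(pattern, text):
    """Return True if the pattern matches anywhere in the text."""
    return re.search(pattern, text) is not None

print(found(r"cat|dog|fish", "hot dog"))  # True: one of the alternatives matches
print(found(r"h.t", "hat"))               # True: . stands for any one character
print(found(r"^Hello", "Hello World"))    # True: string starts with Hello
print(found(r"World$", "Hello World"))    # True: string ends with World
print(found(r"\d", "room 7"))             # True: contains a digit
print(found(r"\s", "nospace"))            # False: no whitespace character
print(found(r"\bcat", "concat"))          # False: "cat" does not begin a word
print(found(r"cat\b", "concat"))          # True: "cat" ends a word
print(found(r"\u00e9", "caf\u00e9"))      # True: U+00E9 is the letter e with acute accent
```

The two `\b` lines show that the same text can pass or fail depending on which side of the word the boundary is placed.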
Quantifiers
Quantifiers define quantities:
Quantifier | Description |
---|---|
n+ | Matches any string that contains at least one n |
n* | Matches any string that contains zero or more occurrences of n |
n? | Matches any string that contains zero or one occurrences of n |
n{x} | Matches any string that contains a sequence of exactly x n’s |
n{x,y} | Matches any string that contains a sequence of x to y n’s |
n{x,} | Matches any string that contains a sequence of at least x n’s |
If your expression needs to search for one of the special characters, you can escape it with a backslash ( \ ). For example, to search for one or more question marks you can use the following expression:
pattern = /\?+/
Quantifiers don’t just apply to literal characters; you can also use them with metacharacters. For example, to search for two consecutive digits you can use the following expression:
pattern = /\d{2}/
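The quantifiers and the escaping rule can be checked with a short sketch. It uses Python's `re` module as a stand-in and assumes MaSH quantifiers follow the same conventional (greedy) behaviour; the helper `found` and the sample strings are hypothetical.

```python
import re

def found(pattern, text):
    """Return True if the pattern matches anywhere in the text."""
    return re.search(pattern, text) is not None

print(found(r"o+", "brown"))     # True: at least one o
print(found(r"o+", "brwn"))      # False: no o at all
print(found(r"o*", "brwn"))      # True: zero occurrences are allowed
print(found(r"^colou?r$", "color"))   # True: the u is optional
print(found(r"^colou?r$", "colour"))  # True

print(found(r"\d{2}", "a1b2"))   # False: the digits are not consecutive
print(found(r"\d{2}", "a12b"))   # True

print(re.findall(r"\d{2,3}", "1 22 333 4444"))  # ['22', '333', '444']
print(re.findall(r"\d{2,}", "1 22 333 4444"))   # ['22', '333', '4444']

# An escaped question mark is a literal ?, so \?+ means "one or more question marks"
print(re.findall(r"\?+", "What?? Really?"))     # ['??', '?']
```

Note how `\d{2,3}` takes only the first three digits of "4444", while `\d{2,}` has no upper limit and takes all four.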
matches: using regular expressions with conditionals
We can use the matches operator inside conditionals in our mashlets to execute code when a string matches a regular expression. The pattern may match anywhere in the string unless it is anchored with ^ or $.
Natural
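set str to "How Now Brown Cow"
# The string starts with "How"
if str matches /^How/
    printline "Match"
end
# "Now" or "now" appears somewhere in the string
if str matches /[Nn]ow/
    printline "Match"
end
# An uppercase letter is followed by "ow"
if str matches /[A-Z]ow/
    printline "Match"
end
# A letter, digit, or underscore is followed by "ow"
if str matches /[A-Za-z_0-9]ow/
    printline "Match"
end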
Standard
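str = "How Now Brown Cow"
# The string starts with "How"
if (str matches /^How/) {
    printline("Match")
}
# "Now" or "now" appears somewhere in the string
if (str matches /[Nn]ow/) {
    printline("Match")
}
# An uppercase letter is followed by "ow"
if (str matches /[A-Z]ow/) {
    printline("Match")
}
# A letter, digit, or underscore is followed by "ow"
if (str matches /[A-Za-z_0-9]ow/) {
    printline("Match")
}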
Output
Match
Match
Match
Match
Grouping
You can use parentheses ( ) to group part of a pattern so that a quantifier applies to the whole group. Parentheses can also be used to select parts of the pattern to be used as a match.
Natural
# Use grouping to search for the word "banana" by looking for ba followed by two instances of na:
set str to "Apples and bananas."
set pattern to /ba(na){2}/i
printline (str matches pattern)
Standard
# Use grouping to search for the word "banana" by looking for ba followed by two instances of na:
str = "Apples and bananas."
pattern = /ba(na){2}/i
printline((str matches pattern))
Output
true
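This section does not show how MaSH exposes the parts of a match selected by parentheses, so the sketch below only illustrates the general idea using Python's `re` module as a stand-in. The date pattern and sample strings are hypothetical.

```python
import re

text = "Apples and bananas."

# (na){2} applies the quantifier to the whole group "na", not just to "a"
match = re.search(r"ba(na){2}", text, re.IGNORECASE)
print(match.group(0))  # banana

# Without the parentheses, {2} would apply only to the final "a"
no_group = re.search(r"bana{2}", text, re.IGNORECASE)
print(no_group is not None)  # False: "banaa" does not occur

# Each group also records the text it matched
date = re.search(r"(\d{4})-(\d{2})", "Released 2021-03")
print(date.group(1))  # 2021
print(date.group(2))  # 03
```

Comparing the first two searches shows why the grouping matters: the quantifier always applies to the single item immediately before it, and parentheses turn a longer sub-pattern into that single item.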