The regex formula!
The regex formula in Excel’s REGEXTRACT function is:
REGEXTRACT(text, regex_pattern, [group_num])
Here’s a breakdown:
- text: The text string you want to extract from.
- regex_pattern: The regular expression pattern to match.
- [group_num]: Optional. Specifies which capture group to extract (defaults to 0, which extracts the entire match).
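To make the three arguments concrete, here is a minimal Python sketch that mimics the call shape described above. The helper name regextract is hypothetical, it is built on Python's re module rather than on Excel, and returning None when nothing matches is an assumption (the description above does not say what happens in that case).

```python
import re

def regextract(text, regex_pattern, group_num=0):
    """Hypothetical stand-in for REGEXTRACT(text, regex_pattern, [group_num])."""
    match = re.search(regex_pattern, text)  # find the first match anywhere in text
    if match is None:
        return None  # assumption: no-match behaviour is not specified in the article
    return match.group(group_num)  # group 0 is the entire match

order = "Order 4521 shipped"
default_result = regextract(order, r"[0-9]+")      # group_num omitted
explicit_zero = regextract(order, r"[0-9]+", 0)    # same as omitting group_num
no_match = regextract("no digits here", r"[0-9]+")
```

With group_num left out, the helper returns the whole matched text, which mirrors the default of 0 described above.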
The regex_pattern is where the magic happens! It’s a string that defines the pattern to match, using special characters and syntax. Some common elements include:
- . (dot): Matches any single character.
- * (star): Matches zero or more occurrences of the preceding character or group.
- + (plus): Matches one or more occurrences of the preceding character or group.
- ? (question mark): Matches zero or one occurrence of the preceding character or group.
- {n,m} (curly braces): Matches at least n but no more than m occurrences of the preceding character or group.
- [...] (square brackets): Defines a character class (e.g., [a-zA-Z0-9] matches any letter or digit).
- ^ (caret): Matches the start of the string.
- $ (dollar sign): Matches the end of the string.
- | (pipe): Logical OR operator.
- ( and ) (parentheses): Groups characters or patterns together.
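The elements above can be tried out one at a time. The sketch below uses Python's re module as a stand-in, on the assumption that these basic elements behave the same way in the regex engine behind the Excel function. The helper first_match and the sample strings are made up for illustration.

```python
import re

def first_match(pattern, text):
    """Return the first substring of text that matches pattern, or None."""
    m = re.search(pattern, text)
    return m.group(0) if m else None

dot      = first_match(r"c.t", "cut the cat")               # any single character
star     = first_match(r"go*gle", "ggle gogle google")      # zero or more "o"
plus     = first_match(r"go+gle", "ggle gogle google")      # one or more "o"
optional = first_match(r"colou?r", "color and colour")      # zero or one "u"
braces   = first_match(r"[0-9]{2,3}", "7 42 12345")         # 2 to 3 digits
char_cls = first_match(r"[a-zA-Z0-9]+", "--abc123--")       # letters or digits
at_start = first_match(r"^Hello", "Hello world")            # only at the start
not_start = first_match(r"^Hello", "Say Hello")             # None: not at the start
at_end   = first_match(r"world$", "Hello world")            # only at the end
either   = first_match(r"cat|dog", "hot dog")               # "cat" OR "dog"
grouped  = first_match(r"(ab)+", "xababy")                  # repeat the group "ab"
```

Each call returns the leftmost match, so star picks up "ggle" (zero occurrences of "o") while plus skips ahead to "gogle".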
The group_num argument specifies which capture group to extract. Capture groups are the parts of the pattern enclosed in parentheses, ( and ). By default (if group_num is omitted), the entire match is extracted.
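As an illustration of how group numbers map to parentheses, the following Python sketch pulls the parts out of a date. It relies on Python's re module, where groups are numbered from 1 in the order their opening parentheses appear and 0 is the entire match. That the Excel function numbers groups the same way is an assumption, and the sample text is made up.

```python
import re

text = "Released on 2024-03-15"
# Three capture groups: year, month, day
pattern = r"([0-9]{4})-([0-9]{2})-([0-9]{2})"

match = re.search(pattern, text)
entire = match.group(0)  # the whole match
year = match.group(1)    # first pair of parentheses
month = match.group(2)   # second pair
day = match.group(3)     # third pair
```

Here group 0 gives the full date, while groups 1 to 3 give the year, month, and day separately.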
Remember: regex_pattern defines what to search for in the text string, and the group_num argument determines which part of the match to extract.