The regex formula!

The syntax of Excel’s REGEXEXTRACT function is:

REGEXEXTRACT(text, pattern, [return_mode], [case_sensitivity])

Here’s a breakdown:

  • text: The text string (or a reference to the cell containing it) you want to extract from.
  • pattern: The regular expression pattern to match.
  • [return_mode]: Optional. Controls what is returned: 0 (the default) returns the first match, 1 returns all matches as an array, and 2 returns the capture groups of the first match as an array.
  • [case_sensitivity]: Optional. 0 (the default) makes the match case-sensitive; 1 makes it case-insensitive.

The pattern is where the magic happens! It’s a string that defines what to match, using special characters and syntax. Some common elements include:

  1. . (dot): Matches any single character (typically anything except a line break).
  2. * (star): Matches zero or more occurrences of the preceding character or group.
  3. + (plus): Matches one or more occurrences of the preceding character or group.
  4. ? (question mark): Matches zero or one occurrence of the preceding character or group.
  5. {n,m} (curly braces): Matches at least n but no more than m occurrences of the preceding character or group.
  6. [...] (square brackets): Defines a character class (e.g., [a-zA-Z0-9] matches any single ASCII letter or digit).
  7. ^ (caret): Matches the start of the string.
  8. $ (dollar sign): Matches the end of the string.
  9. | (pipe): Alternation; matches the expression on either side (a logical OR).
  10. ( and ) (parentheses): Groups part of a pattern together and captures the text it matches.

Capture groups are the parts of the pattern enclosed in parentheses. By default (if return_mode is omitted), the function returns the first full match; set return_mode to 2 to get the capture groups of the first match instead.

Remember, the pattern defines what to search for in the text string, and the return_mode argument determines which part of the result is returned.
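
The pattern elements and capture groups described above are easy to try outside a spreadsheet. Here is a minimal sketch using Python’s built-in re module rather than Excel, so treat it as an illustration of the pattern syntax only. The sample text and variable names are made up for this example, and Excel’s regex engine may differ from Python’s in some details.

```python
import re

# Hypothetical sample text, invented for this illustration.
text = "Order A-1042 shipped on 2024-03-15"

# Character classes, + (one or more), and two capture groups in parentheses.
order = re.search(r"([A-Z])-([0-9]+)", text)
whole_match = order.group(0)  # the entire match
letter = order.group(1)       # first capture group
number = order.group(2)       # second capture group

# {n} repeats the preceding character class exactly n times.
date = re.search(r"[0-9]{4}-[0-9]{2}-[0-9]{2}", text).group(0)

# . matches any single character (here, the hyphen).
dot_match = re.search(r"A.1", text).group(0)

# ^ anchors to the start of the string, $ to the end.
starts_with_order = re.search(r"^Order", text) is not None
ends_with_digit = re.search(r"[0-9]$", text) is not None

# | is alternation; ? makes the preceding character optional.
status = re.search(r"shipped|delivered", text).group(0)
spellings = re.findall(r"colou?r", "color or colour")

print(whole_match, letter, number)
print(date, dot_match, status, spellings)
```

In Python, group 0 is always the entire match and groups 1, 2, and so on are the parenthesized parts, counted from left to right by their opening parenthesis.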

If you would like to know more about regex patterns, click