Match Text Patterns with REGEX

Hello and welcome to an exciting dive into the world of Regular Expressions (regex) in Excel! Today, we’re going to explore three powerful new functions in Excel: REGEXTEST, REGEXEXTRACT, and REGEXREPLACE. If you’re not familiar with regex, don’t worry—we’ll start from the basics and work our way up.

Note: these are new functions and are not available in all Excel versions.

Video

What is Regex?

Regex, short for Regular Expressions, is a tool used in many programming languages to identify patterns within text. Recently, Excel introduced three new regex functions that greatly enhance its text processing capabilities.

Please note that regex is a very comprehensive system, and it is far too big to cover in this single post. So, my goal is to demonstrate the new Excel functions with some illustrations that introduce regex and some basic patterns. If these capabilities look like they may be helpful in your workbooks, you’ll definitely want to dig deeper into the pattern syntax.

Function 1: REGEXTEST

The REGEXTEST function helps you determine if a specific pattern exists within a text string. It returns TRUE if the pattern is found and FALSE if it is not.

Syntax:

=REGEXTEST(text, pattern)

Example:

Let’s say we have cells in column B, and we’d like to see if they contain any number:

We can write a formula into column C that uses a simple pattern to search for any numeric digit within a cell. For example, we can write the following formula in C10:

=REGEXTEST(B10, "[0-9]")

We can fill this down, and it checks the cell to the left. If there are any digits in the text, it returns TRUE accordingly:

You can also use shorthand notation for digits with \d.

=REGEXTEST(A2, "\d")

Function 2: REGEXEXTRACT

The REGEXEXTRACT function extracts text that matches a specified pattern from within a cell.

Syntax:

=REGEXEXTRACT(text, pattern)

Example:

Let’s say we have some names and addresses combined in a column:

Suppose we want to extract a full name in the format “Last, First”. We could use FlashFill if this were a one-time project. But, if we prefer a formula-based approach for workbooks we update frequently, we can use regex. Prior to regex, we had access to text functions such as LEFT, RIGHT, MID, FIND, LEN, and so on. But now we have access to regex functions. So, we’ll use REGEXEXTRACT here.

  1. Pattern Explanation: To match a series of word characters followed by a comma, a space, and another series of word characters, use \w+, \w+ enclosed in quotes.
  2. Formula:
    =REGEXEXTRACT(B11,"\w+, \w+")

We can write the formula into C11 and fill it down:

Let’s say you want to extract street numbers.

  1. Pattern Explanation: For one or more digits in a row, use \d+.
  2. Formula:
    =REGEXEXTRACT(A2, "\d+")

We write that into D11 and fill it down:

To extract a ZIP code (assuming it’s always 5 digits), you can refine the pattern.

  1. Pattern Explanation: For a 5-digit number at the end of a string, use \d{5}$. {5} means a sequence of exactly five numeric digits, and $ means from the end of the text.
  2. Formula:
    =REGEXEXTRACT(A2, "\d{5}$")

We write it into E11 and fill it down:

For extracting state abbreviations (assuming they are always two uppercase letters):

  1. Pattern Explanation: For two uppercase letters, use [A-Z]{2}.
  2. Formula:
    =REGEXEXTRACT(A2, "[A-Z]{2}")

We write it into F11 and fill it down:

Function 3: REGEXREPLACE

The REGEXREPLACE function allows you to replace text matching a specific pattern with a new text string.

Syntax:

=REGEXREPLACE(text, pattern, replacement)

Example:

Let’s say you want to replace all but the last 4 credit card digits with ****-****-****.

  1. Pattern Explanation: For the pattern, use \d+-\d+-\d+.
  2. Formula:
    =REGEXREPLACE(A2, "\d+-\d+-\d+", "****-****-****")

Write the formula into C9 and fill it down:

Conclusion

These new regex functions in Excel—REGEXTEST, REGEXEXTRACT, and REGEXREPLACE—open up a world of possibilities for text processing and pattern matching. While this post provides a brief introduction, there are many more patterns and applications you can explore.

Plus, these capabilities are said to be making their way into XLOOKUP, so, imagine performing lookups with the power of regex!

Also, ChatGPT can actually help you write regex syntax, plus there are scores of websites that can help you dig deeper into regex.

Sample File

FAQ

Q. What is the difference between REGEXTEST and REGEXEXTRACT?

REGEXTEST checks if a pattern exists in the text and returns TRUE or FALSE. REGEXEXTRACT retrieves the actual text that matches the pattern.

Q. Can I use regex in older versions of Excel?

Unfortunately no, since new functions are not retroactively installed into legacy versions of Excel.

Q. Can I use these functions with dynamic arrays?

Yes, these functions work with dynamic arrays, allowing for more complex data processing.

Posted in ,

Jeff Lenning

I love sharing the things I've learned about Excel, and I built Excel University to help me do that. My motto is: Learn Excel. Work Faster.

Excel is not what it used to be.

You need the Excel Proficiency Roadmap now. Includes 6 steps for a successful journey, 3 things to avoid, and weekly Excel tips.

Want to learn Excel?

Our training programs start at $29 and will help you learn Excel quickly.

Leave a Comment