John Dalesandro

How to Use Regular Expressions in Excel for Parsing Substrings with VBA

Parsing a substring from a cell value in Excel can be challenging, especially when the value or pattern is unclear. It often requires complex combinations of functions like SEARCH, FIND, LEFT, RIGHT, and MID. The result is often hard to read and may not work for all cell values, leading to manual cleanup. It would be much easier if SEARCH or FIND supported regular expressions by default.

In this article, we’ll enable Microsoft VBScript Regular Expressions and create a custom VBA function to use in worksheet formulas for matching regular expression patterns and returning the matched substring.

NOTE: Microsoft 365 Excel includes built-in regular expression functions, but since I’m using Microsoft Office Professional Plus 2019, which lacks these functions, I’m providing the VBA function below as an alternative solution. For reference, the built-in functions are:

  • REGEXTEST extracts text from a string based on a regular expression. You can extract the first match, all matches, or capturing groups from the first match.
  • REGEXEXTRACT extracts parts of the supplied text that match a regular expression.
  • REGEXREPLACE searches for a regular expression pattern in the supplied text and replaces it with different text.

Instructions

Step 1: Enable Microsoft VBScript Regular Expressions

To use regular expressions for parsing a cell value, enable Microsoft VBScript Regular Expressions 5.5.

  1. From the Developer ribbon in Excel, click Visual Basic.
  2. The Microsoft Visual Basic for Applications window is displayed.
  3. Click the Tools menu followed by References….
  4. The References dialog box is displayed.
  5. Enable the Microsoft VBScript Regular Expressions 5.5 checkbox.
  6. Click the OK button.
Enable Microsoft VBScript Regular Expressions
Enable Microsoft VBScript Regular Expressions

Step 2: Create VBA Function

Once the regular expression reference is enabled, add a new module to the workbook (if one doesn’t already exist). The code won’t work if added directly to a worksheet object.

  1. Return to the Microsoft Visual Basic for Applications window.
  2. Click the Insert menu followed by Module.
  3. Add the following code to the module.
  4. Close the Microsoft Visual Basic for Applications window.
Insert VBA Module and Create Custom VBA RegExParse Function
Insert VBA Module and Create Custom VBA RegExParse Function
Option Explicit

Public Function RegExParse(val As String, SearchPattern As String) As String
  On Error GoTo errorHandler
  
  Dim regEx As New RegExp
  
  With regEx
    .Global = True
    .MultiLine = True
    .IgnoreCase = True
    .Pattern = SearchPattern
  End With
  
  If regEx.Test(val) Then
    RegExParse = regEx.Execute(val)(0)
  Else
    RegExParse = CVErr(xlErrNA)
  End If
  
errorHandler:
  If Err.Number <> 0 Then
    RegExParse = CVErr(xlErrNA)
  End If
End Function

Results

Returning to Excel, you can then use the new function in formulas, such as =RegExParse(A1, "ID[0-9]+"). This formula will return a substring from cell A1 that matches the pattern ID[0-9]+, e.g., ID0, ID1234, ID183274917234. An example is also provided for extracting more complex patterns, such as US telephone numbers.

Excel: Example Formula Usage of Custom VBA RegExParse Function
Excel: Example Formula Usage of Custom VBA RegExParse Function
Excel: Result Displaying Substring from Matched Regular Expression Pattern
Excel: Result Displaying Substring from Matched Regular Expression Pattern

Summary

Using regular expressions in Excel simplifies parsing substrings, eliminating the need for complex formulas. By enabling Microsoft VBScript Regular Expressions, you can easily extract specific patterns from cell values. This approach streamlines the process and reduces the need for manual cleanup.