How to Use Regular Expressions in Excel for Parsing Substrings with VBA
Parsing a substring from a cell value in Excel can be challenging, especially when the value or pattern is unclear. It often requires complex combinations of functions like SEARCH
, FIND
, LEFT
, RIGHT
, and MID
. The result is often hard to read and may not work for all cell values, leading to manual cleanup. It would be much easier if SEARCH
or FIND
supported regular expressions by default.
In this article, we’ll enable Microsoft VBScript Regular Expressions and create a custom VBA function to use in worksheet formulas for matching regular expression patterns and returning the matched substring.
NOTE: Microsoft 365 Excel includes built-in regular expression functions, but since I’m using Microsoft Office Professional Plus 2019, which lacks these functions, I’m providing the VBA function below as an alternative solution. For reference, the built-in functions are:
REGEXTEST
extracts text from a string based on a regular expression. You can extract the first match, all matches, or capturing groups from the first match.REGEXEXTRACT
extracts parts of the supplied text that match a regular expression.REGEXREPLACE
searches for a regular expression pattern in the supplied text and replaces it with different text.
Instructions
Step 1: Enable Microsoft VBScript Regular Expressions
To use regular expressions for parsing a cell value, enable Microsoft VBScript Regular Expressions 5.5.
- From the
Developer
ribbon in Excel, clickVisual Basic
. - The
Microsoft Visual Basic for Applications
window is displayed. - Click the
Tools
menu followed byReferences…
. - The
References
dialog box is displayed. - Enable the
Microsoft VBScript Regular Expressions 5.5
checkbox. - Click the
OK
button.
data:image/s3,"s3://crabby-images/0ff4a/0ff4a9377dce2f9a8ccdd8b7f7f550465fa1b73a" alt="Enable Microsoft VBScript Regular Expressions"
Step 2: Create VBA Function
Once the regular expression reference is enabled, add a new module to the workbook (if one doesn’t already exist). The code won’t work if added directly to a worksheet object.
- Return to the
Microsoft Visual Basic for Applications
window. - Click the
Insert
menu followed byModule
. - Add the following code to the module.
- Close the
Microsoft Visual Basic for Applications
window.
data:image/s3,"s3://crabby-images/7f5ff/7f5ff9f704144d5b63317307530d02ac6fbf3b1e" alt="Insert VBA Module and Create Custom VBA RegExParse Function"
Option Explicit
Public Function RegExParse(val As String, SearchPattern As String) As String
On Error GoTo errorHandler
Dim regEx As New RegExp
With regEx
.Global = True
.MultiLine = True
.IgnoreCase = True
.Pattern = SearchPattern
End With
If regEx.Test(val) Then
RegExParse = regEx.Execute(val)(0)
Else
RegExParse = CVErr(xlErrNA)
End If
errorHandler:
If Err.Number <> 0 Then
RegExParse = CVErr(xlErrNA)
End If
End Function
Results
Returning to Excel, you can then use the new function in formulas, such as =RegExParse(A1, "ID[0-9]+")
. This formula will return a substring from cell A1 that matches the pattern ID[0-9]+
, e.g., ID0
, ID1234
, ID183274917234
. An example is also provided for extracting more complex patterns, such as US telephone numbers.
data:image/s3,"s3://crabby-images/79cbe/79cbe0822392650941469aa521ffc763a97d2215" alt="Excel: Example Formula Usage of Custom VBA RegExParse Function"
data:image/s3,"s3://crabby-images/7c5e3/7c5e319b584f882d8bda01ad11c111aa27eb9295" alt="Excel: Result Displaying Substring from Matched Regular Expression Pattern"
Summary
Using regular expressions in Excel simplifies parsing substrings, eliminating the need for complex formulas. By enabling Microsoft VBScript Regular Expressions, you can easily extract specific patterns from cell values. This approach streamlines the process and reduces the need for manual cleanup.