Excel VBA: Automate E-mail Content Extraction from Outlook
In this scenario, Excel VBA is used to extract information from structured e-mails in Outlook using both text-based and HTML table-based methods. By automating the extraction process with VBA, data quality and accuracy are improved, and it is much faster than manually copying and pasting from each e-mail.
Text-based extraction uses the Replace
and Split
functions, while HTML table-based data is extracted using HTML object library functions.
Structured E-mail Template
Automation depends on consistent, structured data. Each e-mail has the subject Template Email Subject
and the body follows this format: The account for <Full Name> has been created.
, where <Full Name>
represents the name associated with the created account. The e-mail body also includes a structured table with user details like Last Name
, First Name
, Company
, Department
, and Job Title
.
The macro extracts the Full Name
from the e-mail body and adds it to an Excel worksheet. It then extracts additional user details from the structured table and adds this information to the Excel worksheet.
Outlook Configuration
In Outlook, create a sub-folder named Automation
within the default Inbox
to collect relevant e-mails. You can use an Outlook rule to automatically move these e-mails into the sub-folder. Keeping the e-mails in a separate folder helps reduce the VBA search effort. In this example, the sub-folder name Automation
and its location in the default inbox are hard-coded. If the folder is created elsewhere, you will need to adjust the code accordingly.
Enable Libraries
To use this function, you must enable the Microsoft Outlook 16.0 Object Library
(used to access and manipulate Outlook) and Microsoft HTML Object Library
(used to traverse structured HTML). The version number varies based on the installed version of Microsoft Excel.
- From the
Developer
ribbon in Microsoft Excel, clickVisual Basic
. - Once the
Visual Basic for Applications
window opens, go to theTools
menu and selectReferences…
. - From the
References
dialog box, check/enableMicrosoft Outlook 16.0 Object Library
. - From the
References
dialog box, check/enableMicrosoft HTML Object Library
. - Click the
OK
button.
Add Source Code
The VBA code is added directly to ThisWorkbook
within the Visual Basic for Applications
window.
- From the
Developer
ribbon in Microsoft Excel, clickVisual Basic
. - Once the
Visual Basic for Applications
window opens, selectThisWorkbook
from theProject Explorer
pane. - Add the following code.
Option Explicit
Public Sub ExtractDetailsFromOutlookFolderEmails()
On Error GoTo errorHandler
Dim olApp As Outlook.Application
Dim olNameSpace As Namespace
Dim olFolder As Folder
Dim olItem As Object
Dim olItemBody As String
Dim olHTML As MSHTML.HTMLDocument
Dim olElementCollection As MSHTML.IHTMLElementCollection
Dim htmlElementTable As IHTMLElement
Dim iTableRow As Integer
Dim iTableColumn As Integer
Dim wksList As Worksheet
Dim iRow As Integer
Dim iColumn As Integer
Dim strFullName As String
Set wksList = ThisWorkbook.Sheets("Sheet1")
Set olApp = CreateObject("Outlook.Application")
Set olNameSpace = olApp.GetNamespace("MAPI")
Set olFolder = olNameSpace.GetDefaultFolder(olFolderInbox).Folders("Automation")
iRow = 1
For Each olItem In olFolder.Items
If InStr(olItem.Subject, "Template Email Subject") > 0 Then
iColumn = 1
olItemBody = Replace(olItem.body, "The account for", "####################")
olItemBody = Replace(olItemBody, "has been created.", "####################")
strFullName = Trim(Replace(Replace(Replace(Split(olItemBody, "####################")(1), Chr(10), ""), Chr(12), ""), Chr(13), ""))
wksList.Cells(iRow, iColumn) = strFullName
Set olHTML = New MSHTML.HTMLDocument
olHTML.body.innerHTML = olItem.HTMLBody
Set olElementCollection = olHTML.getElementsByTagName("table")
For Each htmlElementTable In olElementCollection
For iTableRow = 0 To htmlElementTable.Rows.Length - 1
For iTableColumn = 0 To htmlElementTable.Rows(iTableRow).Cells.Length - 1
On Error Resume Next
If iTableColumn Mod 2 = 1 Then
iColumn = iColumn + 1
wksList.Cells(iRow, iColumn).Value = htmlElementTable.Rows(iTableRow).Cells(iTableColumn).innerText
End If
On Error GoTo errorHandler
Next iTableColumn
Next iTableRow
Next htmlElementTable
iRow = iRow + 1
End If
Next olItem
errorHandler:
If Err.Number <> 0 Then
MsgBox "Error extracting details. [ExtractDetailsFromOutlookFolderEmails]"
End If
End Sub
Code Description
This code has been tested with Microsoft Office Professional Plus 2019 using the described e-mail template and Outlook configuration. It includes hard-coded values that must be updated for different e-mail templates and configurations, such as the Excel worksheet name (Sheet1
), and the Outlook sub-folder name (Automation
) and location.
The code connects to Outlook and accesses a sub-folder named Automation
in the default inbox, which holds all the structured e-mails.
It then loops through each e-mail in the sub-folder, checking for those with the subject containing Template Email Subject
.
For text-based information extraction, the code replaces specific phrases (The account for
and has been created.
) in the e-mail body with ####################
. This common delimiter makes it easier to extract the text between them. For example, the modified body would look like this:
############## <Full Name> ##############
The code uses the Split
function to extract the text between these delimiters and removes any extra whitespace, line feeds, form feeds, and carriage returns using the Trim
and Replace
functions. The cleaned text is then written to Column A
of Sheet1
.
For HTML table-based extraction, the code references the HTML table
tag in the e-mail body using getElementsByTagName
. It loops through the table rows and columns to collect data from the second column, ignoring the first column which contains headers.
Execute Procedure
- From the
Developer
ribbon in Microsoft Excel, clickMacros
. - The
Macro
dialog box opens. SelectExtractDetailsFromOutlookFolderEmails
- Click the
Run
button to execute the procedure.
Result
After running the procedure, the worksheet is filled with information from both the text-based and table-based methods. This example tested with only one e-mail, but if there were multiple e-mails in the Outlook sub-folder, the sheet would have an additional row for each e-mail.
Summary
This scenario demonstrates how VBA can automate data extraction from structured e-mails in Outlook to an Excel worksheet, enhancing data quality and processing speed compared to manual methods. The process involves two extraction techniques: a text-based method using string manipulation functions and a structured table-based extraction using HTML object library functions.