Regexes for Exchange SMTP logs

Thuan
Explorer

Greetings,

The sample logs are listed below
2014-06-18T02:25:16.879Z,TSEAET01\NEW - Internet receive connector TSEAET01,08D1456B7AFF9BDF,22,147.81.121.139:25,147.81.122.24:61707,,"CN=Entrust Certification Authority - L1C, OU=""(c) 2009 Entrust, Inc."", OU=www.entrust.net/rpa is incorporated by reference, O=""Entrust, Inc."", C=US",Certificate issuer name
2014-06-18T02:25:16.879Z,TSEAET01\NEW - Internet receive connector TSEAET01,08D1456B7AFF9BDF,23,147.81.121.139:25,147.81.122.24:61707,,4C1B9021,Certificate serial number
2014-06-18T02:25:16.879Z,TSEAET01\NEW - Internet receive connector TSEAET01,08D1456B7AFF9BDF,24,147.81.121.139:25,147.81.122.24:61707,,27A7B6AAACBE39610C3A148D60EF4F5F2BE60FB0,Certificate thumbprint
2014-06-18T02:25:16.879Z,TSEAET01\NEW - Internet receive connector TSEAET01,08D1456B7AFF9BDF,25,147.81.121.139:25,147.81.122.24:61707,,TSEAET01.tascnet.tasc.com;Mail1.tasc.com;Mail.tasc.com;Mail.tascnet.tasc.com,Certificate alternate names
2014-06-18T02:25:16.910Z,TSEAET01\NEW - Internet receive connector TSEAET01,08D1456B7AFF9BDF,26,147.81.121.139:25,147.81.122.24:61707,*,,Received certificate

The field headers are

Fields: date-time,connector-id,session-id,sequence-number,local-endpoint,remote-endpoint,event,data,context

My (non-working) regexes are as follows

(?[^,]+),(?[^,]+),(?[^,]+),(?[^,]+),(?[^,]+),(?[^,]+),(?[^,]+),(?(?=")(?.+),)|(?(?=,)(?,),)|(?(?=.)(?[^,]),),(?.+)\r\n

I have trouble parsing the field named "data", which can take any one of the following forms
1) "CN=Entrust Certification Authority - L1C, OU=""(c) 2009 Entrust, Inc."", OU=www.entrust.net/rpa is incorporated by reference, O=""Entrust, Inc."", C=US"
2) 4C1B9021
3) ,

It appears that the time stamp is processed correctly, however.
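
For readers who want to experiment outside Splunk, the three forms of "data" can be covered by a single alternation: a double-quoted value (with "" as an escaped quote) or an unquoted, possibly empty, run without commas. The following is only a sketch in Python, not a tested props.conf extraction; the group names are taken from the field headers above with hyphens changed to underscores, and the names DATA, LINE, SAMPLES and parse_line are hypothetical.

```python
import re

# Assumed pattern for the "data" column: a double-quoted value in which ""
# stands for a literal quote, or an unquoted (possibly empty) run without commas.
DATA = r'(?P<data>"(?:[^"]|"")*"|[^,]*)'

# The first six columns are assumed to be non-empty; "event" may be empty.
LINE = re.compile(
    r'^(?P<date_time>[^,]+),(?P<connector_id>[^,]+),(?P<session_id>[^,]+),'
    r'(?P<sequence_number>[^,]+),(?P<local_endpoint>[^,]+),'
    r'(?P<remote_endpoint>[^,]+),(?P<event>[^,]*),' + DATA + r',(?P<context>.*)$'
)

# Three of the sample events from this post, one per form of "data".
SAMPLES = [
    r'2014-06-18T02:25:16.879Z,TSEAET01\NEW - Internet receive connector TSEAET01,08D1456B7AFF9BDF,22,147.81.121.139:25,147.81.122.24:61707,,"CN=Entrust Certification Authority - L1C, OU=""(c) 2009 Entrust, Inc."", OU=www.entrust.net/rpa is incorporated by reference, O=""Entrust, Inc."", C=US",Certificate issuer name',
    r'2014-06-18T02:25:16.879Z,TSEAET01\NEW - Internet receive connector TSEAET01,08D1456B7AFF9BDF,23,147.81.121.139:25,147.81.122.24:61707,,4C1B9021,Certificate serial number',
    r'2014-06-18T02:25:16.910Z,TSEAET01\NEW - Internet receive connector TSEAET01,08D1456B7AFF9BDF,26,147.81.121.139:25,147.81.122.24:61707,*,,Received certificate',
]

def parse_line(line):
    """Return a dict of the nine fields, or None if the line does not match."""
    match = LINE.match(line)
    return match.groupdict() if match else None

for sample in SAMPLES:
    fields = parse_line(sample)
    print(repr(fields["data"]), "|", fields["context"])
```

The quoted alternative is listed first so that commas inside the quotes are not treated as field separators. The surrounding quotes stay in the captured value.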


Thuan
Explorer

Sorry, I copied the wrong regex in the previous answer.
The correct and working regex in props.conf is listed below

[xchange_smtp]
NO_BINARY_CHECK = 1
pulldown_type = 1
BREAK_ONLY_BEFORE_DATE = false
TIME_PREFIX = ^
TIME_FORMAT = %Y-%m-%dT%H:%M:%S.%N%Z
EXTRACT-xchange_smtp =,(?[^,]+),(?[^,]+),(?[^,]+),(?[^,]+),(?[^,]+),(?[^,]),(?".+"|[^,]|,),(?.*)

Thuan
Explorer

The working regex in props.conf is listed below

[xchange_agent]
NO_BINARY_CHECK = 1
pulldown_type = 1
BREAK_ONLY_BEFORE_DATE = false
TIME_PREFIX = ^
TIME_FORMAT = %Y-%m-%dT%H:%M:%S.%N%Z
EXTRACT-xchange_agent =,(?P[^,]),(?P[^,]+),(?P[^,]+),(?P[^,]+),(?P,|[^,]),(?P[^,]),(?P<21FromAddresses>[^,]),(?P[^,]),(?P[^,]),(?P[^,]),(?P[^,]),(?P[^,]),(?P[^,]),(?P[^,]),(?P[^,]),(?P.*)


somesoni2
Revered Legend

What values of data does the regex return? It does seem to work with your sample data for me.


Thuan
Explorer

Thank you for the prompt replies.

A) The suggested regex does not work. It cannot parse the "data" field in the forms listed below
1) "CN=Entrust Certification Authority - L1C, OU=""(c) 2009 Entrust, Inc."", OU=www.entrust.net/rpa is incorporated by reference, O=""Entrust, Inc."", C=US"
2) 4C1B9021
3) ,

B) The blog post is about the IIS format, not the MS Exchange SMTP log format. In fact, I have tried to parse a sample file using the existing "IIS" source type. That attempt fails too.

Looking forward to a working solution.

somesoni2
Revered Legend

Try this

^(?<date_time>[^,]+),(?<connector_id>[^,]+),(?<session_id>[^,]+),(?<sequence_number>[^,]+),(?<local_endpoint>[^,]+),(?<remote_endpoint>[^,]+),(?<event>[^,]*),(?P<data>.*),(?P<context>.*)$
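
A quick way to check this regex outside Splunk is to run it against one of the sample events. The sketch below is in Python, whose re module uses the (?P<name>...) spelling for named groups, so every group is written that way; the names SUGGESTED, line and fields are hypothetical.

```python
import re

# The suggested regex, with every named group in Python's (?P<name>...) form.
SUGGESTED = re.compile(
    r'^(?P<date_time>[^,]+),(?P<connector_id>[^,]+),(?P<session_id>[^,]+),'
    r'(?P<sequence_number>[^,]+),(?P<local_endpoint>[^,]+),'
    r'(?P<remote_endpoint>[^,]+),(?P<event>[^,]*),(?P<data>.*),(?P<context>.*)$'
)

# Sample event with sequence number 22 from the original post.
line = r'2014-06-18T02:25:16.879Z,TSEAET01\NEW - Internet receive connector TSEAET01,08D1456B7AFF9BDF,22,147.81.121.139:25,147.81.122.24:61707,,"CN=Entrust Certification Authority - L1C, OU=""(c) 2009 Entrust, Inc."", OU=www.entrust.net/rpa is incorporated by reference, O=""Entrust, Inc."", C=US",Certificate issuer name'

match = SUGGESTED.match(line)
fields = match.groupdict() if match else {}

# The greedy .* in "data" runs up to the last comma on the line,
# so everything after that comma lands in "context".
print(fields.get("data"))
print(fields.get("context"))
```

Because "data" is matched greedily, the split between "data" and "context" always falls on the last comma of the event. That suits the samples in this thread, but it would misplace the boundary if a context value ever contained a comma.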

ahall_splunk
Splunk Employee

Is there some reason you are not using INDEXED_EXTRACTIONS?

Here is a blog post about the feature:
http://blogs.splunk.com/2013/10/18/iis-logs-and-splunk-6/

It deals with IIS logs, but the same principle can be used.
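
To illustrate why structured, CSV-aware parsing copes with the quoted "data" field where a hand-written regex struggles, here is a small Python sketch using the standard csv module. It is not how Splunk implements INDEXED_EXTRACTIONS; it only shows the idea of treating each event as a delimited record with known headers. The names FIELDS, SAMPLE_TEXT and parse_events are hypothetical.

```python
import csv
import io

# Field headers as given in the original post.
FIELDS = [
    "date-time", "connector-id", "session-id", "sequence-number",
    "local-endpoint", "remote-endpoint", "event", "data", "context",
]

# Two of the sample events from the original post.
SAMPLE_TEXT = "\n".join([
    r'2014-06-18T02:25:16.879Z,TSEAET01\NEW - Internet receive connector TSEAET01,08D1456B7AFF9BDF,22,147.81.121.139:25,147.81.122.24:61707,,"CN=Entrust Certification Authority - L1C, OU=""(c) 2009 Entrust, Inc."", OU=www.entrust.net/rpa is incorporated by reference, O=""Entrust, Inc."", C=US",Certificate issuer name',
    r'2014-06-18T02:25:16.910Z,TSEAET01\NEW - Internet receive connector TSEAET01,08D1456B7AFF9BDF,26,147.81.121.139:25,147.81.122.24:61707,*,,Received certificate',
])

def parse_events(text):
    """Parse comma-delimited events into dicts keyed by the field headers."""
    reader = csv.reader(io.StringIO(text))  # default dialect: " quoting, "" escapes
    return [dict(zip(FIELDS, row)) for row in reader if row]

events = parse_events(SAMPLE_TEXT)
for event in events:
    print(event["sequence-number"], "|", event["data"], "|", event["context"])
```

The CSV reader removes the surrounding quotes and turns each "" back into a single quote, so commas inside the certificate issuer name never split the field.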
