Regexes for Exchange SMTP logs

Thuan
Explorer

Greetings,

The sample logs are listed below
2014-06-18T02:25:16.879Z,TSEAET01\NEW - Internet receive connector TSEAET01,08D1456B7AFF9BDF,22,147.81.121.139:25,147.81.122.24:61707,,"CN=Entrust Certification Authority - L1C, OU=""(c) 2009 Entrust, Inc."", OU=www.entrust.net/rpa is incorporated by reference, O=""Entrust, Inc."", C=US",Certificate issuer name
2014-06-18T02:25:16.879Z,TSEAET01\NEW - Internet receive connector TSEAET01,08D1456B7AFF9BDF,23,147.81.121.139:25,147.81.122.24:61707,,4C1B9021,Certificate serial number
2014-06-18T02:25:16.879Z,TSEAET01\NEW - Internet receive connector TSEAET01,08D1456B7AFF9BDF,24,147.81.121.139:25,147.81.122.24:61707,,27A7B6AAACBE39610C3A148D60EF4F5F2BE60FB0,Certificate thumbprint
2014-06-18T02:25:16.879Z,TSEAET01\NEW - Internet receive connector TSEAET01,08D1456B7AFF9BDF,25,147.81.121.139:25,147.81.122.24:61707,,TSEAET01.tascnet.tasc.com;Mail1.tasc.com;Mail.tasc.com;Mail.tascnet.tasc.com,Certificate alternate names
2014-06-18T02:25:16.910Z,TSEAET01\NEW - Internet receive connector TSEAET01,08D1456B7AFF9BDF,26,147.81.121.139:25,147.81.122.24:61707,*,,Received certificate

The field headers are

Fields: date-time,connector-id,session-id,sequence-number,local-endpoint,remote-endpoint,event,data,context

My (non-working) regexes are as follows

(?[^,]+),(?[^,]+),(?[^,]+),(?[^,]+),(?[^,]+),(?[^,]+),(?[^,]+),(?(?=")(?.+),)|(?(?=,)(?,),)|(?(?=.)(?[^,]),),(?.+)\r\n

I have trouble parsing the field named "data", which can take any one of the following forms
1) "CN=Entrust Certification Authority - L1C, OU=""(c) 2009 Entrust, Inc."", OU=www.entrust.net/rpa is incorporated by reference, O=""Entrust, Inc."", C=US"
2) 4C1B9021
3) ,

It appears that the time stamp is processed correctly, however.
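
The three forms of the "data" field listed above follow ordinary CSV quoting: a quoted value may contain commas and doubled quotes, an unquoted value contains no comma, and an empty value leaves only the delimiter. As a minimal sketch, written in Python rather than as a Splunk extraction, a single alternation can cover all three. The names DATA_FIELD, unquote, QUOTED and FORMS are hypothetical and chosen only for illustration.

```python
import re

# Hypothetical pattern for one CSV field: either a quoted value, in which
# commas are allowed and a literal quote is written as "", or a run of
# non-comma characters (possibly empty).
DATA_FIELD = re.compile(r'"(?:[^"]|"")*"|[^,]*')


def unquote(value):
    """Strip the outer quotes and collapse doubled quotes, if quoted."""
    if len(value) >= 2 and value.startswith('"') and value.endswith('"'):
        return value[1:-1].replace('""', '"')
    return value


# The three forms from the question; form 3 is an empty field.
QUOTED = (
    '"CN=Entrust Certification Authority - L1C, OU=""(c) 2009 Entrust, Inc."", '
    'OU=www.entrust.net/rpa is incorporated by reference, '
    'O=""Entrust, Inc."", C=US"'
)
FORMS = [QUOTED, "4C1B9021", ""]

for form in FORMS:
    # fullmatch succeeds only if the whole value is one well-formed field.
    print(DATA_FIELD.fullmatch(form) is not None, repr(unquote(form)))
```

In a full-line pattern, the same alternation could presumably stand in for the "data" group, followed by a comma and the final field.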

0 Karma

Thuan
Explorer

Sorry, I copied the wrong regex in the previous answer.
The correct and working regex in props.conf is listed below

[xchange_smtp]
NO_BINARY_CHECK = 1
pulldown_type = 1
BREAK_ONLY_BEFORE_DATE = false
TIME_PREFIX = ^
TIME_FORMAT = %Y-%m-%dT%H:%M:%S.%N%Z
EXTRACT-xchange_smtp =,(?[^,]+),(?[^,]+),(?[^,]+),(?[^,]+),(?[^,]+),(?[^,]),(?".+"|[^,]|,),(?.*)

Thuan
Explorer

The working regex in props.conf is listed below

[xchange_agent]
NO_BINARY_CHECK = 1
pulldown_type = 1
BREAK_ONLY_BEFORE_DATE = false
TIME_PREFIX = ^
TIME_FORMAT = %Y-%m-%dT%H:%M:%S.%N%Z
EXTRACT-xchange_agent =,(?P[^,]),(?P[^,]+),(?P[^,]+),(?P[^,]+),(?P,|[^,]),(?P[^,]),(?P<21FromAddresses>[^,]),(?P[^,]),(?P[^,]),(?P[^,]),(?P[^,]),(?P[^,]),(?P[^,]),(?P[^,]),(?P[^,]),(?P.*)

0 Karma

somesoni2
Revered Legend

What values of data is the regex returning? It does seem to work with your sample data for me.

0 Karma

Thuan
Explorer

Thank you for your promptness.

A)
The suggested regex does not work. It cannot parse the "data" field as listed below
1) "CN=Entrust Certification Authority - L1C, OU=""(c) 2009 Entrust, Inc."", OU=www.entrust.net/rpa is incorporated by reference, O=""Entrust, Inc."", C=US"
2) 4C1B9021
3) ,

B)

The blog post is about the IIS format, not the MS Exchange SMTP log format. In fact, I have tried to parse a sample file using the existing "IIS" source type. That attempt fails too.

Looking forward to a working solution.

0 Karma

somesoni2
Revered Legend

Try this

^(?<date_time>[^,]+),(?<connector_id>[^,]+),(?<session_id>[^,]+),(?<sequence_number>[^,]+),(?<local_endpoint>[^,]+),(?<remote_endpoint>[^,]+),(?<event>[^,]*),(?P<data>.*),(?P<context>.*)$
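
One way to check this pattern outside Splunk is to run it over sample lines from the question. The sketch below uses Python's re module, which only accepts the (?P<name>...) spelling, so the (?<name>...) groups are respelled; the pattern is otherwise unchanged. SMTP_LINE, SAMPLES and extract are illustrative names, not anything Splunk defines.

```python
import re

# Same pattern as above, with every named group written as (?P<name>...).
SMTP_LINE = re.compile(
    r'^(?P<date_time>[^,]+),(?P<connector_id>[^,]+),(?P<session_id>[^,]+),'
    r'(?P<sequence_number>[^,]+),(?P<local_endpoint>[^,]+),'
    r'(?P<remote_endpoint>[^,]+),(?P<event>[^,]*),(?P<data>.*),(?P<context>.*)$'
)

# Three sample events from the question (quoted, plain, and empty "data").
# Raw strings keep the backslash in the connector id literal.
SAMPLES = [
    r'2014-06-18T02:25:16.879Z,TSEAET01\NEW - Internet receive connector TSEAET01,'
    r'08D1456B7AFF9BDF,22,147.81.121.139:25,147.81.122.24:61707,,'
    r'"CN=Entrust Certification Authority - L1C, OU=""(c) 2009 Entrust, Inc."", '
    r'OU=www.entrust.net/rpa is incorporated by reference, '
    r'O=""Entrust, Inc."", C=US",Certificate issuer name',
    r'2014-06-18T02:25:16.879Z,TSEAET01\NEW - Internet receive connector TSEAET01,'
    r'08D1456B7AFF9BDF,23,147.81.121.139:25,147.81.122.24:61707,,'
    r'4C1B9021,Certificate serial number',
    r'2014-06-18T02:25:16.910Z,TSEAET01\NEW - Internet receive connector TSEAET01,'
    r'08D1456B7AFF9BDF,26,147.81.121.139:25,147.81.122.24:61707,*,,'
    r'Received certificate',
]


def extract(line):
    """Return the named groups as a dict, or None if the line does not match."""
    match = SMTP_LINE.match(line)
    return match.groupdict() if match else None


for sample in SAMPLES:
    fields = extract(sample)
    print(repr(fields["event"]), repr(fields["data"]), repr(fields["context"]))
```

Because the data group is a greedy `.*`, it backtracks to the last comma on the line. The split is therefore only right as long as the context field itself never contains a comma, which holds for these samples but is an assumption about the log format in general.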
0 Karma

ahall_splunk
Splunk Employee

Is there some reason you are not using INDEXED_EXTRACTIONS?

Here is a blog post about the feature:
http://blogs.splunk.com/2013/10/18/iis-logs-and-splunk-6/

It deals with IIS logs, but the same principle can be used.
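
The principle, parsing the line as delimited fields rather than with a hand-written regex, can be illustrated outside Splunk. The sketch below uses Python's csv module purely as a stand-in and says nothing about how Splunk implements the feature. The field names come from the "Fields:" header in the question; FIELDS, LINES and parse_lines are illustrative names.

```python
import csv

# Field names taken from the "Fields:" header in the question.
FIELDS = ("date-time,connector-id,session-id,sequence-number,"
          "local-endpoint,remote-endpoint,event,data,context").split(",")

# Two sample events from the question: a quoted "data" value with embedded
# commas and doubled quotes, and an empty "data" value.
LINES = [
    r'2014-06-18T02:25:16.879Z,TSEAET01\NEW - Internet receive connector TSEAET01,'
    r'08D1456B7AFF9BDF,22,147.81.121.139:25,147.81.122.24:61707,,'
    r'"CN=Entrust Certification Authority - L1C, OU=""(c) 2009 Entrust, Inc."", '
    r'OU=www.entrust.net/rpa is incorporated by reference, '
    r'O=""Entrust, Inc."", C=US",Certificate issuer name',
    r'2014-06-18T02:25:16.910Z,TSEAET01\NEW - Internet receive connector TSEAET01,'
    r'08D1456B7AFF9BDF,26,147.81.121.139:25,147.81.122.24:61707,*,,'
    r'Received certificate',
]


def parse_lines(lines):
    """Parse each line as CSV and pair the values with the header names."""
    return [dict(zip(FIELDS, row)) for row in csv.reader(lines)]


for event in parse_lines(LINES):
    print(event["sequence-number"], repr(event["data"]), event["context"])
```

A CSV-aware parser treats the quoted value as one field and collapses the doubled quotes, so none of the three "data" forms needs special handling.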

0 Karma