Keep specific part of a textfile / email and discard the rest

eichfuss
Path Finder

Hi there,

I have read the docs and searched answers.splunk.com, but I think I am missing something obvious. I hope someone can point me in the right direction or help me with my problem.

I want to log emails, but instead of indexing the whole mail with all its headers I only want to index one part of it. Here is an example of a similar mail.
I just want the part from "Object: Sensor A" to "Time: 2013-01-27 11:58:23" and send the rest to the null queue.

Thanks a lot
Cheers, Sven

##################################

Content-Type: multipart/alternative; boundary=Apple-Mail-3A77049A-4A01-443F-B1DB-C1AA16C7497D
Content-Transfer-Encoding: 7bit
Subject: blablablablabla
From: Doc Snider blablabla@blablabl.de
Message-Id: 92D35476-1711-4B3451-A4B5-8D14534351E@gmail.com
Date: Mon, 27 Jan 2014 11:30:57 +0100
To: doc@blablabla.de
Mime-Version: 1.0 (1.0)
X-Mailer: iPhone Mail (11A501)

--Apple-Mail-3A77049A-4A01-443F-B1DB-C1AA16C7497D
Content-Type: text/plain;
charset=utf-8
Content-Transfer-Encoding: quoted-printable

Here are the infos

Object: Sensor A
Temperature: 42
Humidity: 32
Time: 2013-01-27 11:58:23

here is more uninteresting text.
blablablablablabla

############################################
1 Solution

kristian_kolb
Ultra Champion

I guess you could permanently remove the unwanted text at index time with a sed script, invoked through SEDCMD in props.conf, like so:

props.conf

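[your_email_sourcetype]
# SEDCMD needs a class suffix (the name here is arbitrary); (?s) lets "." span line breaks
SEDCMD-keep_sensor_block = s/(?s).*?(Object:.*?Time:\s[\d-]+\s[\d:]+).*/\1/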
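
Before deploying a rule like this, it can help to try the substitution offline. The following is only a sketch in Python's `re` module, not Splunk itself. The sample text is shortened from the mail in the question, the function name `keep_sensor_block` is made up for illustration, and the pattern assumes a dot-all `(?s)` flag so that `.` can cross line breaks.

```python
import re

# Hypothetical sample, shortened from the email in the question.
SAMPLE = """Subject: blablablablabla
Date: Mon, 27 Jan 2014 11:30:57 +0100

Here are the infos

Object: Sensor A
Temperature: 42
Humidity: 32
Time: 2013-01-27 11:58:23

here is more uninteresting text.
blablablablablabla
"""

# (?s) lets "." match newlines, so the leading and trailing ".*" can
# swallow everything outside the captured Object..Time block.
PATTERN = re.compile(r"(?s).*?(Object:.*?Time:\s[\d-]+\s[\d:]+).*")


def keep_sensor_block(raw: str) -> str:
    """Return only the Object..Time block, or the input unchanged if absent."""
    return PATTERN.sub(r"\1", raw, count=1)


print(keep_sensor_block(SAMPLE))
```

If the printed result still contains header lines or trailing text, the expression needs adjusting before it goes into props.conf. The rule also only works if the whole mail arrives as a single event.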

Just make sure the events also get indexed with the correct timestamp, since the header and the message body seem to carry different timestamps. So perhaps you should also add the following to the stanza above:

TIME_FORMAT = %Y-%m-%d %H:%M:%S
TIME_PREFIX = Time:\s+
MAX_TIMESTAMP_LOOKAHEAD = 400
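
To sanity-check which timestamp these settings are meant to pick up, the idea can be imitated roughly in Python. This is a sketch under assumptions, not how Splunk's timestamp processor is implemented: `event_time` is a hypothetical helper, the prefix is written as `Time:\s+`, and the sample is cut down from the mail in the question.

```python
import re
from datetime import datetime

# Rough stand-ins for TIME_PREFIX and TIME_FORMAT; Splunk's own
# timestamp processor is not being called here.
TIME_PREFIX = re.compile(r"Time:\s+")
TIME_FORMAT = "%Y-%m-%d %H:%M:%S"


def event_time(raw):
    """Parse the timestamp that follows the first 'Time:' prefix."""
    prefix = TIME_PREFIX.search(raw)
    if prefix is None:
        return None
    # A timestamp in the format above is always 19 characters long.
    candidate = raw[prefix.end():prefix.end() + 19]
    return datetime.strptime(candidate, TIME_FORMAT)


SAMPLE = (
    "Date: Mon, 27 Jan 2014 11:30:57 +0100\n"
    "\n"
    "Object: Sensor A\n"
    "Time: 2013-01-27 11:58:23\n"
)

print(event_time(SAMPLE))
```

The point of the prefix is that the header's "Date:" line is skipped and the sensor's own "Time:" value becomes the event time.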

Read more here:

http://docs.splunk.com/Documentation/Splunk/6.0.1/Data/Anonymizedatausingconfigurationfiles

eichfuss
Path Finder

Thanks a lot Kristian,
that's the way.
