Keep specific part of a textfile / email and discard the rest

eichfuss
Path Finder

Hi there,

I have read the docs and searched answers.splunk.com, but I think I'm missing something obvious. I hope someone can point me in the right direction or help me with my problem.

I want to log emails, but instead of indexing the whole mail with all its headers I only want to index one part of it. Here is an example of a similar mail.
I just want the part from "Object: Sensor A" to "Time: 2013-01-27 11:58:23" and want to send the rest to the nullQueue.

Thanks a lot
Cheers, Sven

##################################

Content-Type: multipart/alternative; boundary=Apple-Mail-3A77049A-4A01-443F-B1DB-C1AA16C7497D
Content-Transfer-Encoding: 7bit
Subject: blablablablabla
From: Doc Snider blablabla@blablabl.de
Message-Id: 92D35476-1711-4B3451-A4B5-8D14534351E@gmail.com
Date: Mon, 27 Jan 2014 11:30:57 +0100
To: doc@blablabla.de
Mime-Version: 1.0 (1.0)
X-Mailer: iPhone Mail (11A501)

--Apple-Mail-3A77049A-4A01-443F-B1DB-C1AA16C7497D
Content-Type: text/plain;
charset=utf-8
Content-Transfer-Encoding: quoted-printable

Here are the infos

Object: Sensor A
Temperature: 42
Humidity: 32
Time: 2013-01-27 11:58:23

here is more uninteresting text.
blablablablablabla

############################################
kristian_kolb
Ultra Champion

I guess you could (permanently) remove the unwanted stuff with a sed script, invoked through SEDCMD in props.conf, like so:

props.conf

[your_email_sourcetype]
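# (?s) lets "." span newlines; the trailing ".*" drops the text after the Time line. The class name after "SEDCMD-" is arbitrary.
SEDCMD-keep_sensor_block = s/(?s).*[\r\n](Object:.*?[\r\n]Time:\s[\d-]+\s[\d:]+).*/\1/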
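
If you want to check what an expression of this shape keeps before touching props.conf, you can try it outside Splunk first. Below is a minimal Python sketch, not Splunk itself. It uses a variant of the expression with `(?s)` so that `.` also matches newlines, a lazy `.*?` inside the capture group, and a trailing `.*` so the text after the Time line is dropped as well. `RAW_EVENT` and `keep_sensor_block` are made-up names for illustration, and I am assuming Python's `re` module behaves like Splunk's regex engine for this particular pattern.

```python
import re

# Hypothetical sample event, modelled on the mail in the question.
RAW_EVENT = (
    "Subject: blablablablabla\n"
    "Date: Mon, 27 Jan 2014 11:30:57 +0100\n"
    "\n"
    "Here are the infos\n"
    "\n"
    "Object: Sensor A\n"
    "Temperature: 42\n"
    "Humidity: 32\n"
    "Time: 2013-01-27 11:58:23\n"
    "\n"
    "here is more uninteresting text.\n"
    "blablablablablabla\n"
)

# (?s) makes "." match newlines, so the leading ".*" can swallow the headers.
# The capture group holds the Object..Time block; the trailing ".*" consumes
# whatever follows the Time line.
PATTERN = re.compile(r"(?s).*[\r\n](Object:.*?[\r\n]Time:\s[\d-]+\s[\d:]+).*")


def keep_sensor_block(raw: str) -> str:
    """Return only the Object..Time block, or the input unchanged if nothing matches."""
    return PATTERN.sub(r"\1", raw, count=1)


if __name__ == "__main__":
    print(keep_sensor_block(RAW_EVENT))
```

If the pattern does not match (for example because the mail has no "Object:" line), the text comes back unchanged, which is also what a sed-style substitution does.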

Also make sure that the events get indexed with the correct timestamp, since the header and the message body carry different timestamps. So perhaps you should also add the following to the stanza above:

TIME_FORMAT = %Y-%m-%d %H:%M:%S
TIME_PREFIX = Time:+\s
MAX_TIMESTAMP_LOOKAHEAD = 400
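
To see why these three settings pick the timestamp from the message body and not the `Date:` header, here is a rough Python sketch of the idea. It is not how Splunk is implemented. It only mimics the behaviour as I understand it: find the prefix, look at a limited window of characters after it, and parse the timestamp with the given format. The function name `extract_timestamp` and the date regex used for the search inside the window are my own assumptions.

```python
import re
from datetime import datetime
from typing import Optional

# Values copied from the stanza above.
TIME_PREFIX = re.compile(r"Time:+\s")
TIME_FORMAT = "%Y-%m-%d %H:%M:%S"
MAX_TIMESTAMP_LOOKAHEAD = 400

# Assumed shape of a timestamp that fits TIME_FORMAT.
TIMESTAMP_SHAPE = re.compile(r"\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}")


def extract_timestamp(raw: str) -> Optional[datetime]:
    """Parse the first timestamp found after the prefix, or return None."""
    prefix = TIME_PREFIX.search(raw)
    if prefix is None:
        return None
    # Only look at a limited number of characters after the prefix.
    window = raw[prefix.end():prefix.end() + MAX_TIMESTAMP_LOOKAHEAD]
    candidate = TIMESTAMP_SHAPE.search(window)
    if candidate is None:
        return None
    return datetime.strptime(candidate.group(0), TIME_FORMAT)


if __name__ == "__main__":
    sample = "Date: Mon, 27 Jan 2014 11:30:57 +0100\nTime: 2013-01-27 11:58:23\n"
    print(extract_timestamp(sample))
```

Because the search starts after the "Time: " prefix, the `Date:` header earlier in the mail is never considered.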

Read more here:

http://docs.splunk.com/Documentation/Splunk/6.0.1/Data/Anonymizedatausingconfigurationfiles

eichfuss
Path Finder

Thanks a lot, Kristian,
that's the way.
