Keep a specific part of a text file / email and discard the rest

Path Finder

Hi there,

I have been through the docs and the search function on answers.splunk.com, but I think I'm just being slow on the uptake. I hope someone can point me in the right direction or help me with my problem.

I want to log emails, but instead of indexing the whole mail with all its headers I only want to index one part of it. Here is an example of a similar mail.
I just want the part from "Object: Sensor A" to "Time: 2013-01-27 11:58:23" and want to send the rest to the nullQueue.

Thanks a lot
Cheers, Sven

##################################

Content-Type: multipart/alternative; boundary=Apple-Mail-3A77049A-4A01-443F-B1DB-C1AA16C7497D
Content-Transfer-Encoding: 7bit
Subject: blablablablabla
From: Doc Snider blablabla@blablabl.de
Message-Id: 92D35476-1711-4B3451-A4B5-8D14534351E@gmail.com
Date: Mon, 27 Jan 2014 11:30:57 +0100
To: doc@blablabla.de
Mime-Version: 1.0 (1.0)
X-Mailer: iPhone Mail (11A501)

--Apple-Mail-3A77049A-4A01-443F-B1DB-C1AA16C7497D
Content-Type: text/plain;
charset=utf-8
Content-Transfer-Encoding: quoted-printable

Here are the infos

Object: Sensor A
Temperature: 42
Humidity: 32
Time: 2013-01-27 11:58:23

here is more uninteresting text.
blablablablablabla

############################################
1 Solution

Ultra Champion

I guess you could (permanently) remove the unwanted stuff with a sed script, invoked through SEDCMD in props.conf, like so:

props.conf

[your_email_sourcetype]
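# props.conf documents this setting as SEDCMD-<class>; "keep_sensor_block" is an arbitrary class name
SEDCMD-keep_sensor_block = s/(?m).*[\r\n](Object:.*[\r\n]Time:\s[\d-]+\s[\d:]+)/\1/g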
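
The pattern can be sanity-checked offline before it goes into props.conf. The sketch below uses Python's `re` module, which is close to but not identical to the PCRE engine behind SEDCMD, and a shortened copy of the sample mail from the question. Keep in mind that `(?m)` only changes how `^` and `$` behave; to let `.` run across line breaks the sketch uses `(?s)`, and it also consumes the trailing text so that only the sensor block is left. Both the adjusted pattern and the helper name `keep_sensor_block` are assumptions for illustration, not the exact expression from the stanza.

```python
import re

# Adjusted pattern (an assumption, not the exact SEDCMD expression):
# (?s) lets "." match newlines, the lazy ".*?" parts skip the headers and
# span the sensor block, and the final ".*" swallows everything after the
# timestamp.
PATTERN = re.compile(r"(?s)^.*?(Object:.*?Time:\s[\d-]+\s[\d:]+).*$")

# Shortened copy of the sample mail from the question.
SAMPLE_MAIL = """\
Subject: blablablablabla
Date: Mon, 27 Jan 2014 11:30:57 +0100

Here are the infos

Object: Sensor A
Temperature: 42
Humidity: 32
Time: 2013-01-27 11:58:23

here is more uninteresting text.
blablablablablabla
"""


def keep_sensor_block(raw_event: str) -> str:
    """Return only the Object..Time block, or the event unchanged if absent."""
    return PATTERN.sub(r"\1", raw_event, count=1)


if __name__ == "__main__":
    print(keep_sensor_block(SAMPLE_MAIL))
```

If the output here is not exactly the four sensor lines, the expression probably needs adjusting before it is used on real events, because SEDCMD rewrites the data permanently at index time.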

Just make sure the events also get indexed with the correct timestamp, since the mail header and the message body seem to carry different timestamps. So perhaps you should also add the following to the stanza above:

TIME_FORMAT = %Y-%m-%d %H:%M:%S
TIME_PREFIX = Time:+\s
MAX_TIMESTAMP_LOOKAHEAD = 400
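
To see which timestamp these settings are aiming at, here is a rough Python sketch of the same idea: find the text matched by TIME_PREFIX, then parse what follows using TIME_FORMAT. It only imitates the behaviour with `re` and `datetime.strptime`; Splunk's own timestamp processor is not involved, and `extract_event_time` is a hypothetical helper name.

```python
import re
from datetime import datetime
from typing import Optional

TIME_PREFIX = re.compile(r"Time:+\s")   # same regex as in the stanza
TIME_FORMAT = "%Y-%m-%d %H:%M:%S"       # same strptime-style format
TIMESTAMP_LENGTH = len("2013-01-27 11:58:23")


def extract_event_time(event: str) -> Optional[datetime]:
    """Parse the timestamp that follows the first TIME_PREFIX match."""
    match = TIME_PREFIX.search(event)
    if match is None:
        return None
    # Only look at the characters directly after the prefix.
    candidate = event[match.end():match.end() + TIMESTAMP_LENGTH]
    try:
        return datetime.strptime(candidate, TIME_FORMAT)
    except ValueError:
        return None
```

Run against the sample mail, this picks up the `Time:` line in the body rather than the `Date:` header, which is the point of setting TIME_PREFIX.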

Read more here:

http://docs.splunk.com/Documentation/Splunk/6.0.1/Data/Anonymizedatausingconfigurationfiles


Path Finder

Thanks a lot Kristian,
that's the way.
