Getting Data In

How can I extract the DATE from the middle of an event?

Ron_Naken
Splunk Employee

I have an ISA web log in the following format. Splunk doesn't correctly identify the timestamp in every event, even though the format doesn't change from event to event. Sometimes Splunk seems to use the time at which it indexed the event as the timestamp instead.

How can I tell Splunk to only extract the date/time from the 5th & 6th fields in this CSV?

Sample Data:
1.4.5.1, GOOBERS\MYUSER, Shockwave Flash, -, 4/23/2010, 18:54:00, -, ROCKY, -, 8.7.7.7, 8.7.7.7, 80, 531, 215, 192, http, -, POST, http://8.7.7.7/idle/eHqkwMkh1DNu-XXR/102752, -, Inet, 200, -, Allow Internet Access to Web Group, -, Internal, External, 0x780, Allowed
1.4.5.1, GOOBERS\YOURUSER, Shockwave Flash, -, 4/23/2010, 18:54:00, -, ROCKY, -, 8.7.7.7, 8.7.7.7, 80, 531, 215, 192, http, -, POST, http://8.7.7.7/idle/eJqkwMkh0DNeCn2f/102747, -

Props.conf:
[isa_web]
SHOULD_LINEMERGE = false
REPORT-isaw = isa-web

Transforms.conf:
[isa-web]
DELIMS = ","
FIELDS = "src_ip","username","agent","authenticated","date","time","service","server","referer","r-host","r-ip","r-port","tmp1","tmp2","tmp3","cs-protocol","tmp4","s-operation","cs-uri","tmp5","s-object-source","sc-status","s-cache-info","rule","filter-info","cs-network","sc-network","error-info","action"

Thanks!

1 Solution

dwaddle
SplunkTrust

Try something like this in props.conf:

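TIME_FORMAT=%m/%d/%Y,%H:%M:%S
TIME_PREFIX=^([^,]*,){4}

I have not tested this, but TIME_PREFIX should tell Splunk to skip the first 4 comma-delimited fields, and TIME_FORMAT should then pick up the date and time from CSV fields 5 and 6. Note that the month and day are %m and %d (lowercase); %M means minutes. If your raw events have a space after each comma, as the sample in the question does, you will probably need to account for it: use ,\s* in place of the comma in TIME_PREFIX and put a space after the comma in TIME_FORMAT.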
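
One way to sanity-check the regex and the format string before touching props.conf is to replay them against a sample event outside Splunk. The sketch below does that in Python. It is only an approximation: Python's strptime is not Splunk's timestamp processor, the function name extract_timestamp is made up for illustration, and the \s* in the regex and the space in the format are assumptions based on the spacing in the sample data. It uses %m for the month and %d for the day, since %M means minutes in strptime-style formats.

```python
import re
from datetime import datetime

# First sample event from the question, cut off after a few fields.
EVENT = r"1.4.5.1, GOOBERS\MYUSER, Shockwave Flash, -, 4/23/2010, 18:54:00, -, ROCKY"

# Same idea as the TIME_PREFIX above, with \s* added (an assumption) to
# absorb the space that follows each comma in the sample.
TIME_PREFIX = re.compile(r"^([^,]*,\s*){4}")

# strptime-style format for fields 5 and 6: %m = month, %d = day of month.
TIME_FORMAT = "%m/%d/%Y, %H:%M:%S"


def extract_timestamp(event: str) -> datetime:
    """Skip four comma-delimited fields, then parse fields 5 and 6."""
    match = TIME_PREFIX.match(event)
    if match is None:
        raise ValueError("event has fewer than four commas")
    rest = event[match.end():]
    # Keep only the date and time fields (the text before the second comma).
    stamp = ",".join(rest.split(",")[:2])
    return datetime.strptime(stamp, TIME_FORMAT)


print(extract_timestamp(EVENT))
```

If the regex or the format is wrong for your data, this raises a ValueError instead of silently falling back to another time, which makes mistakes easier to spot than they are at index time.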


Simeon
Splunk Employee

There are many ways to tune the timestamp extraction within Splunk. For your particular data, you should create the appropriate regex to correctly extract the timestamp. Details on how to set this can be found here:

http://www.splunk.com/base/Documentation/latest/Admin/TrainSplunktorecognizeatimestamp

For your scenario, you might be able to set the TIME_PREFIX and MAX_TIMESTAMP_LOOKAHEAD parameters:

MAX_TIMESTAMP_LOOKAHEAD = <integer>
* Specifies how far (in characters) into an event Splunk should look for a timestamp.
* Defaults to 150.

TIME_PREFIX = <regular expression>
* Specifies the necessary condition for timestamp extraction.
* The timestamping algorithm only looks for a timestamp after the first regex match.
* Defaults to empty.

Without seeing more of your data, it is hard to suggest the exact regex, but I would imagine you could use a TIME_PREFIX that matches everything up to and including the 4th comma, so that timestamp extraction starts at the 5th field.
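
To get a feel for what values these two parameters would need, you can measure a sample event: where a "4th comma" prefix ends, and how many characters the date and time fields span from there. The Python sketch below does that. The regex and the helper name timestamp_window are hypothetical, and it assumes the lookahead is counted from the end of the TIME_PREFIX match, which is worth confirming against the props.conf documentation for your Splunk version.

```python
import re
from typing import Tuple

# First sample event from the question, cut off after a few fields.
EVENT = r"1.4.5.1, GOOBERS\MYUSER, Shockwave Flash, -, 4/23/2010, 18:54:00, -, ROCKY"

# Hypothetical prefix regex: everything up to and including the 4th comma,
# plus any whitespace that follows it.
PREFIX = re.compile(r"^(?:[^,]*,\s*){4}")


def timestamp_window(event: str) -> Tuple[int, int]:
    """Return (offset where field 5 starts, length of fields 5 and 6)."""
    match = PREFIX.match(event)
    if match is None:
        raise ValueError("event has fewer than four commas")
    start = match.end()
    # Fields 5 and 6 end at the second comma after the start offset.
    first = event.index(",", start)
    second = event.index(",", first + 1)
    return start, second - start


start, length = timestamp_window(EVENT)
print(start, length, repr(EVENT[start:start + length]))
```

The reported length is a lower bound for MAX_TIMESTAMP_LOOKAHEAD under the assumption above; leaving a few characters of headroom allows for two-digit months and days.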

