Splunk Search

Can timestamp be conditional according to data content

bowesmana
Champion

My csv data contains a number of timestamps. I want the timestamp field to be conditional on the result of another field, say field 12. So if I compare field 12 in each row for "X" then I want to use the timestamp from field 15th and if field 12 contains "Y", I want to use field 10 for the timestamp.

Currently I am splitting my log file into two based on the field 12 value and then defining 2 different sourcetypes, each with a different TIME_PREFIX value in props.conf, but wondered if I could avoid that step and have a single sourcetype.

Following Ayn and Kristian Kolb's comments, here's more detail

These are the two standard lines of data

Course A,_scorm12_ar123,Custom Asset,username1,First,Last,Branch Sales,Group Id1,Group Path,0,2013-11-07,2013-11-07,2,0,,,In Progress,,,,35,14,Direct sales
Course B,_scorm12_cf411,Custom Asset,username2,First,Last,Group name,Group Id2,Group Path2,0,2013-11-16,2013-11-20,3,0,,2013-11-17,Completed,,100.00,100.00,35,62,Indirect sales

The first is a course in progress (field 17) and started on 2013-11-07 (field 11)
The second is a course completed (field 17) and started on 2013-11-16 (field 11) and completed on 2013-11-17 (field 16).

I tried two TIME_PREFIX settings

TIME_PREFIX=^(((\"[^\"]*\",)|([^,]*,)){15}|((\"[^\"]*\",)|([^,]*,)){10})
TIME_PREFIX=^(((\"[^\"]*\",)|([^,]*,)){10}|((\"[^\"]*\",)|([^,]*,)){15})

In the first case, _time is set to the completion date or the 'previous' record's time if the course in In Progress. In the second case, the _time is always set to the start date.

I don't suppose I can look ahead BEYOND the wanted time field to check for the value of field 17 to then extract the time from a field BEFORE...??

I guess if the Completed|In Progress text was before the time field I could include that in the pattern, but the docs for TIME_PREFIX say the time is extracted from the text following the regex match.

Tags (1)
0 Karma

kristian_kolb
Ultra Champion

Not too sure how Splunk would treat this, but it might work. The (untested) example below assumes that you have space-separated fields, and that the fields themselves do NOT contain any spaces.

TIME_PREFIX = ^((\S+\s+){11}X\s+\S+\s+\S+\s+|(\S+\s+){9}(?=\S+\s+\S+\s+Y))

Also, depending on the size of the events, you may need to increase the MAX_TIMESTAMP_LOOKAHED.
If this approach does not work, it may be better to use your current strategy, with two different sourcetypes.

For more qualified help, you should consider posting a few sample events.


UPDATE:

Course A,_scorm12_ar123,Custom Asset,username1,First,Last,Branch Sales,Group Id1,Group Path,0,2013-11-07,2013-11-07,2,0,,,In Progress,,,,35,14,Direct sales

Course B,_scorm12_cf411,Custom Asset,username2,First,Last,Group name,Group Id2,Group Path2,0,2013-11-16,2013-11-20,3,0,,2013-11-17,Completed,,100.00,100.00,35,62,Indirect sales

Given that I've understood you correctly, the above date fields are the ones you want, based on the Completed/In Progress status.

I would try this with the following regex in TIME_PREFIX (but note that it may not work at all, I haven't tested it);

TIME_PREFIX = ^(([^,]*,){10}(?=([^,]*,){6}In Progress)|([^,]*,){15}(?=[^,]*,Completed))

Good luck,

/K

EDIT: typo
EDIT2: picked the wrong field in the first event - corrected now.

kristian_kolb
Ultra Champion

Did you try the updated TIME_PREFIX reges? It'd be fun to know if it worked or not.

/k

0 Karma

kristian_kolb
Ultra Champion

Well, to answer your question on wheter you can 'look beyond' the timestamp for a field value, yes you can. It's called a positive lookahead, and is seen in my example above.

The second regex (\S+\s+){9}(?=\S+\s+\S+\s+Y) will match a position after 9 sequences (i.e. just before field 10) of non-space-characters-followed-by-space. But only if this is followed by a Y in field 12.

Btw, I don't really understand your regexes, i.e. they dont make sense to me. And I'm quite sure that you can only specify TIME_PREFIX once per stanza in props.conf. One of them will be discarded.

/k

0 Karma

bowesmana
Champion

Thanks Kristian, see my updated description. I guess I'll have to use my existing strategy, but thanks for the great idea.

0 Karma

kristian_kolb
Ultra Champion

good luck.

0 Karma

bowesmana
Champion

Thanks, I should be able to map that to my data, so I'll try it in the morning!

0 Karma

Ayn
Legend

If you can specify a TIME_PREFIX that matches one at a time you can do this. Otherwise the answer is most likely no. But something like

TIME_PREFIX = ...,Y,...,|...,N,...

should work, where you give one regex that matches as far as you want in case the field is "Y" and another one for when the field is "N".

bowesmana
Champion

Thanks Ayn, see my updated description. I guess I'll have to use my existing strategy, but thanks for the great idea.

0 Karma

bowesmana
Champion

Thanks, I'll try this out in the morning, late here!

0 Karma

kristian_kolb
Ultra Champion

I'm too slow by a mile, I see. 🙂

0 Karma
Register for .conf21 Now! Go Vegas or Go Virtual!

How will you .conf21? You decide! Go in-person in Las Vegas, 10/18-10/21, or go online with .conf21 Virtual, 10/19-10/20.