We are getting data from a mainframe system to represent call data from our applications. Data in the events looks like this:
1909|504|2016-11-17 00:00:00|8|28|9|9|0|0
Of course, based on this data I assumed that 2016-11-17 00:00:00
was the timestamp. I, of course, was wrong. Apparently the way this mainframe handles the data, the 00:00:00 is an irrelevant stamp, an artifact of how the data is used is some other reporting systems. Every event comes in with the 00:00:00 notation. The date portion of the stamp is correct, but it's actually the column AFTER the 00:00:00 that represents the hour (there is no minute stamp, all data is run on the hour and that field would run from 0 to 23). So the actual timestamp for this event would be 2016-11-17 08:00:00.
I have not yet been able to make the system recognize the split time stamping properly. Can anyone provide any assistance on how to make these events show with the proper timestamp (i.e. show the YYYY-MM-DD
, skip the 00:00:00
, use the |8
column as the hours, and use 00:00
for minutes/seconds)?
Like this:
props.conf (on your FORWARDER, assuming that you are using INDEXED_EXTRACTIONS=PSV
😞
TIME_PREFIX=^[^\|]*\|[^\|]*\|
TIME_FORMAT=%Y-%m-%d 00:00:00|%H
MAX_TIMESTAMP_LOOKAHEAD=23
The problem is either that you are missing the TWO SPACES between the date and the fake timestamp OR that you are deploying this file to your indexers instead of to your forwarder. Once deployed to your forwarder(s), restart the splunkd there and check for events forwarded AFTER the restart (old events will stay broken).
Hi burras,
try
TIMEFORMAT=%Y-%m-%d\s00:00:00|%H
Bye.
Giuseppe
Unfortunately didn't work - still stamping it with 00:00:00 for the time...
Hi burras,
Sorry, I didn't find a solution!
Why you don't ask to the Splunk Support (using the partner Portal), it sounds like a bug!
Bye.
Giuseppe
That is the next course of action I'm taking. We're also adding a step to our ingest process to do a little bit of data massaging to rewrite a usable timestamp on the data before we move it to our final ingest location.
Can you try to extract first and then assign back to _time and see if it works well for your case. Something like:
your query to return the event
| rex field=_raw "^(?<id1>[^\|]+)\|(?<id2>[^\|]+)\|(?<myDate>[^\s]+)\s(?<scrapTime>[^\|]+)\|(?<usefulHour>[^\|]+)\|"
| eval myTime=if(( usefulHour < 10 ), myDate." 0".usefulHour.":00:00", myDate." ".usefulHour.":00:00")
| eval _time=strptime(myTime, "%Y-%m-%d %H:%M:%S")
| Now _time should be the way you want and can be used here onwards
Can you see if appending the timezone info like this helps out. I am using -6
as timezone, you might wanna use ur own based on the timezone of where the events were created:
...| eval myTime=if(( usefulHour < 10 ), myDate." 0".usefulHour.":00:00 -6", myDate." ".usefulHour.":00:00 -6")
| eval _time=strptime(myTime, "%Y-%m-%d %H:%M:%S %z")
Yeah that would correct the timing issue, but it still leaves me the issue of expecting novice users to remember to add a complicated string to their searches on this data and potentially causes issues if the users attempt to bash this sourcetype against others in the same index. I still think there's gotta be a way to do this on ingest rather than at search time...
I'd love for something like that to be an option. Unfortunately we have well over a 1000000 events per day for this sourcetype and with the way they're being timestamped we can't even run a search on the data (we get an error for more than 1000000 events found at a specific time). It looks like it's going to have to be something on the ingest-side rather than at search time...
But I did go ahead and test this out anyway, using a smaller sample size. It does appear to work. However, it only works if my Splunk settings are such that my time zone actually matches up with the time zone of the data. I tested this by changing my time zone to several different settings and the timestamps on the data were inconsistent - if I was in Eastern, it showed my data as Eastern; if I was in Central, the data appeared in central time. Since I have users in several different time zones that access this data, that would cause issues when they're all analyzing it. Additionally, my users are newbies when it comes to Splunk - asking them to run something like that when they search would be out of the question...
Try this timeformat="%Y-%m-%d 00:00:00|%H"
Unfortunately doesn't work - events still show 00:00:00 as the actual timestamp...