I have some logs, but these logs do not have an actual timestamp field on each line. The timestamp is recorded only on the first line; every other line has no actual timestamp, only the interval in seconds from the line just above it.
For example,
----------------SAMPLE LOG-----------------------------------
Lot Start Time 2016/8/31 14:42:20
Lot End Time 2016/8/31 14:59:55
Sampling Interval[sec] 0.1
Interval field2 field3 field4 field5 field6
0 1 84 111.1 114.5 0
0.1 1 84 111.1 114.5 0
0.2 1 84 111.1 114.5 0
0.3 1 84 111.1 114.5 0
0.4 1 84 111.1 114.5 0
0.5 1 84 111.1 114.5 0
0.6 1 84 111.1 114.5 0
0.7 1 84 111.1 114.5 0
0.8 1 84 111.1 114.5 0
0.9 1 78.1 111.1 114.5 0
1 1 78.1 111.1 114.5 0
I mean, when I ingest the log, I'd like to calculate the actual timestamp, like this:
ACTUAL_TIME=timestamp_of_FIRST_line + INTERVAL
ACTUAL_TIME=timestamp_of_SECOND_line + INTERVAL
ACTUAL_TIME=timestamp_of_THIRD_line + INTERVAL
ACTUAL_TIME=timestamp_of_FOURTH_line + INTERVAL
Can anyone help?
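For example, with the sample above (sampling interval 0.1 sec), the first line's actual time is 14:42:20.0, the second line's is 14:42:20.0 + 0.1 = 14:42:20.1, the third line's is 14:42:20.1 + 0.1 = 14:42:20.2, and so on.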
The first step, I think, is to treat the entire log as one event (or as one transaction), and have interval, field2, field3, and so on extracted as multivalue fields.
This run-anywhere code gives you the RESULT of that extraction for your one lot above.
| makeresults
| eval mylot="Lot1", mydata="Lot Start Time 2016/8/31 14:42:20!!!!Lot End Time 2016/8/31 14:59:55!!!!Sampling Interval[sec] 0.1!!!! !!!!Interval field2 field3 field4 field5 field6!!!!0 1 84 111.1 114.5 0!!!!0.1 1 84 111.1 114.5 0!!!!0.2 1 84 111.1 114.5 0!!!!0.3 1 84 111.1 114.5 0!!!!0.4 1 84 111.1 114.5 0!!!!0.5 1 84 111.1 114.5 0!!!!0.6 1 84 111.1 114.5 0!!!!0.7 1 84 111.1 114.5 0!!!!0.8 1 84 111.1 114.5 0!!!!0.9 1 78.1 111.1 114.5 0!!!!1 1 78.1 111.1 114.5 0!!!!1.1 1 78.1 111.1 114.5 0"
| rex field=mydata "Lot Start Time (?<lotStartTime>[^!]*)"
| eval _time=strptime(lotStartTime ,"%Y/%m/%d %H:%M:%S")
| rex field=mydata "Lot End Time (?<lotEndTime>[^!]*)"
| rex field=mydata max_match=0 "!(?<interval>[\d.]+)\s+(?<field2>[\d.]+)\s+(?<field3>[\d.]+)\s+(?<field4>[\d.]+)\s+(?<field5>[\d.]+)\s+(?<field6>[\d.]+?)(!|$)"
| table mylot _time lotStartTime lotEndTime interval field2 field3 field4 field5 field6
...resulting in a single transaction that looks like this... (I've omitted _time since it's the same as lotStartTime, and used 2-digit years to make it fit on the site)...
lotStartTime lotEndTime interval field2 field3 field4 field5 field6
8/31/16 14:42:20 8/31/16 14:59:55 0 1 84 111.1 114.5 0
0.1 1 84 111.1 114.5 0
0.2 1 84 111.1 114.5 0
0.3 1 84 111.1 114.5 0
0.4 1 84 111.1 114.5 0
0.5 1 84 111.1 114.5 0
0.6 1 84 111.1 114.5 0
0.7 1 84 111.1 114.5 0
0.8 1 84 111.1 114.5 0
0.9 1 78.1 111.1 114.5 0
1 1 78.1 111.1 114.5 0
1.1 1 78.1 111.1 114.5 0
All of the values for each of those six columns are held in a single multivalue field in the event above.
Then you COULD, at search time, extract the details with something like this...
| eval ziplots=mvzip(mvzip(mvzip(mvzip(mvzip(interval,field2),field3),field4),field5),field6)
| table _time mylot lotStartTime lotEndTime ziplots
| mvexpand ziplots
| makemv delim="," ziplots
| eval interval=mvindex(ziplots,0), field2=mvindex(ziplots,1), field3=mvindex(ziplots,2), field4=mvindex(ziplots,3), field5=mvindex(ziplots,4), field6=mvindex(ziplots,5)
| eval _time=_time + interval
| table mylot _time lotStartTime lotEndTime interval field2 field3 field4 field5 field6
...resulting in multiple events that look like this... (I've removed the year completely from the display to make it fit)...
_time lotStartTime lotEndTime interval field2 field3 field4 field5 field6
14:42:20.00 14:42:20.00 14:59:55.00 0 1 84 111.1 114.5 0
14:42:20.10 14:42:20.00 14:59:55.00 0.1 1 84 111.1 114.5 0
14:42:20.20 14:42:20.00 14:59:55.00 0.2 1 84 111.1 114.5 0
14:42:20.30 14:42:20.00 14:59:55.00 0.3 1 84 111.1 114.5 0
14:42:20.40 14:42:20.00 14:59:55.00 0.4 1 84 111.1 114.5 0
14:42:20.50 14:42:20.00 14:59:55.00 0.5 1 84 111.1 114.5 0
14:42:20.60 14:42:20.00 14:59:55.00 0.6 1 84 111.1 114.5 0
14:42:20.70 14:42:20.00 14:59:55.00 0.7 1 84 111.1 114.5 0
14:42:20.80 14:42:20.00 14:59:55.00 0.8 1 84 111.1 114.5 0
14:42:20.90 14:42:20.00 14:59:55.00 0.9 1 78.1 111.1 114.5 0
14:42:21.00 14:42:20.00 14:59:55.00 1 1 78.1 111.1 114.5 0
14:42:21.10 14:42:20.00 14:59:55.00 1.1 1 78.1 111.1 114.5 0
...or, depending on how frequently this data will be searched as individual events, instead of doing that unpack at search time, you could set up a populating search that runs periodically to unpack the events and add them to the appropriate index individually, using the | collect command. When doing this, if you assign them any sourcetype other than stash, you will incur license usage. So the least expensive method (license-wise) would be to have a separate index and use the default sourcetype=stash.
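Just as a rough sketch (lot_summary is a hypothetical index name here; you would create your own summary index first, and schedule the search to run at whatever frequency new lots arrive), the populating search is simply the search-time unpack above with collect on the end:

<the search-time unpack from above>
| collect index=lot_summary

Leaving sourcetype off lets collect fall back to its default of stash, which is what avoids the license usage.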
DalJeanis,
Thank you for your idea.
But actually, these files run from millions to tens of millions of lines, so it is too difficult to treat the entire log as one event... any idea?