I need to sum up the time differential for two events on a date_hour, date_wday, and date_month basis. Originally I used the join command to accomplish this until I realized events were being dropped.
I changed my query to remove the command and am currently working with this
index=default source=mysql-pipe sourcetype=pipeline_logs ((message="bank** processing" OR message="bank** decompression complete") AND (msg_type="BEGIN" OR msg_type="INFO"))
| eval temp1=split(message, " ")
| eval bank=mvindex(temp1, 0)
| eval run_id_bank=run_name.".".tag_value.".".bank
| eval stime=strptime(time_at,"%Y-%m-%d %H:%M:%S")
| eval temp=date_hour."##".run_id_bank | chart values(stime) over temp by msg_type
| eval TimeDiff=(INFO-BEGIN)/3600
| eval date_month=mvindex(split(temp,"##"),0)
| stats sum(TimeDiff) AS Sum by date_hour
Each run has 16 different banks which need to be differentiated. The issue I'm running into is when the decompression and processing occur in a different hour (or wday/month) then the differential won't be taken. This can be solved by charting over run_id_bank but then I lose the date_hour timestamp. Is there a way to do the calculation but retain the date_hour from the BEGIN (only occurs for processing) to do a summation?
Your comments did make some things clear to me. This is my suggested query. I've made some changes to reduce the lines of code and corrected a type in the 2nd last line (date_month should actually be date_hour).
index=default source=mysql-pipe sourcetype=pipeline_logs ((message="bank** processing" OR message="bank** decompression complete") AND (msg_type="BEGIN" OR msg_type="INFO"))
| eval run_id_bank=run_name.".".tag_value.".".mvindex(split(message, " "), 0)
| eval stime=strptime(time_at,"%Y-%m-%d %H:%M:%S")
| chart values(stime) over run_id_bank by msg_type
| eval TimeDiff=(INFO-BEGIN)/3600
| eval date_hour=strftime(BEGIN,"%H")
| table run_id_bank BEGIN INFO TimeDiff date_hour
YOu can now validate the results first and then add following to get total TimeDiff by date_hour of the BEGIN events.
index=default source=mysql-pipe sourcetype=pipeline_logs ((message="bank** processing" OR message="bank** decompression complete") AND (msg_type="BEGIN" OR msg_type="INFO"))
| eval run_id_bank=run_name.".".tag_value.".".mvindex(split(message, " "), 0)
| eval stime=strptime(time_at,"%Y-%m-%d %H:%M:%S")
| chart values(stime) over run_id_bank by msg_type
| eval TimeDiff=(INFO-BEGIN)/3600
| eval date_hour=strftime(BEGIN,"%H")
| stats sum(TimeDiff) AS Sum by date_hour
Your comments did make some things clear to me. This is my suggested query. I've made some changes to reduce the lines of code and corrected a type in the 2nd last line (date_month should actually be date_hour).
index=default source=mysql-pipe sourcetype=pipeline_logs ((message="bank** processing" OR message="bank** decompression complete") AND (msg_type="BEGIN" OR msg_type="INFO"))
| eval run_id_bank=run_name.".".tag_value.".".mvindex(split(message, " "), 0)
| eval stime=strptime(time_at,"%Y-%m-%d %H:%M:%S")
| chart values(stime) over run_id_bank by msg_type
| eval TimeDiff=(INFO-BEGIN)/3600
| eval date_hour=strftime(BEGIN,"%H")
| table run_id_bank BEGIN INFO TimeDiff date_hour
YOu can now validate the results first and then add following to get total TimeDiff by date_hour of the BEGIN events.
index=default source=mysql-pipe sourcetype=pipeline_logs ((message="bank** processing" OR message="bank** decompression complete") AND (msg_type="BEGIN" OR msg_type="INFO"))
| eval run_id_bank=run_name.".".tag_value.".".mvindex(split(message, " "), 0)
| eval stime=strptime(time_at,"%Y-%m-%d %H:%M:%S")
| chart values(stime) over run_id_bank by msg_type
| eval TimeDiff=(INFO-BEGIN)/3600
| eval date_hour=strftime(BEGIN,"%H")
| stats sum(TimeDiff) AS Sum by date_hour
Could you provide the mock table of what is your final expected output? (or what's wrong with output of current search that you have?)
Let's disregard the summation aspect for now and explain what's wrong prior to that
Take the following situation
Run1: Begins processing at 2:30 and finished decompression at 3:33, giving a TimeDiff of 63 minutes
Run2: Begins processing at 1:03 and finished decompression at 1:23, giving a TimeDiff of 20minutes
Below is the current output that I am getting. The TimeDiff isn't taken for run1 because the different date_hour stamps separate the two messages even though it's the same run_id_bank.
temp BEGIN INFO TimeDiff date_hour
2##run1_id_bank 1481079017.00 2
3##run1_id_bank 1481079161.00 3
1##run2_id_bank 1481079017.00 1481079173.00 20 1
Below is the kind of output I would be looking for
temp BEGIN INFO TimeDiff date_hour
2##run1_id_bank 1481079017.00 1481079161.00 63 2
1#run2_id_bank 1481079017.00 1481079173.00 20 1
In this case I'd be able to sum up the TimeDiff for all the runs by the date_hour returned from the BEGIN log. If I chart over just run_id_bank instead of temp I get the above table, with an empty date_hour column. The temp column isn't necessary in that format as long as the other columns are accurate
Then I'd be able to run a summation on top of that query to get
date_hour Sum
1 20
2 63
...
24