Splunk Search

Alternative to join command to do a summation on date_hour

byu168
Path Finder

I need to sum up the time differential for two events on a date_hour, date_wday, and date_month basis. Originally I used the join command to accomplish this until I realized events were being dropped.

I changed my query to remove the command and am currently working with this

  index=default source=mysql-pipe sourcetype=pipeline_logs ((message="bank** processing" OR message="bank** decompression complete") AND (msg_type="BEGIN" OR msg_type="INFO")) 
    | eval temp1=split(message, " ")
    | eval bank=mvindex(temp1, 0)
    | eval run_id_bank=run_name.".".tag_value.".".bank 
    | eval stime=strptime(time_at,"%Y-%m-%d %H:%M:%S") 
    | eval temp=date_hour."##".run_id_bank | chart values(stime) over temp by msg_type 
    | eval TimeDiff=(INFO-BEGIN)/3600 
    | eval date_month=mvindex(split(temp,"##"),0)
    | stats sum(TimeDiff) AS Sum by date_hour

Each run has 16 different banks which need to be differentiated. The issue I'm running into is when the decompression and processing occur in a different hour (or wday/month) then the differential won't be taken. This can be solved by charting over run_id_bank but then I lose the date_hour timestamp. Is there a way to do the calculation but retain the date_hour from the BEGIN (only occurs for processing) to do a summation?

0 Karma
1 Solution

somesoni2
SplunkTrust
SplunkTrust

Your comments did make some things clear to me. This is my suggested query. I've made some changes to reduce the lines of code and corrected a type in the 2nd last line (date_month should actually be date_hour).

index=default source=mysql-pipe sourcetype=pipeline_logs ((message="bank** processing" OR message="bank** decompression complete") AND (msg_type="BEGIN" OR msg_type="INFO")) 
     | eval run_id_bank=run_name.".".tag_value.".".mvindex(split(message, " "), 0) 
     | eval stime=strptime(time_at,"%Y-%m-%d %H:%M:%S") 
     | chart values(stime) over run_id_bank by msg_type 
     | eval TimeDiff=(INFO-BEGIN)/3600 
     | eval date_hour=strftime(BEGIN,"%H")
     | table run_id_bank BEGIN INFO TimeDiff date_hour

YOu can now validate the results first and then add following to get total TimeDiff by date_hour of the BEGIN events.

index=default source=mysql-pipe sourcetype=pipeline_logs ((message="bank** processing" OR message="bank** decompression complete") AND (msg_type="BEGIN" OR msg_type="INFO")) 
     | eval run_id_bank=run_name.".".tag_value.".".mvindex(split(message, " "), 0) 
     | eval stime=strptime(time_at,"%Y-%m-%d %H:%M:%S") 
     | chart values(stime) over run_id_bank by msg_type 
     | eval TimeDiff=(INFO-BEGIN)/3600 
     | eval date_hour=strftime(BEGIN,"%H")
     | stats sum(TimeDiff) AS Sum by date_hour

View solution in original post

0 Karma

somesoni2
SplunkTrust
SplunkTrust

Your comments did make some things clear to me. This is my suggested query. I've made some changes to reduce the lines of code and corrected a type in the 2nd last line (date_month should actually be date_hour).

index=default source=mysql-pipe sourcetype=pipeline_logs ((message="bank** processing" OR message="bank** decompression complete") AND (msg_type="BEGIN" OR msg_type="INFO")) 
     | eval run_id_bank=run_name.".".tag_value.".".mvindex(split(message, " "), 0) 
     | eval stime=strptime(time_at,"%Y-%m-%d %H:%M:%S") 
     | chart values(stime) over run_id_bank by msg_type 
     | eval TimeDiff=(INFO-BEGIN)/3600 
     | eval date_hour=strftime(BEGIN,"%H")
     | table run_id_bank BEGIN INFO TimeDiff date_hour

YOu can now validate the results first and then add following to get total TimeDiff by date_hour of the BEGIN events.

index=default source=mysql-pipe sourcetype=pipeline_logs ((message="bank** processing" OR message="bank** decompression complete") AND (msg_type="BEGIN" OR msg_type="INFO")) 
     | eval run_id_bank=run_name.".".tag_value.".".mvindex(split(message, " "), 0) 
     | eval stime=strptime(time_at,"%Y-%m-%d %H:%M:%S") 
     | chart values(stime) over run_id_bank by msg_type 
     | eval TimeDiff=(INFO-BEGIN)/3600 
     | eval date_hour=strftime(BEGIN,"%H")
     | stats sum(TimeDiff) AS Sum by date_hour
0 Karma

somesoni2
SplunkTrust
SplunkTrust

Could you provide the mock table of what is your final expected output? (or what's wrong with output of current search that you have?)

0 Karma

byu168
Path Finder

Let's disregard the summation aspect for now and explain what's wrong prior to that

Take the following situation

Run1: Begins processing at 2:30 and finished decompression at 3:33, giving a TimeDiff of 63 minutes
Run2: Begins processing at 1:03 and finished decompression at 1:23, giving a TimeDiff of 20minutes

Below is the current output that I am getting. The TimeDiff isn't taken for run1 because the different date_hour stamps separate the two messages even though it's the same run_id_bank.

temp               BEGIN        INFO       TimeDiff date_hour
2##run1_id_bank 1481079017.00                            2
3##run1_id_bank                1481079161.00              3
1##run2_id_bank 1481079017.00  1481079173.00  20          1

Below is the kind of output I would be looking for

temp                     BEGIN            INFO          TimeDiff          date_hour
2##run1_id_bank 1481079017.00   1481079161.00           63                  2
1#run2_id_bank     1481079017.00    1481079173.00           20                    1

In this case I'd be able to sum up the TimeDiff for all the runs by the date_hour returned from the BEGIN log. If I chart over just run_id_bank instead of temp I get the above table, with an empty date_hour column. The temp column isn't necessary in that format as long as the other columns are accurate

Then I'd be able to run a summation on top of that query to get

date_hour               Sum
1                       20
2                       63
...
24
0 Karma
Get Updates on the Splunk Community!

Splunk Observability Cloud | Unified Identity - Now Available for Existing Splunk ...

Raise your hand if you’ve already forgotten your username or password when logging into an account. (We can’t ...

Index This | How many sides does a circle have?

February 2024 Edition Hayyy Splunk Education Enthusiasts and the Eternally Curious!  We’re back with another ...

Registration for Splunk University is Now Open!

Are you ready for an adventure in learning?   Brace yourselves because Splunk University is back, and it's ...