Splunk Search

How to extract the start and end date/time from my sample data, then calculate the duration of a job?

athorat
Communicator

Hi

I have to extract start date, end date, and the duration of a job based on the following two events:

Started:
2016-01-06 07:00:10,314  INFO ActionStartXCommand:543 - SERVER[redacted.redacted.redacted.com] USER[hdfs] GROUP[-] TOKEN[] APP[BOB_CAS_WF_IM_DataStore] JOB[0018443-150810162217214-oozie-oozi-W] ACTION[0018443-150810162217214-oozie-oozi-W@:start:] [***0018443-150810162217214-oozie-oozi-W@:start:***]Action updated in DB!


End:
2016-01-06 08:10:50,301  INFO ActionStartXCommand:543 - SERVER[redacted.redacted.redacted.com] USER[hdfs] GROUP[-] TOKEN[] APP[BOB_CAS_WF_IM_DataStore] JOB[0018443-150810162217214-oozie-oozi-W] ACTION[0018443-150810162217214-oozie-oozi-W@end] [***0018443-150810162217214-oozie-oozi-W@end***]Action updated in DB!

How do I extract the date and time and then calculate the difference.

0 Karma
1 Solution

sundareshr
Legend

Try something like this (untested code, assuming there are only two events per JOB)

... | rex "JOB\[?(?<jobid>[\d-]+)-" | streamstats window=2 current=f range(_time) as diff by jobid | table jobid diff | search diff=*

View solution in original post

jkat54
SplunkTrust
SplunkTrust

In your props.conf:

[sourcetypeName]
SHOULD_LINEMERGE = false
BREAK_ONLY_BEFORE = \d{4}-\d{2}-\d{2}\s\d{2}:\d{2}:\d{2}\,\d{3}
TIME_FORMAT = %Y-%m-%d %H:%M:%S,%3N
MAX_TIMESTAMP_LOOKAHEAD = 24
EXTRACT-myfields = (?<date>\d{4}-\d{2}-\d{2}\s+\d{2}:\d{2}:\d{2},\d{3})\s(?<log_level>\w+)\s(?<unknownfield>\w+):(?<number>\d+)\s-\sSERVER\[(?<servername>\S+)] USER\[(?<username>\S+)]\sGROUP\[(?<groupname>\S+)]\sTOKEN\[]\sAPP\[(?<appname>\S+)]\sJOB\[(?<jobname>\S+)]\sACTION\[\S+@(?<action>.*)]\s\[

Your search:

 index="foo" servername="server.acme.com" action=:start: 
| stats first(_time) AS startTime BY servername 
| append [
   search index="foo" sourcetype="testing" servername="server.acme.com" action=end 
   | stats first(_time) AS endTime BY servername 
] | chart eval(sum(endTime)-sum(startTime)) AS diffTimeInSecs by servername
0 Karma

athorat
Communicator

thanks for the reply @jka154 @sundareshr

I am using the following query to extract job name and Action=:start: or Action=@end

index=aap_prod sourcetype="HDP:PROD:OOZIE"  ("@:start:" OR "@end")  | rex "TOKEN\[\] APP\[(?[^\]]*)"    | rex "ACTION\[[^\@]*(?[^\d\]]*)"

If I append rex "JOB\[?(?[\d-]+)-" | streamstats window=2 current=f range(_time) as diff by jobid | table jobid diff

I get the table but there are jobid's with the same id with value = 0 and then there is one value with the actual duration.
How do we filter only the values with actual duration.
tried using | stats values(jobid) which does not work.

Thanks for looking into this.

Anil.

0 Karma

jkat54
SplunkTrust
SplunkTrust
index=aap_prod sourcetype="HDP:PROD:OOZIE" ("@:start:" OR "@end") 
 | rex field=_raw "(?<date>\d{4}-\d{2}-\d{2}\s+\d{2}:\d{2}:\d{2},\d{3})\s(?<log_level>\w+)\s(?<unknownfield>\w+):(?<number>\d+)\s-\sSERVER\[(?<servername>\S+)] USER\[(?<username>\S+)]\sGROUP\[(?<groupname>\S+)]\sTOKEN\[]\sAPP\[(?<appname>\S+)]\sJOB\[(?<jobname>\S+)]\sACTION\[\S+@(?<action>.*)]\s\["
  | search action=:start:
  | stats first(_time) AS startTime by servername
  | append [ 
   search index=aap_prod sourcetype="HDP:PROD:OOZIE" ("@:start:" OR "@end") 
   | rex field=_raw "(?<date>\d{4}-\d{2}-\d{2}\s+\d{2}:\d{2}:\d{2},\d{3})\s(?<log_level>\w+)\s(?<unknownfield>\w+):(?<number>\d+)\s-\sSERVER\[(?<servername>\S+)] USER\[(?<username>\S+)]\sGROUP\[(?<groupname>\S+)]\sTOKEN\[]\sAPP\[(?<appname>\S+)]\sJOB\[(?<jobname>\S+)]\sACTION\[\S+@(?<action>.*)]\s\["
   | stats first(_time) AS endTime by servername
  ]
  | chart eval(sum(endTime)-sum(startTime)) AS diffTimeInSecs by servername
0 Karma

sundareshr
Legend

Updated my answer above to show how you can filter out where diff is not calculated.

0 Karma

athorat
Communicator

@sundareshr

I tired | search diff!=0.0 and it worked.
How can I add the Start and the end dates, can you throw some light on it ?

Thanks,

0 Karma

sundareshr
Legend

Try something like this (untested code, assuming there are only two events per JOB)

... | rex "JOB\[?(?<jobid>[\d-]+)-" | streamstats window=2 current=f range(_time) as diff by jobid | table jobid diff | search diff=*

jkat54
SplunkTrust
SplunkTrust

What happens if there are more than 2 events per job?

I like how you simplified by jobid and used streamstats... pretty cool.

0 Karma

athorat
Communicator

@sundareshr

I tired | search diff!=0.0 and it worked.
How can I add the Start and the end dates, can you throw some light on it ?

Thanks,

0 Karma

jkat54
SplunkTrust
SplunkTrust

My search works fine with the sample data you provided. It's not as efficient as sundareshr's search but i believe it handles more scenarios / possibilities. If you want to still see start/end, and diff, and servername...

Edit my chart command at the end of my search on the bottom of my reply

...| chart startTime endTime eval(sum(endTime)-sum(startTime)) AS diffTimeInSecs by servername

0 Karma

jkat54
SplunkTrust
SplunkTrust

only issue is the time's will be in epoch...

0 Karma

sundareshr
Legend

Use this

.... | streamstats current=f window=2 range(_time) as diff latest(_time) as end earliest(_time) as start| table jobid, diff, start, end | eval start=strftime(start, "%c") | search diff=*
0 Karma

athorat
Communicator

@sundareshr is there a way we can we convert the range-> diff to seconds or minutes?

0 Karma

sundareshr
Legend
| eval diff=tostring(diff, "duration")
0 Karma
Get Updates on the Splunk Community!

Expert Tips from Splunk Professional Services, Ensuring Compliance, and More New ...

Splunk Lantern is a Splunk customer success center that provides advice from Splunk experts on valuable data ...

Observability Release Update: AI Assistant, AppD + Observability Cloud Integrations & ...

This month’s releases across the Splunk Observability portfolio deliver earlier detection and faster ...

Stay Connected: Your Guide to February Tech Talks, Office Hours, and Webinars!

&#x1f48c;Keep the new year’s momentum going with our February lineup of Community Office Hours, Tech Talks, ...