Solved: Compare The Actual Start Time to The Expect Start ...

tyhopping1 · ‎02-18-2020

I need a query that will compare run statistics from a list of jobs (msg.jobName = RLMMTP*) that run everyday. The statistics that need compared are the Actual Start Time and The Expected Start Time. The Actual Start Time is found fromin splunk logs(timestamp), while Expected Start Time needs to be manually entered into the query (does not come from splunk).

Here is an example of what the output would like:

msg.jobName:     ActualStart:          ExpectedStart:
RLMMTP89           15:31:17               20:30:11       
RLMMTP72           12:34:21               13:22:46

Here is the query I have so far:

sourcetype="cf" source=l_cell msg.jobName = RLMMTP* | spath "msg.status.Code" | search "msg.status.Code"=*| spath "msg.Type" | search "msg.Type"=* | spath "msg.message" | search "msg.message"="RECORD PROCESSED" 
| eval day = strftime(_time, "%d")          
| stats earliest(timestamp) as startTime, latest(timestamp) as endTime count by msg.jobName
| eval startTime=substr(startTime,1,13) 
| eval startTimeD=strftime(startTime/1000, "%H:%M:%S")
 |  eval ExpectedStart="15:31:17"
 |  eval ActualStart= startTimeD
 |  append
     [|  makeresults     
 |  eval ExpectedStart="09:58:24"
 |  eval ActualStart= startTimeD]
 |  append
     [|  makeresults
 |  eval ExpectedStart="17:56:30."
 |  eval ActualStart= startTimeD]

 | table msg.jobName ActualStart ExpectedStart

However, each job has a different Expected Start Time, which this query does not feature. Any help is appreciated

richgalloway · ‎02-18-2020

You may be able to use a lookup file. Create a CSV file (I'll call it starttimes.csv) that looks like this:

jobName,ExpectedStart
RLMMTP89, 20:30:11
RLMMTP72,13:22:46

Then you query looks something like this:

sourcetype="cf" source=l_cell msg.jobName = RLMMTP* | spath "msg.status.Code" | search "msg.status.Code"=*| spath "msg.Type" | search "msg.Type"=* | spath "msg.message" | search "msg.message"="RECORD PROCESSED" 
| eval day = strftime(_time, "%d")          
| stats earliest(timestamp) as startTime, latest(timestamp) as endTime count by msg.jobName
| eval startTime=substr(startTime,1,13) 
| eval ActualStart=strftime(startTime/1000, "%H:%M:%S")
| lookup starttimes.csv msg.jobName as jobName OUTPUT ExpectedStart
| table msg.jobName ActualStart ExpectedStart

---
If this reply helps you, Karma would be appreciated.

View solution in original post

dmarling · ‎02-18-2020

I would suggest you create a lookup file that contains a list of the job names and expected start times. That way you can perform a lookup by jobName and return the expected start time dynamically instead of doing it ad hoc in your query. That will also make it more maintainable in the future as you can simply modify the lookup file to reflect the new time.

If you need to account for the time for those changes, like it was 15:31:17 for jobName RLMMTP89 only from 1/1/2020 to 1/31/2020 you can make your lookup temporal, but you will need to add the timestamp that reflects when the new expected time is supposed to be selected.

Assuming you have do not need to make your lookup temporal, you can create a basic lookup table in Splunk using your sample data:

| makeresults count=1
| fields - _time
| eval data="RLMMTP90 20:30:11,RLMMTP72 13:22:46"
| makemv data delim=","
| mvexpand data
| rex field=data "(?<msgjobName>[^\s]+) (?<ExpectedStart>[^\e]+)"
| fields - data
| rename msgjobName as "msg.jobName"
| outputlookup expectedJobStarts.csv

Now that you have your starter lookup file expectedJobStarts.csv you can put that in your query

sourcetype="cf" source=l_cell msg.jobName = RLMMTP* 
| spath "msg.status.Code" 
| search "msg.status.Code"=* 
| spath "msg.Type" 
| search "msg.Type"=* 
| spath "msg.message" 
| search "msg.message"="RECORD PROCESSED" 
| eval day = strftime(_time, "%d") 
| stats earliest(timestamp) as startTime, latest(timestamp) as endTime count by msg.jobName 
| eval startTime=substr(startTime,1,13) 
| eval startTimeD=strftime(startTime/1000, "%H:%M:%S") 
| lookup expectedJobStarts.csv "msg.jobName" 
| eval ActualStart= startTimeD 
| table msg.jobName ActualStart ExpectedStart

I have a couple additional suggestions with your search after looking at it. I suggest you add an index to your search as that will make your search performance significantly better and reduce load on your indexers memory for that search. It also appears that you are querying a json event which should not require the spath commands you are using. Since I don't know the index your data is under I'll use foo as a place holder. Another thing to consider is when you do a stats by the msg.jobName you may be joining together different jobs that run in that same queried time period if you aren't splitting by another field like the "day" field you created or ideally a transaction id that would hopefully be present in your events.

index=foo sourcetype="cf" source=l_cell msg.jobName = RLMMTP* msg.status.Code=* msg.Type=* "msg.message"="RECORD PROCESSED"
| eval day = strftime(_time, "%d") 
| stats earliest(timestamp) as startTime, latest(timestamp) as endTime count by day msg.jobName 
| eval startTime=substr(startTime,1,13) 
| eval startTimeD=strftime(startTime/1000, "%H:%M:%S") 
| lookup expectedJobStarts.csv "msg.jobName" 
| eval ActualStart= startTimeD 
| table msg.jobName ActualStart ExpectedStart

If this comment/answer was helpful, please up vote it. Thank you.

richgalloway · ‎02-18-2020

You may be able to use a lookup file. Create a CSV file (I'll call it starttimes.csv) that looks like this:

jobName,ExpectedStart
RLMMTP89, 20:30:11
RLMMTP72,13:22:46

Then you query looks something like this:

sourcetype="cf" source=l_cell msg.jobName = RLMMTP* | spath "msg.status.Code" | search "msg.status.Code"=*| spath "msg.Type" | search "msg.Type"=* | spath "msg.message" | search "msg.message"="RECORD PROCESSED" 
| eval day = strftime(_time, "%d")          
| stats earliest(timestamp) as startTime, latest(timestamp) as endTime count by msg.jobName
| eval startTime=substr(startTime,1,13) 
| eval ActualStart=strftime(startTime/1000, "%H:%M:%S")
| lookup starttimes.csv msg.jobName as jobName OUTPUT ExpectedStart
| table msg.jobName ActualStart ExpectedStart

---
If this reply helps you, Karma would be appreciated.

Compare The Actual Start Time to The Expect Start Time

Stay Connected: Your Guide to January Tech Talks, Office Hours, and Webinars!

[Puzzles] Solve, Learn, Repeat: Reprocessing XML into Fixed-Length Events

Data Management Digest – December 2025

Join the Conversation

Compare The Actual Start Time to The Expect Start Time

Stay Connected: Your Guide to January Tech Talks, Office Hours, and Webinars!

[Puzzles] Solve, Learn, Repeat: Reprocessing XML into Fixed-Length Events

Data Management Digest – December 2025