Getting Data In

Compare The Actual Start Time to The Expect Start Time

tyhopping1
Engager

I need a query that will compare run statistics from a list of jobs (msg.jobName = RLMMTP*) that run everyday. The statistics that need compared are the Actual Start Time and The Expected Start Time. The Actual Start Time is found fromin splunk logs(timestamp), while Expected Start Time needs to be manually entered into the query (does not come from splunk).

Here is an example of what the output would like:

msg.jobName:     ActualStart:          ExpectedStart:
RLMMTP89           15:31:17               20:30:11       
RLMMTP72           12:34:21               13:22:46

Here is the query I have so far:

sourcetype="cf" source=l_cell msg.jobName = RLMMTP* | spath "msg.status.Code" | search "msg.status.Code"=*| spath "msg.Type" | search "msg.Type"=* | spath "msg.message" | search "msg.message"="RECORD PROCESSED" 
| eval day = strftime(_time, "%d")          
| stats earliest(timestamp) as startTime, latest(timestamp) as endTime count by msg.jobName
| eval startTime=substr(startTime,1,13) 
| eval startTimeD=strftime(startTime/1000, "%H:%M:%S")
 |  eval ExpectedStart="15:31:17"
 |  eval ActualStart= startTimeD
 |  append
     [|  makeresults     
 |  eval ExpectedStart="09:58:24"
 |  eval ActualStart= startTimeD]
 |  append
     [|  makeresults
 |  eval ExpectedStart="17:56:30."
 |  eval ActualStart= startTimeD]

 | table msg.jobName ActualStart ExpectedStart

However, each job has a different Expected Start Time, which this query does not feature. Any help is appreciated

0 Karma
1 Solution

richgalloway
SplunkTrust
SplunkTrust

You may be able to use a lookup file. Create a CSV file (I'll call it starttimes.csv) that looks like this:

jobName,ExpectedStart
RLMMTP89, 20:30:11
RLMMTP72,13:22:46

Then you query looks something like this:

sourcetype="cf" source=l_cell msg.jobName = RLMMTP* | spath "msg.status.Code" | search "msg.status.Code"=*| spath "msg.Type" | search "msg.Type"=* | spath "msg.message" | search "msg.message"="RECORD PROCESSED" 
| eval day = strftime(_time, "%d")          
| stats earliest(timestamp) as startTime, latest(timestamp) as endTime count by msg.jobName
| eval startTime=substr(startTime,1,13) 
| eval ActualStart=strftime(startTime/1000, "%H:%M:%S")
| lookup starttimes.csv msg.jobName as jobName OUTPUT ExpectedStart
| table msg.jobName ActualStart ExpectedStart
---
If this reply helps you, an upvote would be appreciated.

View solution in original post

0 Karma

dmarling
Builder

I would suggest you create a lookup file that contains a list of the job names and expected start times. That way you can perform a lookup by jobName and return the expected start time dynamically instead of doing it ad hoc in your query. That will also make it more maintainable in the future as you can simply modify the lookup file to reflect the new time.

If you need to account for the time for those changes, like it was 15:31:17 for jobName RLMMTP89 only from 1/1/2020 to 1/31/2020 you can make your lookup temporal, but you will need to add the timestamp that reflects when the new expected time is supposed to be selected.

Assuming you have do not need to make your lookup temporal, you can create a basic lookup table in Splunk using your sample data:

| makeresults count=1
| fields - _time
| eval data="RLMMTP90 20:30:11,RLMMTP72 13:22:46"
| makemv data delim=","
| mvexpand data
| rex field=data "(?<msgjobName>[^\s]+) (?<ExpectedStart>[^\e]+)"
| fields - data
| rename msgjobName as "msg.jobName"
| outputlookup expectedJobStarts.csv

Now that you have your starter lookup file expectedJobStarts.csv you can put that in your query

sourcetype="cf" source=l_cell msg.jobName = RLMMTP* 
| spath "msg.status.Code" 
| search "msg.status.Code"=* 
| spath "msg.Type" 
| search "msg.Type"=* 
| spath "msg.message" 
| search "msg.message"="RECORD PROCESSED" 
| eval day = strftime(_time, "%d") 
| stats earliest(timestamp) as startTime, latest(timestamp) as endTime count by msg.jobName 
| eval startTime=substr(startTime,1,13) 
| eval startTimeD=strftime(startTime/1000, "%H:%M:%S") 
| lookup expectedJobStarts.csv "msg.jobName" 
| eval ActualStart= startTimeD 
| table msg.jobName ActualStart ExpectedStart

I have a couple additional suggestions with your search after looking at it. I suggest you add an index to your search as that will make your search performance significantly better and reduce load on your indexers memory for that search. It also appears that you are querying a json event which should not require the spath commands you are using. Since I don't know the index your data is under I'll use foo as a place holder. Another thing to consider is when you do a stats by the msg.jobName you may be joining together different jobs that run in that same queried time period if you aren't splitting by another field like the "day" field you created or ideally a transaction id that would hopefully be present in your events.

index=foo sourcetype="cf" source=l_cell msg.jobName = RLMMTP* msg.status.Code=* msg.Type=* "msg.message"="RECORD PROCESSED"
| eval day = strftime(_time, "%d") 
| stats earliest(timestamp) as startTime, latest(timestamp) as endTime count by day msg.jobName 
| eval startTime=substr(startTime,1,13) 
| eval startTimeD=strftime(startTime/1000, "%H:%M:%S") 
| lookup expectedJobStarts.csv "msg.jobName" 
| eval ActualStart= startTimeD 
| table msg.jobName ActualStart ExpectedStart
If this comment/answer was helpful, please up vote it. Thank you.
0 Karma

richgalloway
SplunkTrust
SplunkTrust

You may be able to use a lookup file. Create a CSV file (I'll call it starttimes.csv) that looks like this:

jobName,ExpectedStart
RLMMTP89, 20:30:11
RLMMTP72,13:22:46

Then you query looks something like this:

sourcetype="cf" source=l_cell msg.jobName = RLMMTP* | spath "msg.status.Code" | search "msg.status.Code"=*| spath "msg.Type" | search "msg.Type"=* | spath "msg.message" | search "msg.message"="RECORD PROCESSED" 
| eval day = strftime(_time, "%d")          
| stats earliest(timestamp) as startTime, latest(timestamp) as endTime count by msg.jobName
| eval startTime=substr(startTime,1,13) 
| eval ActualStart=strftime(startTime/1000, "%H:%M:%S")
| lookup starttimes.csv msg.jobName as jobName OUTPUT ExpectedStart
| table msg.jobName ActualStart ExpectedStart
---
If this reply helps you, an upvote would be appreciated.
0 Karma
Get Updates on the Splunk Community!

Improve Your Security Posture

Watch NowImprove Your Security PostureCustomers are at the center of everything we do at Splunk and security ...

Maximize the Value from Microsoft Defender with Splunk

 Watch NowJoin Splunk and Sens Consulting for this Security Edition Tech TalkWho should attend:  Security ...

This Week's Community Digest - Splunk Community Happenings [6.27.22]

Get the latest news and updates from the Splunk Community here! News From Splunk Answers ✍️ Splunk Answers is ...