Splunk Search

Do I use Last(..) or Latest(..) after match(..), or something else?

sjringo
Contributor

I am performing a search for two events. A start event and a stop event for a specific job Name.

I have ran into an issue where I am getting two start events as a previous days job (2 days ago) started late and started after midnight (yesterday)  and yesterdays job started on time before midnight (yesterday).

I have tried last(startTimeRaw) and latest(startTimeRaw) but not getting the results expected.

index=anIndex sourcetype=aSourceType aJobName ("START of script" OR "COMPLETED OK")

| eval startTimeRaw = if (match(_raw, "START of script"), _time, null())

| eval endTimeRaw = if (match(_raw, "COMPLETED OK"), _time, null())

--------------------------------------------

I am getting this as my results for startTimeRaw in my results:

-2 days: startTimeRaw = 1664770565

-1 day: startTimeRaw = 1664868051 & 1664948397

In the above I am looking to assign startTimeRaw the last value found or matched (1664948397)

Labels (3)
0 Karma
1 Solution

johnhuang
Motivator

This is a good use case for transactions.

Note: the line with| eval job_name="job_name" is a placeholder for the actual job_name. Please modify it accordingly.

 

 

index=anIndex sourcetype=aSourceType aJobName ("START of script" OR "COMPLETED OK") earliest=-8d@d

| eval job_name="<job_name>"

| rex "(?<event_name>(START of script)|(COMPLETED OK))"
| eval event_name=CASE(event_name="START of script", "script_start", event_name="COMPLETED OK", "script_complete") 
| eval event_time=strftime(_time, "%Y-%m-%d %H:%M:%S") | eval event_desc="[".event_time."] - ".event_name
| eval {event_name}_time=_time
| transaction host job_name startswith=(event_name="script_start") endswith=(event_name="script_complete")
| rename duration AS run_time_secs
| eval script_start_time=strftime(script_start_time, "%Y-%m-%d %H:%M:%S")
| eval script_complete_time=strftime(script_complete_time, "%Y-%m-%d %H:%M:%S")
| table _time host job_name script_start_time script_complete_time run_time_secs event_desc

 

 

 

View solution in original post

0 Karma

sjringo
Contributor

What I am doing is looking back 8 days (earliest -8d@d latest now) and calculating the duration of job run's.  Where I am running into an issue is where a job started from one day and the previous day instead of just starting on a unique day.  Here is my whole SPL:

I tried Min(...) & max(...) but it assigned one value for every row, instead of on the one row that has two results.

index=anIndex sourcetype=aSourceType aJobName ("START of script" OR "COMPLETED OK")

| eval startTimeRaw = if (match(_raw, "START of script"), _time, null())

| eval endTimeRaw = if (match(_raw, "COMPLETED OK"), _time, null())

 

| eval eventDate=strftime(_time,"%Y-%m-%d")

 

| stats range(_time) as duration values(startTimeRaw) as startTimeRaw values(endTimeRaw) as endTimeRaw by eventDate, host

| eval durationTime = tostring (duration, "duration")

 

| eval aTime1= startTimeRaw- (strptime(strftime(startTimeRaw,"%Y-%m-%dT%H:%M:%S.%3N")." CDT","%Y-%m-%dT%H:%M:%S.%N%Z") - startTimeRaw)

| eval startDateTimeCDT=strftime(aTime1, "%Y-%m-%d %H:%M:%S")

 

| eval aTime2= endTimeRaw - (strptime(strftime(endTimeRaw,"%Y-%m-%dT%H:%M:%S.%3N")." CDT","%Y-%m-%dT%H:%M:%S.%N%Z") - endTimeRaw)

| eval endDateTimeCDT=strftime(aTime2, "%Y-%m-%d %H:%M:%S")

 

| eval startTimeCDT=strftime(aTime1, "%H:%M:%S")

| eval endTimeCDT=strftime(aTime2, "%H:%M:%S")

| eval host = if (host="aHost1", "aHost1-Name", "aHost2-Name")

| eval "Duration"=durationTime

| eval "End Time (CDT)"=endTimeCDT

| eval "Start Time (CDT)"=startTimeCDT

| eval "Start Date / Time (CDT)"=startDateTimeCDT

| eval "End Date / Time (CDT)"=endDateTimeCDT

| eval "Day Of Week"=strftime(aTime2,"%A")

 

| table duration, startTimeRaw, endTimeRaw, "Host Name", "Day Of Week", "Job Name", "Start Date / Time (CDT)" "End Date / Time (CDT)", "Start Time (CDT)", "End Time (CDT)", "Duration" 

 

I have attached a screenshot of my results.  The line that is causing issues is the one with two startTimeRaw(s)...

0 Karma

johnhuang
Motivator

This is a good use case for transactions.

Note: the line with| eval job_name="job_name" is a placeholder for the actual job_name. Please modify it accordingly.

 

 

index=anIndex sourcetype=aSourceType aJobName ("START of script" OR "COMPLETED OK") earliest=-8d@d

| eval job_name="<job_name>"

| rex "(?<event_name>(START of script)|(COMPLETED OK))"
| eval event_name=CASE(event_name="START of script", "script_start", event_name="COMPLETED OK", "script_complete") 
| eval event_time=strftime(_time, "%Y-%m-%d %H:%M:%S") | eval event_desc="[".event_time."] - ".event_name
| eval {event_name}_time=_time
| transaction host job_name startswith=(event_name="script_start") endswith=(event_name="script_complete")
| rename duration AS run_time_secs
| eval script_start_time=strftime(script_start_time, "%Y-%m-%d %H:%M:%S")
| eval script_complete_time=strftime(script_complete_time, "%Y-%m-%d %H:%M:%S")
| table _time host job_name script_start_time script_complete_time run_time_secs event_desc

 

 

 

0 Karma

sjringo
Contributor

I got around to looking further into this today and works as advertised...

Here is what I ended up with, still need to do a little date formatting but the content is 100%.

index=anIndex sourcetype=aSourceType aJobName ("START of script" OR "COMPLETED OK") earliest=-7d@d

 

| rex "(?<event_name>(START of script)|(COMPLETED OK))"

| eval event_name=CASE(event_name="START of script", "script_start", event_name="COMPLETED OK", "script_complete")

| eval event_time=strftime(_time, "%Y-%m-%d %H:%M:%S")

| eval aSortTime=_time

| eval {event_name}_time=_time

| transaction host job_name startswith=(event_name="script_start") endswith=(event_name="script_complete")

 

| eval "Job Name"="Fake Job Name"

| eval "Host Name"=if (host="HOST1", "Host1-Name", "Host2-Name")

| eval "Start Date/Time"=strftime(script_start_time, "%Y-%m-%d %H:%M:%S")

| eval "Stop Date/Time"=strftime(script_complete_time, "%Y-%m-%d %H:%M:%S")

| eval "Day Of Week"=strftime(script_complete_time, "%A")

| eval "Duration"=tostring(duration, "duration")

 

| sort aSortTime

| table "Host Name", "Day Of Week", "Job Name", "Start Date/Time", "Stop Date/Time", , "Duration"

0 Karma

sjringo
Contributor

Interesting, I ran your SPL with the real index, sourcetype, jobname, etc... and got what appears to be some pretty good results.

I will need to go back and verify the data from your SPL and my SPL, check conversions to CDT, etc... and then try to understand what exactly its doing since I am somewhat new to Splunk and have never worked with transactions before today.

It will take some time to go through validation, etc..  and am still hoping there is a simple one line addition / change to my original SPL to get the latest start time for the one row that is causing me issues.

 

richgalloway
SplunkTrust
SplunkTrust

When selecting the most recent timestamp value,. I find max() works well and is clearer than first()/last().

---
If this reply helps you, Karma would be appreciated.
0 Karma

sjringo
Contributor

I tried this and it looks like its taking the last raw time for every row, not just the one with two ?

index=anIndex sourcetype=aSourceType aJobName ("START of script" OR "COMPLETED OK")

| eval startTimeRaw = if (match(_raw, "START of script"), _time, null())

| eval endTimeRaw = if (match(_raw, "COMPLETED OK"), _time, null())

| eventstats last(startTimeRaw) AS startTimeRaw

Tags (1)
0 Karma

richgalloway
SplunkTrust
SplunkTrust

To get the max value from a multi-value field, install the mvstats app (https://splunkbase.splunk.com/app/5198) and use

| mvstats max startTimeRaw as startTime

 

---
If this reply helps you, Karma would be appreciated.
0 Karma

sjringo
Contributor

Is it possible to get the value or startTimeRaw using an index ?

My thinking is set the index to 0 and then if there are more than 1 value in startTimeRaw set the index to 1 instead of 0, then get the value of startTimeRaw at that index.

Original SPL:

| stats range(_time) as duration values(startTimeRaw) as startTimeRaw values(endTimeRaw) as endTimeRaw by eventDate, host

Pseudo Code:

| eval anIndex =0

Then check the # of values in startTimeRaw and if > 0 set anIndex to the value of startTimeRaw at the last values index. ( | eval startTimeRawUsingIndex = startTimeRaw @ anIndex)

| stats range(_time) as duration values(startTimeRawUsingIndex) as startTimeRaw values(endTimeRaw) as endTimeRaw by eventDate, host

0 Karma

jdunlea
Contributor

Did the transaction search from @johnhuang not provide you what you needed? Looking at that search, it should give you what you need (assuming that two jobs will never overlap each other, even if one of them starts late).

 

If that transaction search is not what you need, can you explain why it did not work, so that we can get a better idea of what you are trying to achieve?

0 Karma

sjringo
Contributor

I am waiting for some new log files (sourcetypes) to be added to my Splunk index. They should be available next week and will look at implementing the solution provided by  @johnhua.

Since I am somewhat new to Splunk it will take me some time to understand what it is doing and then add / change the SPL to fit the formating on the output.

In the meantime I am looking for a solution to my existing SPL, if there is one...

0 Karma

sjringo
Contributor

I checked and for my Splunk environment mvstats is not a valid command.  I have reached out to my Splunk sys admin to see if it can be added.

 

In the meantime anything else you can think of that I can try ?

Tags (1)
0 Karma

jdunlea
Contributor

I am not sure what you mean by "1 day" and "2 days"  in your results, but I think you can do this easily with eventstats or stats. 

 

For example, you could do:

| eventstats min(startTimeRaw) as earliest_seen_startTime by JobName

 

This will assign a new field to each event called "earliest_seen_startTime" which will contain the earliest startTime that was seen for the job in question. 

 

You could also do this with stats, if you wanted to transform  the data. 

 

Can you tell me more about how you want the results to be formatted, so I could possibly work up a solution with stats?

0 Karma
Get Updates on the Splunk Community!

Index This | Why did the turkey cross the road?

November 2025 Edition  Hayyy Splunk Education Enthusiasts and the Eternally Curious!   We’re back with this ...

Enter the Agentic Era with Splunk AI Assistant for SPL 1.4

  &#x1f680; Your data just got a serious AI upgrade — are you ready? Say hello to the Agentic Era with the ...

Feel the Splunk Love: Real Stories from Real Customers

Hello Splunk Community,    What’s the best part of hearing how our customers use Splunk? Easy: the positive ...