Splunk Search

How to edit my search to avoid overlapping events from occurring in timeline?

Communicator

Hi all!

I have something which sends me the START and the STOP of some processes.

I have this search that creates a timeline chart and, if in some processes we have just the START event, assumes that this process must be GOING so it attributes to that process endtime = now (just for the sake of drawing the bar) .

MY BASE SEARCH
| eval StartTime = if(LogType = "START", _time, null)
| eval EndTime = if(LogType = "END", _time, null)
| stats Latest(StartTime) as StartTime Latest(EndTime) as EndTime Latest(Result) as Result by process id
| eval EndTime = if(isnull(EndTime), now(), EndTime)
| eval Status = if(isnull(Result), "GOING", Result)
| eval duration = (EndTime-StartTime) * 1000
| where (duration != "" OR duration >= 0) 
| table StartTime process Status duration
| sort StartTime

Anyway sometimes this process has been killed and does not send the STOP event.
The consequence of this is that a process that has been killed continues to draw this bar and if the same process started again, we have overlapping, like in the image:

alt text

What I would like to do is to exclude those instance of the process which have just a start without having an end but only if another instance of the process is already started.
(Instances of the process are identified in the search by the field "id")

Example:
Event1 -> processA id1 Start
Event2 -> processA id2 Start
Event3 -> processA id2 end
In the previous example, with those events a chart with overlapping (like the image) would come out.
So I would like the search to exclude the First start because there is a second start and no end, which means that the process was killed.

I hope I was clear in my explanation.
Please don't hesitate to ask me any further details.

Thank you

1 Solution

Communicator

I found the solution myself.
This is the code:

 MY BASE SEARCH
| eval StartTime = if(LogType = "START", _time, null)
| eval EndTime = if(LogType = "END", _time, null)
| stats Latest(StartTime) as StartTime Latest(EndTime) as EndTime Latest(Result) as Result by precess id
| join type=left precess [search
  MY BASE SEARCH LogType ="START"
  | stats latest(_time) AS LatestStartTime by precess]
| eval EndTime = if(isnull(EndTime), if(StartTime = LatestStartTime, now(), null() ), EndTime)
| eval Status = if(isnull(Result), "GOING", Result)
| eval duration = (EndTime-StartTime) * 1000 
| where duration >= 0
| table StartTime precess Status duration
| sort StartTime

Thanks to all the people who answered!
:)

View solution in original post

0 Karma

Communicator

I found the solution myself.
This is the code:

 MY BASE SEARCH
| eval StartTime = if(LogType = "START", _time, null)
| eval EndTime = if(LogType = "END", _time, null)
| stats Latest(StartTime) as StartTime Latest(EndTime) as EndTime Latest(Result) as Result by precess id
| join type=left precess [search
  MY BASE SEARCH LogType ="START"
  | stats latest(_time) AS LatestStartTime by precess]
| eval EndTime = if(isnull(EndTime), if(StartTime = LatestStartTime, now(), null() ), EndTime)
| eval Status = if(isnull(Result), "GOING", Result)
| eval duration = (EndTime-StartTime) * 1000 
| where duration >= 0
| table StartTime precess Status duration
| sort StartTime

Thanks to all the people who answered!
:)

View solution in original post

0 Karma

Esteemed Legend

You basically had it, I think. I tweaked a few tiny typos and optimizations but it is essentially your search:

MY BASE SEARCH
| eval StartTime = if(LogType = "START", _time, null())
| eval EndTime = if(LogType = "END", _time, null())
| stats latest(StartTime) AS StartTime latest(EndTime) AS EndTime latest(Result) AS Result BY host process id
| eval EndTime = if(isnull(EndTime), now(), EndTime)
| eval Status = if(isnull(Result), "GOING", Result)
| eval duration = (EndTime-StartTime) * 1000
| search duration >= 0
| table StartTime process Status duration
| sort 0 StartTime
0 Karma

Communicator

It still creates overlapping 😞

0 Karma

Communicator

What is the difference between | search duration >= 0 and |where duration >= 0?

What is the difference between | sort 0 StartTime and | sort StartTime?

0 Karma

Esteemed Legend

Best practices: I use where when there is a field name on the RHS and seaerch when there is a literal. You should get in the habit of using sort 0 always; otherwise it is limited to 1000 results!

Communicator

What is RHS and what is literal?
So 0 menas no limit in the sort?

0 Karma

Esteemed Legend

RHS is Right-Hand-Side. Use where if there is a field name on both sides of the comparison operator; Use search if there is a field name on the LHS and a non-field-name (a literal value) on the RHS. For sort using sort 0 means unlimited.

Communicator

And using search instead of where in that case enhance the performance?

0 Karma

Esteemed Legend

No, this is for clarity and to avoid bugs. If you see my code using a single search test you know that the RHS is a literal and if you see where, you know that I am trying to have the RHS be a field name. The problem is sometimes a RHS thing (if it is written ambiguously) is intended to be a field name but that field name doesn't exists, and in such cases it will be treated as a literal. This fallback can cause unintended results.

SplunkTrust
SplunkTrust

Give this a try. It works best if you don't have too much data to be processed. It's basically taking both type of events separately. Doing so you're able to dedup on START event of a process to keep the last start.

index=yourindex sourcetype=yoursourcetype logType="START"
| dedup process | eval StartTime=_time
| append[search index=yourindex sourcetype=yoursourcetype logType="END" | eval EndTime=_time]
| sort 0 -_time
| stats Latest(StartTime) as StartTime Latest(EndTime) as EndTime Latest(Result) as Result by process id
 | eval EndTime = if(isnull(EndTime), now(), EndTime)
 | eval Status = if(isnull(Result), "GOING", Result)
 | eval duration = (EndTime-StartTime) * 1000
 | where (duration != "" OR duration >= 0) 
 | table StartTime process Status duration
 | sort StartTime
0 Karma

Communicator

Does not work, it keeps overlapping.

0 Karma

SplunkTrust
SplunkTrust

What is the timerange for which you run this query??
Could you also provide your base search to under the conditions to identify START AND STOP events.

0 Karma

Communicator

the base query only contains an index and a source.
The events are jsons with fisically written, between the other things, "START" or "END".

0 Karma

SplunkTrust
SplunkTrust

Time range??? This is important as if it's a daily process, and time range is 2 you'll have 2 instance of that process running.

0 Karma

Communicator

Its possible to have 2 instances running in the same day but never together, so if there is overlapping the fist one is been killed.

0 Karma

Splunk Employee
Splunk Employee

Hi andreafebbo,

Seems the transaction command is a very good candidate for your use case. Please try the following example search, assuming processId is the name of the field based on which to pair up your events:

... | transaction processId startswith="Start" endswith="End" maxevents=2 | where duration > 0 

For detailed information about the transaction command, please refer to documentation:
http://docs.splunk.com/Documentation/Splunk/6.5.1/SearchReference/Transaction

Hope this helps. Thanks!
Hunter

Communicator

I read the documentation and I think it might help me but is not easy to understand deeply how it works.
Could you please try to integrate it in my code so I can understand it better?

Thanks

0 Karma
State of Splunk Careers

Access the Splunk Careers Report to see real data that shows how Splunk mastery increases your value and job satisfaction.

Find out what your skills are worth!