Splunk Search

How to edit my search to avoid overlapping events from occurring in timeline?

Communicator

Hi all!

I have something which sends me the START and the STOP of some processes.

I have this search that creates a timeline chart and, if in some processes we have just the START event, assumes that this process must be GOING so it attributes to that process endtime = now (just for the sake of drawing the bar) .

MY BASE SEARCH
| eval StartTime = if(LogType = "START", _time, null)
| eval EndTime = if(LogType = "END", _time, null)
| stats Latest(StartTime) as StartTime Latest(EndTime) as EndTime Latest(Result) as Result by process id
| eval EndTime = if(isnull(EndTime), now(), EndTime)
| eval Status = if(isnull(Result), "GOING", Result)
| eval duration = (EndTime-StartTime) * 1000
| where (duration != "" OR duration >= 0) 
| table StartTime process Status duration
| sort StartTime

Anyway sometimes this process has been killed and does not send the STOP event.
The consequence of this is that a process that has been killed continues to draw this bar and if the same process started again, we have overlapping, like in the image:

alt text

What I would like to do is to exclude those instance of the process which have just a start without having an end but only if another instance of the process is already started.
(Instances of the process are identified in the search by the field "id")

Example:
Event1 -> processA id1 Start
Event2 -> processA id2 Start
Event3 -> processA id2 end
In the previous example, with those events a chart with overlapping (like the image) would come out.
So I would like the search to exclude the First start because there is a second start and no end, which means that the process was killed.

I hope I was clear in my explanation.
Please don't hesitate to ask me any further details.

Thank you

1 Solution

Communicator

I found the solution myself.
This is the code:

 MY BASE SEARCH
| eval StartTime = if(LogType = "START", _time, null)
| eval EndTime = if(LogType = "END", _time, null)
| stats Latest(StartTime) as StartTime Latest(EndTime) as EndTime Latest(Result) as Result by precess id
| join type=left precess [search
  MY BASE SEARCH LogType ="START"
  | stats latest(_time) AS LatestStartTime by precess]
| eval EndTime = if(isnull(EndTime), if(StartTime = LatestStartTime, now(), null() ), EndTime)
| eval Status = if(isnull(Result), "GOING", Result)
| eval duration = (EndTime-StartTime) * 1000 
| where duration >= 0
| table StartTime precess Status duration
| sort StartTime

Thanks to all the people who answered!
:)

View solution in original post

0 Karma

Communicator

I found the solution myself.
This is the code:

 MY BASE SEARCH
| eval StartTime = if(LogType = "START", _time, null)
| eval EndTime = if(LogType = "END", _time, null)
| stats Latest(StartTime) as StartTime Latest(EndTime) as EndTime Latest(Result) as Result by precess id
| join type=left precess [search
  MY BASE SEARCH LogType ="START"
  | stats latest(_time) AS LatestStartTime by precess]
| eval EndTime = if(isnull(EndTime), if(StartTime = LatestStartTime, now(), null() ), EndTime)
| eval Status = if(isnull(Result), "GOING", Result)
| eval duration = (EndTime-StartTime) * 1000 
| where duration >= 0
| table StartTime precess Status duration
| sort StartTime

Thanks to all the people who answered!
:)

View solution in original post

0 Karma

Esteemed Legend

You basically had it, I think. I tweaked a few tiny typos and optimizations but it is essentially your search:

MY BASE SEARCH
| eval StartTime = if(LogType = "START", _time, null())
| eval EndTime = if(LogType = "END", _time, null())
| stats latest(StartTime) AS StartTime latest(EndTime) AS EndTime latest(Result) AS Result BY host process id
| eval EndTime = if(isnull(EndTime), now(), EndTime)
| eval Status = if(isnull(Result), "GOING", Result)
| eval duration = (EndTime-StartTime) * 1000
| search duration >= 0
| table StartTime process Status duration
| sort 0 StartTime
0 Karma

Communicator

It still creates overlapping 😞

0 Karma

Communicator

What is the difference between | search duration >= 0 and |where duration >= 0?

What is the difference between | sort 0 StartTime and | sort StartTime?

0 Karma

Esteemed Legend

Best practices: I use where when there is a field name on the RHS and seaerch when there is a literal. You should get in the habit of using sort 0 always; otherwise it is limited to 1000 results!

Communicator

What is RHS and what is literal?
So 0 menas no limit in the sort?

0 Karma

Esteemed Legend

RHS is Right-Hand-Side. Use where if there is a field name on both sides of the comparison operator; Use search if there is a field name on the LHS and a non-field-name (a literal value) on the RHS. For sort using sort 0 means unlimited.

Communicator

And using search instead of where in that case enhance the performance?

0 Karma

Esteemed Legend

No, this is for clarity and to avoid bugs. If you see my code using a single search test you know that the RHS is a literal and if you see where, you know that I am trying to have the RHS be a field name. The problem is sometimes a RHS thing (if it is written ambiguously) is intended to be a field name but that field name doesn't exists, and in such cases it will be treated as a literal. This fallback can cause unintended results.

Revered Legend

Give this a try. It works best if you don't have too much data to be processed. It's basically taking both type of events separately. Doing so you're able to dedup on START event of a process to keep the last start.

index=yourindex sourcetype=yoursourcetype logType="START"
| dedup process | eval StartTime=_time
| append[search index=yourindex sourcetype=yoursourcetype logType="END" | eval EndTime=_time]
| sort 0 -_time
| stats Latest(StartTime) as StartTime Latest(EndTime) as EndTime Latest(Result) as Result by process id
 | eval EndTime = if(isnull(EndTime), now(), EndTime)
 | eval Status = if(isnull(Result), "GOING", Result)
 | eval duration = (EndTime-StartTime) * 1000
 | where (duration != "" OR duration >= 0) 
 | table StartTime process Status duration
 | sort StartTime
0 Karma

Communicator

Does not work, it keeps overlapping.

0 Karma

Revered Legend

What is the timerange for which you run this query??
Could you also provide your base search to under the conditions to identify START AND STOP events.

0 Karma

Communicator

the base query only contains an index and a source.
The events are jsons with fisically written, between the other things, "START" or "END".

0 Karma

Revered Legend

Time range??? This is important as if it's a daily process, and time range is 2 you'll have 2 instance of that process running.

0 Karma

Communicator

Its possible to have 2 instances running in the same day but never together, so if there is overlapping the fist one is been killed.

0 Karma

Splunk Employee
Splunk Employee

Hi andreafebbo,

Seems the transaction command is a very good candidate for your use case. Please try the following example search, assuming processId is the name of the field based on which to pair up your events:

... | transaction processId startswith="Start" endswith="End" maxevents=2 | where duration > 0 

For detailed information about the transaction command, please refer to documentation:
http://docs.splunk.com/Documentation/Splunk/6.5.1/SearchReference/Transaction

Hope this helps. Thanks!
Hunter

Communicator

I read the documentation and I think it might help me but is not easy to understand deeply how it works.
Could you please try to integrate it in my code so I can understand it better?

Thanks

0 Karma
Don’t Miss Global Splunk
User Groups Week!

Free LIVE events worldwide 2/8-2/12
Connect, learn, and collect rad prizes
and swag!