The below query is the original one, it was working fine in the lower environments, once moved to the production with the over head, i m facing very bad performance and jobs are queuing.
I am trying to optimize the query to use join instead of OR between 2 different indexes but it is not extracting the messages i m looking for .
My original Query
(index=A source=test message="PUBLISH message recieved" ) OR (index=B sourcetype=test2 "Message Successfully Processed to x")
| eval Vin=if(isNull(clientId),Vin,clientId)
| eval activityId=if(isNull(activityId),ActivityID,activityId)
| eval Entry= if(match(message,"PUBLISH message recieved"),_time,NULL)
| eval Exit=if(match(Message,"Message Successfully Processed to x"),_time,NULL)
|stats min(Entry) as Entry,max(Exit) as Exit, values(Vin) as Vin, values(deviceid) as ESN by activityId
| rename activityId as traceid
| fillnull value="NULL" | where Exit!="NULL"
|eval duration=Exit-Entry | eval Durations= if(duration<0,0,duration)
| convert ctime(Entry) |convert ctime(Exit) |table traceid, Entry,Exit,Vin,ESN,Durations | rename traceid as "Trace Id"
The one i m trying to build with the join command
(index=A source=test message="PUBLISH message recieved" )
|join activityId [search index=B sourcetype="test2 " "Message Successfully Processed to x"
| eval Vin=if(isNull(clientId),Vin,clientId)
| eval activityId=if(isNull(activityId),ActivityID,activityId) ]
| eval Entry= if(match(message,"PUBLISH message recieved"),_time,NULL)
| eval Exit=if(match(Message,"Message Successfully Processed to x"),_time,NULL)
|stats min(Entry) as Entry,max(Exit) as Exit, values(Vin) as Vin, values(deviceid) as ESN by activityId
| rename activityId as traceid
| fillnull value="NULL" | where Exit!="NULL"
|eval duration=Exit-Entry | eval Durations= if(duration<0,0,duration)
| convert ctime(Entry) |convert ctime(Exit) |table traceid, Entry,Exit,Vin,ESN,Durations | rename traceid as "Trace Id"
Thanks and appreciate the support!
join
is almost always slower than a direct combined search. The way you've done it, it should be much slower, except that as an inner join, it will drop all records which do not have members on both sides of the join.
Just to verify - over the time range that you normally run this report, are duplicate traceids / activityids ever expected?
Here's a version of your original with some slight performance enhancements. You should look at this, and incorporate most of these changes into your join if you proceed that way. For instance, add the fields
command at the top, and use the index=
rather than the match
code to test which record you have. An index=
command will be able to reject the test at the first non-matching character, whereas match will have to scan the entire message field.
(index=A source=test message="PUBLISH message recieved" ) OR
(index=B sourcetype=test2 "Message Successfully Processed to x")
| fields index, Vin, clientId, activityId, ActivityID, deviceid
| eval Vin=coalesce(clientId,Vin)
| eval traceid=coalesce(activityId,ActivityID)
| eval Entry = if(index="A",_time,null())
| eval Exit = if(index="B",_time,null())
| stats min(Entry) as Entry, max(Exit) as Exit, values(Vin) as Vin, values(deviceid) as ESN by traceid
| where isnotnull(Exit)
| eval Durations= if(Exit<Entry,0,Exit-Entry)
| convert ctime(Entry)
| convert ctime(Exit)
| table traceid, Entry, Exit, Vin, ESN, Durations
| rename traceid as "Trace Id"
join
is almost always slower than a direct combined search. The way you've done it, it should be much slower, except that as an inner join, it will drop all records which do not have members on both sides of the join.
Just to verify - over the time range that you normally run this report, are duplicate traceids / activityids ever expected?
Here's a version of your original with some slight performance enhancements. You should look at this, and incorporate most of these changes into your join if you proceed that way. For instance, add the fields
command at the top, and use the index=
rather than the match
code to test which record you have. An index=
command will be able to reject the test at the first non-matching character, whereas match will have to scan the entire message field.
(index=A source=test message="PUBLISH message recieved" ) OR
(index=B sourcetype=test2 "Message Successfully Processed to x")
| fields index, Vin, clientId, activityId, ActivityID, deviceid
| eval Vin=coalesce(clientId,Vin)
| eval traceid=coalesce(activityId,ActivityID)
| eval Entry = if(index="A",_time,null())
| eval Exit = if(index="B",_time,null())
| stats min(Entry) as Entry, max(Exit) as Exit, values(Vin) as Vin, values(deviceid) as ESN by traceid
| where isnotnull(Exit)
| eval Durations= if(Exit<Entry,0,Exit-Entry)
| convert ctime(Entry)
| convert ctime(Exit)
| table traceid, Entry, Exit, Vin, ESN, Durations
| rename traceid as "Trace Id"
Thanks DalJeanis so much this was helpful
I have made little changes on the query as well with your performance recommendation, but i want to perform timechart for the 95th perc of the Total Duration which is not working
index=A source=test
message="PUBLISH message recieved" OR logtype="Dog.Exit"
| eval Entry= if(match(message,"PUBLISH message recieved"),_time,NULL)
| eval Exit= if(match(logtype,"Dog.Exit"),_time,NULL)
| eval traceid=coalesce(traceid,activityId) | eval esn=coalesce(esn,deviceid)
|stats min(Entry) as Entry max(Exit) as Exit values(esn) as ESN by traceid|where isnotnull(Exit)
|eval VsdnDurations=if(Exit
@delgendy - your code cut off, and I'm not sure what your question is.
A percentile is only valid across a population, so you probably just need to do something like eventstats
...
| eventstats perc95(Durations) as p95Durations (((by field???)))
But what you want to chart against that, I'm not really clear on. If you want each time period's 95th percentile, then you'll need to bin
the appropriate time field (Entry
, probably) and then run eventstats on that. If you just want a horizontal line for the 95th percentile across the entire time period, that's another method. And if you want a running calculation of the 95th percentile, changing daily, then that's a third one (probably streamstats).
It would probably be most clear to start a new question devoted to what you are trying to achieve there.
By the way, you can probably use =
instead of match()
with the logtype
comparison.
Why use join? It's slow.
cause i want to search first in the index after the search, get the activityId then search with those in the first index .
i think i found the solution already , i just need to correlate between the 2 indexes by | fields message activityId so i can connect the 2 indexes . This worked for me . Thanks !