What is the best (the most efficient) way of finding last (the most recent) events for certain hosts?
For example, I have a log in which multiple hosts log their AV definition number. I want to compare this with something else so I just need the most recent log per each server.
Currently I can do this by creating transactions per host and then using mvcount and mvindex to extract the most recent value, but that sounds awfully inefficient to me. Is there a better way to do this? (The map command sounds perfect, but I've never been able to get it to work.)
Assuming you have fields extracted, have you tried:
YourSearch | stats first(DefNumber) by host
first() grabs the first value that Splunk finds and, since Splunk returns events in reverse chronological order, that should always be the most recent event in this scenario.
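If your Splunk version has it, latest() keys explicitly on _time rather than on the order events come back, which makes the intent clearer (DefNumber here stands in for whatever your extracted field is called):

YourSearch | stats latest(DefNumber) as DefNumber, latest(_time) as LastSeen by host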
If you just want the events rather than a table of extracted fields (or if you need multiple fields), you can use
YourSearch | dedup host
And if you know how many hosts you have, you might be able to make it finish faster with:
YourSearch | dedup host | head X
where X is the number of hosts you want to see.
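For instance, assuming DefNumber is your extracted field, a quick per-host table of the most recent value and its timestamp might look like:

YourSearch | dedup host | table _time, host, DefNumber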
Very nice! I created a multi-process monitor like this:
index=os sourcetype=ps host=dcagsm*
| eval gsaisrunning=if(match(_raw, "GSA"), "GSA Running", "GSA Not Running")
| eval GSCisrunning=if(match(_raw, "GSC"), "GSC Running", "GSC Not Running")
| eval GSMisrunning=if(match(_raw, "GSM"), "GSM Running", "GSM Not Running")
| eval LHisrunning=if(match(_raw, "LH"), "LH Running", "LH Not Running")
| stats first(gsaisrunning) as GSA,
first(GSCisrunning) as GSC,
first(GSMisrunning) as GSM,
first(LHisrunning) as LH by host
| search "Not Running"
Expanding the timeframe of such a search increases its "cost", i.e. its time to run. Splunk does not stop searching when it finds the most recent event - it keeps going through all of them. That doesn't feel efficient to me.
Why this matters: if the task is "get the value of a field in the last event, no matter when that last event happened" (30 seconds ago? Two years?), one has to search "all time", and that will take a long, long time to complete across a decent-sized dataset. If, on the other hand, stats first could stop searching once it found the last event, that would dramatically decrease the cost.
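One workaround (a sketch, not a built-in optimization) is to constrain the time range up front with the earliest modifier and only widen it if some hosts come back missing:

YourSearch earliest=-24h | stats first(DefNumber) by host

If a host is absent from the results, re-run with earliest=-7d, then -30d, and so on - each pass stays cheap compared to a single all-time search.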
This isn't exactly what you're asking for, but it may be a starting point. You can use dedup to get the most recent "AV Definition" log event, and from there you can use addinfo to add the current time (of the search) to each event. Using these, you can build a search that in effect answers "How long has it been since this system reported its AV definition number?"
AV definition
| addinfo
| dedup 1 host sortby -_time
| eval deltatime=((info_search_time-_time)/3600)
| where deltatime > 24
| table _time, host, deltatime