I have a feed coming in from DB connect for a list of groups and a count of members in that group at that time. This query runs every 24 hours and pulls in tomorrow's list of groups and members. Group and member counts can change.
I'm trying to see only the latest (i.e. most recent date/time) records and show the groups and members — in this example, only the Groups and Members from the latest date/time, 2018-11-06. I tried several revisions of
|stats latest ... but I'm just not figuring this one out.
index=sample |table _time Groups Members
_time                    Groups  Members
2018-11-06 12:00:00.000  Group1  21
2018-11-06 12:00:00.000  Group2  24
2018-11-06 12:00:00.000  Group3  10
2018-11-06 12:00:00.000  Group4  18
2018-11-06 12:00:00.000  Group5  20
2018-11-05 12:00:00.000  Group1  19
2018-11-05 12:00:00.000  Group2  20
2018-11-05 12:00:00.000  Group3  10
2018-11-05 12:00:00.000  Group4  15
2018-11-05 12:00:00.000  Group5  22
2018-11-04 12:00:00.000  Group1  20
2018-11-04 12:00:00.000  Group2  20
2018-11-04 12:00:00.000  Group3  10
2018-11-04 12:00:00.000  Group4  20
2018-11-04 12:00:00.000  Group5  20
If the import is every 24h, you can simply set your timepicker to "last 24h", right? That should only give you the events from the latest import?
Or am I completely missing something?
If that were the only use case... true. But there are times when we may need to run this search for the last 7 days, 30 days, etc., and in those cases I have to make sure we're only showing the latest records within the selected timeframe. Plus, we're going to move this import to hourly data pulls to be more in tune with the changes happening. That way, regardless of the time selection, the data will reflect the "most current" for that selection. 😃
If @richgalloway's methods don't get you what you want, you can try something like this, assuming they all receive the same timestamp...
your query that pulls the records | eventstats max(_time) as maxtime | where _time=maxtime
OR like this...
index=sample [index=sample | head 1 | table _time]
The first version uses eventstats to copy the highest _time value onto every record, then throws away all records that don't have that time.
The second method first runs a subsearch to find the first record, which should have the highest _time value, and then returns only its _time field. The actual search then selects only the records with the matching _time.
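To visualize the second method: the subsearch returns only the _time field of the newest event, which Splunk substitutes back into the outer search as a field=value condition. After expansion, the outer search effectively becomes something like the line below (the epoch value is purely illustrative — it corresponds to 2018-11-06 12:00:00 assuming UTC, not to anything pulled from your data):

index=sample _time=1541505600

Note this only works if every record in a given pull shares an identical timestamp; if the timestamps differ by even a second, only the single newest record would match.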
Which of the two is more efficient depends on your data and architecture. In your use case, I'd go with the first version as the more readable and understandable one.
| eventstats max(_time) as maxtime
| where _time = maxtime
seems to have worked like a CHAMP. Luckily the _time value is exactly the same for every record in a data pull, so this works. Waiting for the next few data pulls to verify, but so far so good. Thank you all for the help!
You could try dedup. It selects the events containing the most recent unique values for the given field(s). It will fail, however, if a group goes away: for example, if Group5 disappears on 11-07, dedup will select Groups 1-4 from 11-07 and Group5 from 11-06.
index=sample | dedup Groups |table _time Groups Members
... | stats latest(_time) as _time values(Members) as Members by Groups | ... but it suffers from the same deficiency as dedup.
This may work better. It looks for results with a timestamp newer than 0:00 today.
index=sample | where _time>relative_time(now(), "@d") | table _time Groups Members
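Since you mentioned moving to hourly pulls, the same relative_time pattern can snap to the top of the current hour instead of the current day — this is only a sketch, and it assumes each pull always lands within the hour in which you run the search:

index=sample | where _time>relative_time(now(), "@h") | table _time Groups Members

If a pull is ever delayed past the hour boundary, this would return nothing, which is why the eventstats/max(_time) approach above is the more robust option.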