I have a tstats query that pulls its data from an accelerated data model. I need to grab only the most recent event per host, with its latest IP value. I cannot dedup in the data model root search itself because I need to keep _time in order to get point-in-time results as well.
Anyway, for the most current point-in-time IP value (right now), dedup is not working as intended. It's showing me the older value.
Query without dedup:
| tstats latest(_time) as _time FROM datamodel="Host_Info" WHERE nodename="hostinfo" hostname=bobs by hostinfo.hostname hostinfo.ip
Results (two values for ip):
hostinfo.hostname | hostinfo.ip | _time |
bobs | 10.10.10.10 | 2021-10-22 19:55:03 |
bobs | 33.33.33.33 | 2021-10-22 21:23:06 |
Query with dedup:
| tstats latest(_time) as _time FROM datamodel="Host_Info" WHERE nodename="hostinfo" hostname=bobs by hostinfo.hostname hostinfo.ip | dedup hostname
Results (older value, not newer):
hostinfo.hostname | hostinfo.ip | _time |
bobs | 10.10.10.10 | 2021-10-22 19:55:03 |
Why isn't dedup working correctly? If I dedup the actual indexed data, before it hits the datamodel, it works fine and shows me the latest hostname and IP.
The dedup documentation (https://docs.splunk.com/Documentation/Splunk/8.2.2/SearchReference/Dedup) says:
"Events returned by dedup are based on search order."
Ok. So I'm left wondering why the data coming back from the accelerated data model is out of order.
I haven't figured out why this is happening, but the current workaround is to add latest(hostinfo.ip) to the tstats query and remove hostinfo.ip from the by clause.
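For reference, a sketch of that workaround. It reuses the data model, node, and field names from the queries above. The output field name ip is my own choice, and the query is untested outside my environment:

```
| tstats latest(hostinfo.ip) as ip latest(_time) as _time FROM datamodel="Host_Info" WHERE nodename="hostinfo" hostname=bobs by hostinfo.hostname
```

With hostinfo.ip out of the by clause, there is only one row per hostname, so no dedup is needed afterwards.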
Not sure why latest() understands the timestamps but dedup doesn't. Maybe dedup works off of something other than _time?
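If the documentation quote is the whole story, dedup simply keeps the first row it sees for each value, and the tstats rows above are not in time order (10.10.10.10 comes back before 33.33.33.33, which looks like ordering by the by fields rather than by _time). Under that assumption, sorting explicitly before dedup should keep the newest row. This is a sketch I have not verified against the data model; note that it dedups on hostinfo.hostname, the field name as tstats returns it:

```
| tstats latest(_time) as _time FROM datamodel="Host_Info" WHERE nodename="hostinfo" hostname=bobs by hostinfo.hostname hostinfo.ip
| sort 0 - _time
| dedup hostinfo.hostname
```

The 0 in sort 0 removes the default result limit, so every row is sorted before dedup picks the first one per host.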