Splunk Search

Subsearch with stats first not working as expected

robertlabrie
Path Finder

My dataset has multiple events for a single _time. Batches get loaded whenever it's sent by a 3rd party. I have 25 unique sites that gets data sent. I have a query which finds the most recent _time for the data by site_code:

index=tgem sourcetype=tgem | stats first(_time) as _time by site_code

It works fine, and returns the expected value, which is 2014-01-12 for the site_code=BR. So far so good, so now I want all matching records for that _time, so I glue it into a subquery like this:

index=tgem sourcetype=tgem [search index=tgem sourcetype=tgem | stats first(_time) as _time by site_code | return 25 _time] | search site_code=BR

It returns all records for 2014-01-04, which is the earliest _time in my period (month to date). Any ideas? I've been pulling my hair out all day.

My requirement is to do some stats (min, max, avg) of a field in the dataset, by site_code, for wahtever the most recent set of data is for that site_code

Tags (3)
0 Karma

robertlabrie
Path Finder

So in the end two things happened.

  1. _time never worked for me as expected. I used another field in the data which uniquely identified the batch.
  2. I used format to customize the subsearch expansion. The job inspector was king to sorting that out. Seeing how the subsearch expanded.

The format sting I used was

format "(" "(" "AND" ")" "OR" ")"

Thanks to everyone for your replies.

lguinn2
Legend

The first and last functions are relative to the order of events - when you are dealing with time, earliest and latest are usually better. Also, you need to return the site code as well as the time; you don't need the return comand.

index=tgem sourcetype=tgem [search  index=tgem sourcetype=tgem | stats latest(_time) as _time by site_code ]

Use the search job inspector to see how the subsearch values are returned.

martin_mueller
SplunkTrust
SplunkTrust

For debugging subsearches you can take a look at the job inspector. That will show you what the subsearch evaluated to, giving you a chance to narrow the problem to either the subsearch or the outer search.

A different thought, merge the site_code filter into the first search command for potentially greater performance - unrelated to the results though.

0 Karma

somesoni2
Revered Legend

Use max(_time) instead of first(_time)

Get Updates on the Splunk Community!

.conf24 | Registration Open!

Hello, hello! I come bearing good news: Registration for .conf24 is now open!   conf is Splunk’s rad annual ...

ICYMI - Check out the latest releases of Splunk Edge Processor

Splunk is pleased to announce the latest enhancements to Splunk Edge Processor.  HEC Receiver authorization ...

Introducing the 2024 SplunkTrust!

Hello, Splunk Community! We are beyond thrilled to announce our newest group of SplunkTrust members!  The ...