Splunk Search
Highlighted

Subsearch with stats first not working as expected

Path Finder

My dataset has multiple events for a single time. Batches get loaded whenever it's sent by a 3rd party. I have 25 unique sites that gets data sent. I have a query which finds the most recent _time for the data by sitecode:

index=tgem sourcetype=tgem | stats first(_time) as _time by site_code

It works fine, and returns the expected value, which is 2014-01-12 for the site_code=BR. So far so good, so now I want all matching records for that _time, so I glue it into a subquery like this:

index=tgem sourcetype=tgem [search index=tgem sourcetype=tgem | stats first(_time) as _time by site_code | return 25 _time] | search site_code=BR

It returns all records for 2014-01-04, which is the earliest _time in my period (month to date). Any ideas? I've been pulling my hair out all day.

My requirement is to do some stats (min, max, avg) of a field in the dataset, by sitecode, for wahtever the most recent set of data is for that sitecode

Tags (3)
0 Karma
Highlighted

Re: Subsearch with stats first not working as expected

SplunkTrust
SplunkTrust

Use max(time) instead of first(time)

Highlighted

Re: Subsearch with stats first not working as expected

SplunkTrust
SplunkTrust

For debugging subsearches you can take a look at the job inspector. That will show you what the subsearch evaluated to, giving you a chance to narrow the problem to either the subsearch or the outer search.

A different thought, merge the site_code filter into the first search command for potentially greater performance - unrelated to the results though.

0 Karma
Highlighted

Re: Subsearch with stats first not working as expected

Legend

The first and last functions are relative to the order of events - when you are dealing with time, earliest and latest are usually better. Also, you need to return the site code as well as the time; you don't need the return comand.

index=tgem sourcetype=tgem [search  index=tgem sourcetype=tgem | stats latest(_time) as _time by site_code ]

Use the search job inspector to see how the subsearch values are returned.

Highlighted

Re: Subsearch with stats first not working as expected

Path Finder

So in the end two things happened.

  1. _time never worked for me as expected. I used another field in the data which uniquely identified the batch.
  2. I used format to customize the subsearch expansion. The job inspector was king to sorting that out. Seeing how the subsearch expanded.

The format sting I used was

format "(" "(" "AND" ")" "OR" ")"

Thanks to everyone for your replies.