Reporting

Why is the scheduled report showing 0 results but running the same search run manually and yielding events?

meganarellano
Explorer

Hello,

I have 6 reports that have to run in a specific order to get the results I need for the 6th report which is emailed out. (It's very convoluted but with our data structure, it's the only thing I could get working). The different automated report/searches use outputlookup and inputlookup with the previous reports. If one does not run in sequence this causes the rest to not pull the correct data.

I've noticed that inconsistently, some of the scheduled reports will run (per Job Inspector) but show 0 events. When I run the searches manually, there are events.

One particular search that isn't populating when scheduled (but is if I run manually) is #3 which is
|set diff [inputlookup PlatinumYesterday.csv |table account_number] [inputlookup PlatinumToday.csv |table account_number]
|outputlookup PlatinumDiff.csv

The two inputs it's pulling were populated in search 1 + 2 and these did run successfully with results on schedule.

I've played around with the timing and priority. I have them on a cron scheduled, running 10 minutes apart in an "Auto" window with priority set to "Highest". Each search is taking < 4 seconds so there shouldn't be an issue with timing. Also the time range of the reports are the last 24 hours.

Any ideas?

1 Solution

DalJeanis
Legend

Okay, my guess is you are working too hard. In my opinion, set diff is seldom the right way to go about a comparison anyway.

Your third search being empty indicates that the two prior files were identical at the moment that the third job ran. That would probably mean that the new Today file was not yet complete generating... except you seem to have eliminated that as a possibility.

Let's rethink your design. With set diff, you don't know whether an account_number might be new, or might have disappeared. Let's try that a different way:

  Your search that produces the ALL records that would be in Yesterday or Today, 
     with at a minimum,  _time and account_number
  | bin _time as Day span=1d
  | eventstats min(Day) as Yesterday max(Day) as Today
  | stats values(Day) as Day values(Yesterday) as Yesterday values(Today) as Today by account_number

Given the above, every account_number present either day is represented, and they either have two different Day values, or they have a single one. If it is a single one, then it is either Yesterday (meaning it disappeared today) or Today (meaning it is new today). Or perhaps we ended up with only one Day or more than two, we'll have to check.

  | where mvcount(Day) = 1
  | eval flag=case(Today=Yesterday, "ERROR - Only a single Day found", 
        Today>Yesterday+86400, "ERROR - More than two days found,
        Day=Today, "New", 
        Day=Yesterday, "Missing, 
        true(), "ERROR - Mental failure, seek counseling")

You can remove the second test if the job will not run over weekends, for example.


Since you didn't say what functions the next three reports performed, I could not pseudocode them for you. I suspect that they process the set diff results against the two data sets and extract information. If that is the case, then collapse the above as follows:

  Your search that produces the all records that would be in Yesterday or Today
     with at a minimum,  _time and account_number
  | bin _time as Day span=1d
  | eventstats min(Day) as Yesterday max(Day) as Today
  | eventstats values(Day) as Day by account_number
  | where mvcount(Day) != 2
  | eval flag=case(Today=Yesterday, "ERROR - Only a single Day found", 
        Today>Yesterday+86400, "ERROR - More than two days found,
        Day=Today, "New", 
        Day=Yesterday, "Missing, 
        true(), "ERROR - Mental failure, seek counseling")

Using eventstats gives you all the original information on the event, plus the additional info that has been copied over. You now have all the records that would have been in reports 4 and 5. Just format them for your user, and you are off to the races.

View solution in original post

DalJeanis
Legend

Okay, my guess is you are working too hard. In my opinion, set diff is seldom the right way to go about a comparison anyway.

Your third search being empty indicates that the two prior files were identical at the moment that the third job ran. That would probably mean that the new Today file was not yet complete generating... except you seem to have eliminated that as a possibility.

Let's rethink your design. With set diff, you don't know whether an account_number might be new, or might have disappeared. Let's try that a different way:

  Your search that produces the ALL records that would be in Yesterday or Today, 
     with at a minimum,  _time and account_number
  | bin _time as Day span=1d
  | eventstats min(Day) as Yesterday max(Day) as Today
  | stats values(Day) as Day values(Yesterday) as Yesterday values(Today) as Today by account_number

Given the above, every account_number present either day is represented, and they either have two different Day values, or they have a single one. If it is a single one, then it is either Yesterday (meaning it disappeared today) or Today (meaning it is new today). Or perhaps we ended up with only one Day or more than two, we'll have to check.

  | where mvcount(Day) = 1
  | eval flag=case(Today=Yesterday, "ERROR - Only a single Day found", 
        Today>Yesterday+86400, "ERROR - More than two days found,
        Day=Today, "New", 
        Day=Yesterday, "Missing, 
        true(), "ERROR - Mental failure, seek counseling")

You can remove the second test if the job will not run over weekends, for example.


Since you didn't say what functions the next three reports performed, I could not pseudocode them for you. I suspect that they process the set diff results against the two data sets and extract information. If that is the case, then collapse the above as follows:

  Your search that produces the all records that would be in Yesterday or Today
     with at a minimum,  _time and account_number
  | bin _time as Day span=1d
  | eventstats min(Day) as Yesterday max(Day) as Today
  | eventstats values(Day) as Day by account_number
  | where mvcount(Day) != 2
  | eval flag=case(Today=Yesterday, "ERROR - Only a single Day found", 
        Today>Yesterday+86400, "ERROR - More than two days found,
        Day=Today, "New", 
        Day=Yesterday, "Missing, 
        true(), "ERROR - Mental failure, seek counseling")

Using eventstats gives you all the original information on the event, plus the additional info that has been copied over. You now have all the records that would have been in reports 4 and 5. Just format them for your user, and you are off to the races.

meganarellano
Explorer

Yes definitely working too hard. You are a miracle worker. I played around with eventstats earlier on and couldn't get it to work for me. Added a few missing quotes and boom, 6 searches condensed into one!

The purpose of this report is to show accounts that were present at indexing yesterday but deleted over the course of the day and therefore not present today at time of indexing (once deleted they are gone from the database). The subsequent searches after finding the difference between yesterday and today were meant to decipher whether they were new or missing and then the last search just populates the list of those that were deleted. Your search did all of that.

Final search in case curious:

index="clients" clientType="Demo Account" earliest=-1d@d latest=now
| bin _time as Day span=1d
| eventstats min(Day) as Yesterday max(Day) as Today
| stats values(Day) as Day values(Yesterday) as Yesterday values(Today) as Today by account_number, name
| where mvcount(Day) = 1
| eval Flag=case(Today=Yesterday, "ERROR - Only a single Day found", 
         Today>Yesterday+86400, "ERROR - More than two days found",
         Day=Today, "New", 
         Day=Yesterday, "Missing", 
         true(), "ERROR - Mental failure, seek counseling")
| where Flag = "Missing"
| fields - "Yesterday" "Today" "Day"
| rename name as "Location Name", account_number as "Account Number"

As those who are getting the results only need to know when there are "Missing" results I've added an addition |where condition and just setup an alert instead of a report.

Thank you so much for your help! 😄

-Megan

DalJeanis
Legend

@meganarellano - Thank you so much for presenting your final solution. It's good for the community to be able to see exactly what works! Dal

By the way, most places, I probably wouldn't put "seek counseling" in production, even if I expect it never to be reached. Depends on the sense of humor of your colleagues, though. 😉

Get Updates on the Splunk Community!

Now Available: Cisco Talos Threat Intelligence Integrations for Splunk Security Cloud ...

At .conf24, we shared that we were in the process of integrating Cisco Talos threat intelligence into Splunk ...

Preparing your Splunk Environment for OpenSSL3

The Splunk platform will transition to OpenSSL version 3 in a future release. Actions are required to prepare ...

Easily Improve Agent Saturation with the Splunk Add-on for OpenTelemetry Collector

Agent Saturation What and Whys In application performance monitoring, saturation is defined as the total load ...