Hello,
I have 6 reports that have to run in a specific order to get the results I need for the 6th report which is emailed out. (It's very convoluted but with our data structure, it's the only thing I could get working). The different automated report/searches use outputlookup and inputlookup with the previous reports. If one does not run in sequence this causes the rest to not pull the correct data.
I've noticed that inconsistently, some of the scheduled reports will run (per Job Inspector) but show 0 events. When I run the searches manually, there are events.
One particular search that isn't populating when scheduled (but is if I run manually) is #3 which is
|set diff [inputlookup PlatinumYesterday.csv |table account_number] [inputlookup PlatinumToday.csv |table account_number]
|outputlookup PlatinumDiff.csv
The two inputs it's pulling were populated in search 1 + 2 and these did run successfully with results on schedule.
I've played around with the timing and priority. I have them on a cron scheduled, running 10 minutes apart in an "Auto" window with priority set to "Highest". Each search is taking < 4 seconds so there shouldn't be an issue with timing. Also the time range of the reports are the last 24 hours.
Any ideas?
Okay, my guess is you are working too hard. In my opinion, set diff
is seldom the right way to go about a comparison anyway.
Your third search being empty indicates that the two prior files were identical at the moment that the third job ran. That would probably mean that the new Today file was not yet complete generating... except you seem to have eliminated that as a possibility.
Let's rethink your design. With set diff
, you don't know whether an account_number
might be new, or might have disappeared. Let's try that a different way:
Your search that produces the ALL records that would be in Yesterday or Today,
with at a minimum, _time and account_number
| bin _time as Day span=1d
| eventstats min(Day) as Yesterday max(Day) as Today
| stats values(Day) as Day values(Yesterday) as Yesterday values(Today) as Today by account_number
Given the above, every account_number
present either day is represented, and they either have two different Day values, or they have a single one. If it is a single one, then it is either Yesterday (meaning it disappeared today) or Today (meaning it is new today). Or perhaps we ended up with only one Day or more than two, we'll have to check.
| where mvcount(Day) = 1
| eval flag=case(Today=Yesterday, "ERROR - Only a single Day found",
Today>Yesterday+86400, "ERROR - More than two days found,
Day=Today, "New",
Day=Yesterday, "Missing,
true(), "ERROR - Mental failure, seek counseling")
You can remove the second test if the job will not run over weekends, for example.
Since you didn't say what functions the next three reports performed, I could not pseudocode them for you. I suspect that they process the set diff
results against the two data sets and extract information. If that is the case, then collapse the above as follows:
Your search that produces the all records that would be in Yesterday or Today
with at a minimum, _time and account_number
| bin _time as Day span=1d
| eventstats min(Day) as Yesterday max(Day) as Today
| eventstats values(Day) as Day by account_number
| where mvcount(Day) != 2
| eval flag=case(Today=Yesterday, "ERROR - Only a single Day found",
Today>Yesterday+86400, "ERROR - More than two days found,
Day=Today, "New",
Day=Yesterday, "Missing,
true(), "ERROR - Mental failure, seek counseling")
Using eventstats
gives you all the original information on the event, plus the additional info that has been copied over. You now have all the records that would have been in reports 4 and 5. Just format them for your user, and you are off to the races.
Okay, my guess is you are working too hard. In my opinion, set diff
is seldom the right way to go about a comparison anyway.
Your third search being empty indicates that the two prior files were identical at the moment that the third job ran. That would probably mean that the new Today file was not yet complete generating... except you seem to have eliminated that as a possibility.
Let's rethink your design. With set diff
, you don't know whether an account_number
might be new, or might have disappeared. Let's try that a different way:
Your search that produces the ALL records that would be in Yesterday or Today,
with at a minimum, _time and account_number
| bin _time as Day span=1d
| eventstats min(Day) as Yesterday max(Day) as Today
| stats values(Day) as Day values(Yesterday) as Yesterday values(Today) as Today by account_number
Given the above, every account_number
present either day is represented, and they either have two different Day values, or they have a single one. If it is a single one, then it is either Yesterday (meaning it disappeared today) or Today (meaning it is new today). Or perhaps we ended up with only one Day or more than two, we'll have to check.
| where mvcount(Day) = 1
| eval flag=case(Today=Yesterday, "ERROR - Only a single Day found",
Today>Yesterday+86400, "ERROR - More than two days found,
Day=Today, "New",
Day=Yesterday, "Missing,
true(), "ERROR - Mental failure, seek counseling")
You can remove the second test if the job will not run over weekends, for example.
Since you didn't say what functions the next three reports performed, I could not pseudocode them for you. I suspect that they process the set diff
results against the two data sets and extract information. If that is the case, then collapse the above as follows:
Your search that produces the all records that would be in Yesterday or Today
with at a minimum, _time and account_number
| bin _time as Day span=1d
| eventstats min(Day) as Yesterday max(Day) as Today
| eventstats values(Day) as Day by account_number
| where mvcount(Day) != 2
| eval flag=case(Today=Yesterday, "ERROR - Only a single Day found",
Today>Yesterday+86400, "ERROR - More than two days found,
Day=Today, "New",
Day=Yesterday, "Missing,
true(), "ERROR - Mental failure, seek counseling")
Using eventstats
gives you all the original information on the event, plus the additional info that has been copied over. You now have all the records that would have been in reports 4 and 5. Just format them for your user, and you are off to the races.
Yes definitely working too hard. You are a miracle worker. I played around with eventstats earlier on and couldn't get it to work for me. Added a few missing quotes and boom, 6 searches condensed into one!
The purpose of this report is to show accounts that were present at indexing yesterday but deleted over the course of the day and therefore not present today at time of indexing (once deleted they are gone from the database). The subsequent searches after finding the difference between yesterday and today were meant to decipher whether they were new or missing and then the last search just populates the list of those that were deleted. Your search did all of that.
Final search in case curious:
index="clients" clientType="Demo Account" earliest=-1d@d latest=now
| bin _time as Day span=1d
| eventstats min(Day) as Yesterday max(Day) as Today
| stats values(Day) as Day values(Yesterday) as Yesterday values(Today) as Today by account_number, name
| where mvcount(Day) = 1
| eval Flag=case(Today=Yesterday, "ERROR - Only a single Day found",
Today>Yesterday+86400, "ERROR - More than two days found",
Day=Today, "New",
Day=Yesterday, "Missing",
true(), "ERROR - Mental failure, seek counseling")
| where Flag = "Missing"
| fields - "Yesterday" "Today" "Day"
| rename name as "Location Name", account_number as "Account Number"
As those who are getting the results only need to know when there are "Missing" results I've added an addition |where condition and just setup an alert instead of a report.
Thank you so much for your help! 😄
-Megan
@meganarellano - Thank you so much for presenting your final solution. It's good for the community to be able to see exactly what works! Dal
By the way, most places, I probably wouldn't put "seek counseling" in production, even if I expect it never to be reached. Depends on the sense of humor of your colleagues, though. 😉