How do you run a second search based on the value/results of a first search?

My goal for this search is to find out whether a file was not imported. If the file is not imported, the text "Could not find a file in the" will be present. If the file is imported, the text "Moved" will be present. Our system tries to import the file repeatedly. So at 07:30:00 the file may be imported, and "Moved" will appear in the log file. At 08:00:00 the system will try to import the file again, but since there is nothing to import (because it was imported at 07:30:00), "Could not find a file in the" will be written to the log. I am trying to write a search such that, if "Moved" is present, I do not even want to know the count of "Could not find a file in the". However, if "Moved" is never found in the log file during the specified time frame, then I do want to run a search and find all instances of "Could not find a file in the". In short, I am seeking help writing a search that does not produce false positives.

Below is the search I have put together. I attempted to search for the expected files and get a count. Then, in the second query, if the count was less than expected (so I did not find the expected number of imported files), I would look for "Could not find a file in the", i.e. the files that were not found. This search is not working, as I have found that I cannot pass a count's value into a sub-search/second query. So "FilesImported" from the first query does not have the value I expect in the second query's "WHERE" clause. I have shortened the query to just two customers (instead of all 20+), as each customer has different files that are to be imported at different times on different days.

source=*D:\\redacted\\redacted* source=*IH_Daily\\redacted* Moved earliest=-48h@h
| eval dayBuffer=strftime(now(), "%d") | eval day=ltrim(tostring(dayBuffer),"0") 
| eval todayBuffer=strftime(now(), "%m_"+day+"_%Y") | eval today=ltrim(tostring(todayBuffer),"0") | where like(source,"%10_19_2017%")
| eval time=strftime(round(strptime(file_Time, "%I:%M:%S %P")), "%H:%M:%S")
| eval dow=strftime(strptime(file_Date, "%m/%d/%Y"), "%A")
| rex field=source "redacted\\\+(?<ClientID>[^\\\]+)"
| where ClientID="$clientID$"
| where (like(source,"%"."client1"."%") AND (dow!="Sunday" AND dow!="Monday") AND (time>"07:27:00" AND time<"08:27:00") AND (FileImported="file1")
OR (like(source,"%"."client2"."%")) AND (FilesImported!=2) AND (dow!="Sunday" AND dow!="Tuesday") AND (time>"09:00:00"AND time<"11:30:00") AND ((FileImported="file1") OR (FileImported="file2") OR (FileImported="file3"))))
| stats count as FilesImported 

| append [ search source=*D:\\redacted\\redacted* source=*IH_Daily\\redacted* ("Could not find a file in the") earliest=-48h@h
| eval dayBuffer=strftime(now(), "%d") | eval day=ltrim(tostring(dayBuffer),"0") 
| eval todayBuffer=strftime(now(), "%m_"+day+"_%Y") | eval today=ltrim(tostring(todayBuffer),"0") | where like(source,"%10_19_2017%")
| eval time=strftime(round(strptime(file_Time, "%I:%M:%S %P")), "%H:%M:%S")
| eval dow=strftime(strptime(file_Date, "%m/%d/%Y"), "%A")
| rex field=source "redacted\\\+(?<ClientID>[^\\\]+)"
| where ClientID="$clientID$"
| where ((like(source,"%"."client1"."%")) AND (FilesImported<1) AND (dow!="Sunday" AND dow!="Monday") AND (time>"07:27:00"AND time<"08:27:00") AND (file_Missing="file1")
OR (like(source,"%"."client2"."%")) AND (FilesImported!<3) AND (dow!="Sunday" AND dow!="Tuesday") AND (time>"09:00:00"AND time<"11:30:00") AND ((file_Missing="file1") OR (file_Missing="file2") OR (file_Missing="file3"))) 
| stats count as "File Missed" ]
|table "File Missed"
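
For reference, one way to make the second count depend on the first is to put both counts on the same result row with appendcols and filter afterwards, since a subsearch cannot read fields computed by the outer search. This is only a minimal sketch: index=something and source=yoursource are placeholders, FileMissed is a hypothetical field name, and the per-client day and time filters are left out.

```spl
index=something source=yoursource "Moved" earliest=-48h@h
| stats count AS FilesImported
| appendcols
    [ search index=something source=yoursource "Could not find a file in the" earliest=-48h@h
      | stats count AS FileMissed ]
| where FilesImported < 1
| table FileMissed
```

Both stats commands return exactly one row, so appendcols lines them up side by side. The where clause then drops that row whenever at least one "Moved" event was found, so FileMissed is shown only when nothing was imported.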
Re: How do you run a second search based on the value/results of a first search?

I'm having trouble parsing the sample code you posted, and I'm not sure how all the dates and such are being used. But at a high level, does this describe your situation?
- Each event (either "Moved" or "Could not find...") contains a file name
- You need to locate the file names of files that have "Could not find..." events but no "Moved" events

Is that correct? Or maybe there is a ClientID that also needs to be taken into account?

Re: How do you run a second search based on the value/results of a first search?

Sorry. The dates and times in the first several lines are pulled from the log file and reformatted for searching. Their only purpose is to select certain log files, so you can remove them if that makes parsing easier. I can add them back as needed, since they do not pertain to the question at hand.

  • Each event (either "Moved" or "Could not find...") contains a file name:
    Yes. The event contains "Moved file1" and "Could not find a file in the file1 path"

  • You need to locate the file names of files that have "Could not find..." events but no "Moved" events:
    Yes, this is correct. Just FYI, the "Moved" event would come before the "Could not find..." event.

  • Is that correct? Or maybe there is a ClientID that also needs to be taken into account?
    This does need to be taken into account as well, but only because each client expects certain files to be imported at different times. In my search example:
    - client1 is expecting ONLY file1 on Tues.-Sat. from 07:27:00-08:27:00.
    - client2 is expecting file1, file2, and file3 on Mon. and Wed.-Sat. from 09:00:00-11:30:00.

There are 20+ other clients with different scenarios that I did not add in order to make the search more readable.
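
The two schedules above could be written as a single eval flag, which keeps the per-client rules in one place. This is only a sketch under assumptions: index=something and source=yoursource are placeholders, in_window is a hypothetical field name, and ClientID, file_name, dow, and time are assumed to have been extracted already (dow as a weekday name, time as a zero-padded 24-hour HH:MM:SS string so that string comparison orders correctly).

```spl
index=something source=yoursource ("Could not find a file in the" OR "Moved")
| eval in_window=case(
    ClientID="client1" AND file_name="file1"
      AND dow!="Sunday" AND dow!="Monday"
      AND time>="07:27:00" AND time<="08:27:00", 1,
    ClientID="client2" AND (file_name="file1" OR file_name="file2" OR file_name="file3")
      AND dow!="Sunday" AND dow!="Tuesday"
      AND time>="09:00:00" AND time<="11:30:00", 1,
    true(), 0)
| where in_window=1
```

With 20+ clients, a case() expression like this gets long, so keeping the schedules in a lookup table may be easier to maintain.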

Re: How do you run a second search based on the value/results of a first search?

If you only need to locate all files (by, for example, file_name) for which there are "Could not find..." events but no "Moved" events, and let's say the event text is in a field called message, this might work:
index=something source=yoursource "Could not find a file" OR "Moved"
| eval moved=if(match(message, "Moved"), 1, null())
| eventstats sum(moved) AS moved BY file_name
| where isnull(moved)

This creates a field called moved and gives it a value of 1 if the event contains "Moved" in the message field. It then sums the "Moved" events for each file_name and applies that sum to every event with the same file_name. Finally, it filters out all events where the moved field contains any value, leaving you with only the events whose file_name has no "Moved" events.
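
If a count per file is more useful than the raw events, the same idea can be expressed with stats instead of eventstats. This is a sketch only: index=something, source=yoursource, and file_name are placeholders, and is_moved, is_missing, moved_count, and missing_count are hypothetical field names. It matches against _raw so that it does not depend on a message field existing.

```spl
index=something source=yoursource ("Could not find a file in the" OR "Moved")
| eval is_moved=if(match(_raw, "Moved"), 1, 0)
| eval is_missing=if(match(_raw, "Could not find a file in the"), 1, 0)
| stats sum(is_moved) AS moved_count sum(is_missing) AS missing_count BY file_name
| where moved_count=0 AND missing_count>0
```

Each remaining row is a file that logged "Could not find a file in the" at least once and "Moved" never, with missing_count giving the number of failed attempts.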

Re: How do you run a second search based on the value/results of a first search?

The problem with this is that a file has potentially missed an import, and is thus an error, only if there is an event that contains "Could not find a file in the", not merely because the "Moved" event hasn't occurred yet.
For example, client1 in the above search expects file1 between 07:27:00 and 08:27:00. If it is 07:30:00 and there is no "Moved" event, that does not mean there was an error importing the file. But if there is a "Could not find a file in the" event with NO "Moved" event, then there was an issue importing an expected file, and I would like a count. If there IS a "Moved" event followed by a "Could not find a file in the" event (this happens simply because of how our system works), I do not want to see a count of the "Could not find a file in the" events.

I do like your logic here, though: finding "Moved" events by file name and spotting where one we expected is missing. Maybe we can apply the same logic to "Could not find a file in the" events by file name too, and compare the two? Just brainstorming here.

Thanks for your comment!

Re: How do you run a second search based on the value/results of a first search?

So the base search here looks only for events that contain either "Could not find a file" or "Moved", meaning the end result will be "Could not find..." events that have no corresponding "Moved" events. It won't list any events for files that:
1. Have no events of either "Could not find..." or "Moved"
2. Have both "Could not find..." and "Moved"
3. Have only "Moved"
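
These cases can be checked without touching real logs by generating a few sample events with makeresults. The sketch below uses made-up data (file1 has both kinds of event, file2 has only "Could not find...", file3 has only "Moved"); the data field and the sample values are purely illustrative.

```spl
| makeresults
| eval data=split("file1|Moved file1;file1|Could not find a file in the file1 path;file2|Could not find a file in the file2 path;file3|Moved file3", ";")
| mvexpand data
| eval file_name=mvindex(split(data, "|"), 0), message=mvindex(split(data, "|"), 1)
| eval moved=if(match(message, "Moved"), 1, null())
| eventstats sum(moved) AS moved BY file_name
| where isnull(moved)
| table file_name message
```

Run as written, this should leave only the file2 row, since file2 is the only file with a "Could not find..." event and no "Moved" event.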

I still want to help, but I'm trying to read your comment and understand what use case this solution is missing. Can you help clarify?

Re: How do you run a second search based on the value/results of a first search?

The main use case is:
I want to see events that have a "Could not find a file in the" with no corresponding "Moved" event.

The three cases you listed above all work for me; they achieve my main use case, just in a different way.