Splunk Search

Why does the appendcols command generate an incorrect stats count when searching a period of time greater than four hours?

Explorer

The following search produces the expected result when querying the "Last 4 hours" timed period. However, the stats count for Important_Events actually decreases when querying the "Last 24 hours" time period. Can someone please explain why this occurs?

index="\*"
|stats count as Total_Events
|appendcols
      [search (index="windows" source="WinEventLog:Security" EventCode="4624" Logon_Type="2" OR Logon_Type="7" OR Logon_Type="10")
      OR
      (index="windows" source="WinEventLog:Security" EventCode="4672" Account_Name="\*" NOT Account_Name="SYSTEM")
      OR
      (index="network")
      | stats count as Important_Events]
| eval Important_Events_Percentage=(Important_Events/Total_Events)*100
| eval Unimportant_Events=Total_Events-Important_Events
| eval Unimportant_Events_Percentage=100-Important_Events_Percentage
| fields Unimportant_Events, Unimportant_Events_Percentage, Important_Events, Important_Events_Percentage

The crazy thing is that I can run the appended search by itself and return the appropriate result, regardless of which time period I search, but this won't allow me to evaluate the total event count of the parent search:

(index="windows" source="WinEventLog:Security" EventCode="4624" Logon_Type="2" OR Logon_Type="7" OR Logon_Type="10")
OR
(index="windows" source="WinEventLog:Security" EventCode="4672" Account_Name="*" NOT Account_Name="SYSTEM")
OR
(index="network")
| stats count
0 Karma

Revered Legend

Try something like this (no subsearch)

 index="*"
| eval Important=if((index="windows" AND source="WinEventLog:Security" AND EventCode="4624" AND (Logon_Type="2" OR Logon_Type="7" OR Logon_Type="10"))OR (index="windows" AND source="WinEventLog:Security" AND EventCode="4672" AND Account_Name="*" AND NOT Account_Name="SYSTEM") OR   (index="network"),1,0)
  | stats count as Total_Events sum(Important) as Important_Events
 | eval Important_Events_Percentage=(Important_Events/Total_Events)*100
 | eval Unimportant_Events=Total_Events-Important_Events
 | eval Unimportant_Events_Percentage=100-Important_Events_Percentage
 | fields Unimportant_Events, Unimportant_Events_Percentage, Important_Events, Important_Events_Percentage
0 Karma

Explorer

Thank you again for the great response @somesoni2 and a hat tip to @sundareshr as well! This looks like the appropriate solution, but for some inexplicable reason the count for the following two queries differs, even when configured over the same static time period:

1) This search generates a count of around 14,500:

(index="windows" AND source="WinEventLog:Security" AND EventCode="4624" AND (Logon_Type="2" OR Logon_Type="7" OR Logon_Type="10"))
OR
(index="windows" AND source="WinEventLog:Security" AND EventCode="4672" AND Account_Name="*" AND NOT Account_Name="SYSTEM")
OR
(index="network")
| stats count

2) Whereas this search generates a count for Important_Events of around 13,000:

index="*"
| eval Important=if((index="windows" AND source="WinEventLog:Security" AND EventCode="4624" AND (Logon_Type="2" OR Logon_Type="7" OR Logon_Type="10"))
OR
(index="windows" AND source="WinEventLog:Security" AND EventCode="4672" AND Account_Name="*" AND NOT Account_Name="SYSTEM")
OR
(index="network"),1,0)
| stats count as Total_Events sum(Important) as Important_Events
| eval Important_Events_Percentage=(Important_Events/Total_Events)*100
| eval Unimportant_Events=Total_Events-Important_Events
| eval Unimportant_Events_Percentage=100-Important_Events_Percentage
| fields Unimportant_Events, Unimportant_Events_Percentage, Important_Events, Important_Events_Percentage

It's practically the exact same search, yet it renders two different results; any ideas?

Also, this search takes an incredible amount of time to process in comparison to the original appendcols subsearch method. That is to say, when the original subsearch method actually ran successfully.

0 Karma

Revered Legend

How much time your first search is taking? The subsearch auto-finalized after time limit is reached and the results may be incomplete. So check the job inspector.

Explorer

I had previously reviewed the job inspector and noticed that most of my queries throw a warning of "Configuration initialization for ... took longer than expected when dispatching a search..." probably due to hardware limitations, but I had completely missed the INFO flag which shows "[subsearch]: Search auto-finalized after time limit (60 seconds) reached".

Good catch @somesoni2! Do you happen to know of a method for skipping the "auto-finalize" functionality of sub-searches? My main concern is the accuracy of the report going forward.

*EDIT: Specifically, is there something I can append to my search string? I'm not sure I'm entirely comfortable editing the limits.conf file.

0 Karma

Legend

You may want to consider creating a summary index to store the count at some frequency and use the counts from there to compute percentage. Or look at tags for the subset and use that tag to compute count without the subsearch.

index=* | stats count(eval(tag::important)) as Important_Event count as All_Events | ...
0 Karma

Legend

Do you see any errors in the job inspector? Subsearches have limitations, that could affect your final result

State of Splunk Careers

Access the Splunk Careers Report to see real data that shows how Splunk mastery increases your value and job satisfaction.

Find out what your skills are worth!