I'm doing some long-tail analysis. I'm running in Fast Mode, but the query over a 24-hour window is taking a long time.
Please let me know if there is a way to speed this up. I'm not that familiar with tstats, but if it's an option, please let me know.
index=wineventlog sourcetype="WinEventLog" | stats count by EventCode TaskCategory | where count<10 | sort count
You could either add more indexers or use a summary index to spread the search cost across time.
How many events do you have in a 24-hour period? What does your Splunk setup look like? Are your indexer(s) on physical blades or VMs?
Lots of people will virtualize indexers, which can get ugly. Splunk is all about IOPS.
Thank you for the suggestion, yes we are adding more indexers, and looking into summary indexes too
Hmmm. Interesting. If the two fields you care about are extracted at index time, then use tstats. Other than that, it's a matter of learning how to finesse your data.
I don't know your data, but there have to be certain combinations of EventCode and TaskCategory that make up the bulk of your data. If you throw out those common transactions, then the rare transactions should stand out better, and the rest of the calculations should be much faster.
What you could do is, run your search for, say, a fifteen minute period, select all those common combinations that are found to have more than ten examples, and add those to a lookup table.
The second search would format that input lookup table as
(EventCode=X1 AND NOT(TaskCategory=Y11 OR TaskCategory=Y12 OR TaskCategory=Y13)) OR (EventCode=X2 AND NOT(TaskCategory=Y21 OR TaskCategory=Y22 OR TaskCategory=Y23)) OR
and so on. EventCodes that are rare in themselves would probably get a different search (off the top of my head).
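The two-step approach above might look something like this as SPL. This is only a sketch: the lookup name common_combos.csv and the renamed count field are assumptions, not anything from your environment.

```
Step 1 - build the lookup of common combinations over a short window:
index=wineventlog sourcetype="WinEventLog" earliest=-15m
| stats count by EventCode TaskCategory
| where count>10
| outputlookup common_combos.csv

Step 2 - exclude those combinations, then do the long-tail count:
index=wineventlog sourcetype="WinEventLog"
| lookup common_combos.csv EventCode TaskCategory OUTPUT count AS common_count
| where isnull(common_count)
| stats count by EventCode TaskCategory
| where count<10
| sort count
```

The lookup match in step 2 filters the common combinations out before stats runs, which is where the savings would come from.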
Another possibility could be that @packet_hunter is putting all his data into the same index, so he's having to sort through 100M+ events to pick out the sec logs. We need more info on his setup before making a good recommendation
Thank you both for your comments.
Yes in a 24hr period I have about 200 million security events.
Now, dumping the highest-count combinations and looking only for the specific event codes of interest is an option I have considered, but there are times we need to see the totals of all codes within that day. For example, if there are 10,000 4624 or 4625 events from a specific Security_ID in an hour, then there is something worth looking at...
FYI, the index is dedicated but does contain application events, system events, as well as Security events. The majority of events are Security events. I probably would have set this up differently but I have to deal with what is in place.
@DalJeanis, would you mind sharing a tstats example I might use for this scenario? I will concurrently try my hand at the syntax.
I see counts of events with
|tstats count where index=wineventlog by sourcetype
Just having trouble grabbing the EventCode field values....
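For what it's worth, tstats only sees fields that exist in the index-time tsidx data, so a search-time extracted field like EventCode generally won't appear unless it is indexed or comes from an accelerated data model. A hedged sketch of both shapes; the data model name Windows_Events and dataset name Events are placeholders, not real objects in your deployment:

```
If EventCode were an index-time field:
| tstats count where index=wineventlog by sourcetype, EventCode

Against an accelerated data model (names are assumptions):
| tstats count from datamodel=Windows_Events where nodename=Events by Events.EventCode
| rename Events.EventCode AS EventCode
```

If neither is in place, that would explain why EventCode comes back empty from tstats while plain stats sees it fine.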
We have done the following things after doing R&D:
1. Changed the date range from real-time to today.
2. Set the dashboard refresh time to every 5 minutes.
3. Scheduled this search every 5 minutes so the results are saved in the cache.
4. Search query optimization.
5. Auto-restart Splunk daily at 2:00 AM UTC so that memory is released.
6. Set high priority for this dashboard.
7. Set high priority for this scheduled search.
8. Run stats tables first, then start charts.
9. Changed the delimiter handling of the raw data from the old text-file method to a new one, which reduces the time spent converting raw data to fields during the delimiting process.
10. Reduced the number of indexes and source types.
After all this, my dashboards' loading time dropped from 3 minutes to less than 10 seconds.
You need to sit for long hours to implement all 10 steps, but it is worth doing.
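Steps 2 and 3 above (the 5-minute refresh and the scheduled search) can be sketched as a savedsearches.conf fragment. This is only an illustration: the stanza name and the search string are made up, and the dispatch window should match whatever your dashboard actually needs.

```
# savedsearches.conf - run the dashboard's base search every 5 minutes
[wineventlog_longtail_base]
search = index=wineventlog sourcetype="WinEventLog" | stats count by EventCode TaskCategory
enableSched = 1
cron_schedule = */5 * * * *
dispatch.earliest_time = -24h
dispatch.latest_time = now
```

The dashboard panel can then load the scheduled results (for example via loadjob or a saved-search reference) instead of re-running the search on every refresh.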