Our auditors asked a question that requires us to know how many records we log per device, per sourcetype, per day.
Running the search every day would be very hard on the search heads and indexers at our volume. It was suggested that we create and populate a summary index, then run searches, and perhaps a dashboard, from there.
This search gives me the output I need.
index=* | eval date=strftime(_time,"%Y-%m-%d") | stats count by host index sourcetype date | table host, index, sourcetype, date, count | sort index, sourcetype
How would I convert this to a summary index so that I get output like the following, which would let me search by server, sourcetype, index, or date?
host       index        sourcetype            date      count
Server001  app1_iis     iis                   5/8/2017  13671
Server001  app2_iis     iis                   5/8/2017  448838
Server001  app3_iis     iis                   5/8/2017  24
Server001  app4_iis     iis                   5/8/2017  35890
Server001  windows      WinRegistry           5/8/2017  2314924
Server001  wineventlog  WinEventLog:Security  5/8/2017  75489
Using Splunk Enterprise 6.5.1
Thank you in advance
Before you start, you can drop the sort from the search, since you can sort the data when you search the summary index.
1) Save this search as a report.
2) Go to Settings > Searches, reports, and alerts, find the search you just saved, and open its configuration.
3) In the list:
a) Set the search, earliest, and latest, and check the "Schedule this search" box. Note that the time window for the search depends on how much data you want to summarize: for daily summaries, use a 1-hour window; for monthly summaries, use a 1-day window and schedule the search daily, and so on.
b) Go to the bottom of the configuration and check "Enable" for summary indexing. By convention, summary indexes are named summary_. Feel free to create your own new index. You can also add fields if other searches will feed the same summary index, and then group results by the field you added (for example: reportedSearch = Server01_applogs).
c) click save
When your search runs at its scheduled time for the first time, the first data set is written to your summary index. Note that if you want to load a large time window initially and smaller windows afterward, set the first search to cover a month or all of your data. However, the time window must be narrowed as soon as that first scheduled run finishes; otherwise you will get duplicate events in your summary index.
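As a rough sketch of the steps above, the scheduled (populating) search might look like the following. It assumes one run per day over the previous full day, so that consecutive runs do not overlap. The rename to src_host, src_index, and src_sourcetype is an assumption made here so the original values cannot collide with the summary events' own host, index, and sourcetype. Those names are hypothetical, and your Splunk version may already store the originals under other field names, so check a few summary events before relying on them.

```
index=* earliest=-1d@d latest=@d
| eval date=strftime(_time,"%Y-%m-%d")
| stats count by host index sourcetype date
| rename host AS src_host, index AS src_index, sourcetype AS src_sourcetype
```

Reports and dashboards would then read from the summary index instead of the raw data. Here summary_device_counts is a placeholder index name, not something defined in this thread:

```
index=summary_device_counts src_host=Server001
| stats sum(count) AS count by src_host src_index src_sourcetype date
| sort src_index, src_sourcetype
```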
hope this helps, documentation is here:
Are you really sure it would kill your environment if you used tstats? This is exactly the kind of thing it excels at.
| tstats count by host index sourcetype _time span=1d | eval date=strftime(_time,"%Y-%m-%d") | table host, index, sourcetype, date, count | sort index, sourcetype
Ugh, could not reply in Edge, had to jump to another browser.
With tstats, I get indexes like _audit that I do not want. Is there any way to exclude those in tstats? (Using index!=_audit did not work.)
Or better, use the tstats query above as the summary index search. Save the search, schedule it to run daily (use a cron expression for a flexible schedule), and select summary indexing as the action (there is a checkbox at the bottom when you view the search from Settings > Searches, reports, and alerts).
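A minimal sketch of what that daily tstats search could look like, assuming it runs once a day (for example with a cron expression such as 0 1 * * *, chosen here only for illustration) and summarizes the previous full day. The filter goes inside the where clause of tstats rather than after the command. Whether this resolves the _audit issue reported above is not confirmed in this thread, so treat it as something to try:

```
| tstats count where index=* NOT index=_audit earliest=-1d@d latest=@d by host index sourcetype _time span=1d
| eval date=strftime(_time,"%Y-%m-%d")
| table host, index, sourcetype, date, count
```

Keeping the time range at exactly one whole day matches the daily schedule, so each run should write one row per host, index, and sourcetype without overlapping the previous run.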
I can try tstats. If I were asked to compare daily, monthly, and year-over-year volumes, would this still be the best way, versus some type of summary? We are averaging 7 billion events a week, at a daily average of 70 GB.