summary index noob question

elenzil
Path Finder

hi all -

apologies for the egregious noob question,
but i'm still not quite getting summary indexing,
and want to make sure i'm getting it right.

say i have a search like this:

log_level=error | timechart span=15m count by host

and i'd like to do it w/ summary indexes so that i can run the search efficiently over say 60 days.

my approach would be to make two new searches:

first, the summary search:
- search name = "summary_errors_by_host".
- search = "log_level=error | sitimechart span=15m count by host".
- start time = "-15m@m" finish time = "now".
- scheduled to run every 15 minutes.
- alert condition = always.
- alert mode = once per search.
- summary indexing = enabled.
- summary index = "summary".
- added fields: "author" = "me".

next, the report search:
- search = "index=summary search_name="summary_errors_by_host" | timechart span=15m count by host".

does this seem right ?

1 Solution

lguinn2
Legend

This is great, but I have a tweak or two for you. First, you don't really need any "additional fields." I generally skip this.

Second, think about your environment. Where do the events come from, and how long do they take to get to Splunk? Imagine that you have a production server writing to a log file, myApp.log, and a Splunk forwarder on that server monitoring the file. When something is written to myApp.log, the forwarder picks up the data and sends it across the network to the Splunk indexer, where it is parsed and written to disk. All of this may take only a few seconds, but it does take time!

So what happens if an event is written to myApp.log at 9:59:59 - will it make it to the index in time to be included in the 9:45 - 10:00 summary? Probably not.

Is this critical? I don't know; it depends on your particular situation. But, to avoid the problem, I would do this in your summarizing (scheduled) search:

start time = -20m@m

finish time = -5m@m

scheduled to run every 15 minutes

This puts a 5-minute delay into the summarization, which should be plenty of time for all the events to arrive and be indexed.
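
If it helps to see the arithmetic, here is a minimal Python sketch of the two time ranges. It is not Splunk code, only a model of the windows. The date, the 3-second indexing delay, and the helper names (`window`, `captured`) are assumptions made up for illustration, and it assumes each scheduled run starts exactly on the quarter hour.

```python
from datetime import datetime, timedelta

def snap_to_minute(t):
    # Mimics the "@m" snap: round down to the start of the minute.
    return t.replace(second=0, microsecond=0)

def window(run_time, earliest_min, latest_min):
    # Search window for a run, given offsets in minutes before run_time.
    # (20, 5) models "-20m@m" to "-5m@m"; (15, 0) models "-15m@m" to "now",
    # assuming the run starts exactly on the minute.
    start = snap_to_minute(run_time - timedelta(minutes=earliest_min))
    end = snap_to_minute(run_time - timedelta(minutes=latest_min))
    return start, end

def captured(event_time, indexed_time, run_times, earliest_min, latest_min):
    # An event is summarized only if some run's window covers its timestamp
    # AND the event is already in the index when that run starts.
    for run in run_times:
        start, end = window(run, earliest_min, latest_min)
        if start <= event_time < end and indexed_time <= run:
            return True
    return False

# Hypothetical schedule: runs at 10:00, 10:15, 10:30, 10:45.
first_run = datetime(2024, 1, 1, 10, 0, 0)
run_times = [first_run + timedelta(minutes=15 * i) for i in range(4)]

# Event written at 9:59:59, assumed to reach the index 3 seconds later.
event_time = datetime(2024, 1, 1, 9, 59, 59)
indexed_time = event_time + timedelta(seconds=3)

missed = captured(event_time, indexed_time, run_times, 15, 0)
caught = captured(event_time, indexed_time, run_times, 20, 5)
print(missed, caught)
```

Under these assumptions the late event falls through the gap with the original range: the 10:00 run covers 9:45 to 10:00 but starts before the event is indexed, and the 10:15 run no longer covers it. With the delayed range, the 10:15 run covers 9:55 to 10:10 and picks it up. Consecutive delayed windows still meet end to start, so nothing is counted twice.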

And this is totally not a n00b question - and besides, this is the place for questions both n00b and expert!

araitz
Splunk Employee

Thanks for the great detail in your question!

elenzil
Path Finder

this has been working great, and together with some work on our indexers our splunk performance has improved a lot. however i've now run into a slightly more complex search i want to summarize (it has a transaction with a longish maxspan) and am running into some issues. i started a new splunkbase question for it here: http://bit.ly/M9Cjij if you have any insight it would be appreciated!

elenzil
Path Finder

thanks again on this. this really got me going.

elenzil
Path Finder

"This puts a 5-minute delay into the summarization"
ah, great tip, thanks.
also thanks for confirming that the basic set-up looks good!
