Getting Data In
Provide Splunk Cloud feedback in this confidential UX survey by June 17
for a chance to win a $200 Amazon gift card!

Summary Index vs Report Acceleration?

tradecraft1914
Explorer

I have DNS logs from both Windows and Unix BIND. What I am trying to do is create a quick way for admins to query 90 days worth of these DNS logs to see if a particular domain name has ever been looked up. My thought was to do search of each index, do a "stats count by domain" and put the results potentially in a summary index. It was suggested that I look at Report Acceleration as well. I'm running into a couple issues.

  1. I cant save my query as accelerated. I do not see any checkbox in version 6.0. Here is the search: index=dnslog | stats count by domain_name

  2. It would be useful for the analysts to have something like a "first seen" date to go with the domains returned in the searches but I cannot figure out a way to do that.

Any help would be greatly appreciated.

1 Solution

ecambra_splunk
Splunk Employee
Splunk Employee

I would highly recommend using report acceleration, it is self repairing. Summary Indexing is entirely reliant on the scheduled search performing as expected.

Generally I only use summary indexing for summarized information that needs to be stored longer than a year.

Report Acceleration is still available in 6.0, and your search does qualify (uses a transforming command, stats). If you go to Settings > Searches & Reports > Your Search, do you not see the check box?

To calculate the first seen date you can use earliest() in your stats command. It would look like this...

| stats count earliest(eval(strftime(_time, "%H:%M:%S"))) AS "first seen" by domain_name

You can modify the time format to display month, date, year and all that other goodness.

View solution in original post

saravanan90
Contributor

+adding one more

Report acceleration saved in - $SPLUNK_HOME/var/lib/splunk/index1/summary whereas
Summary indexing will be saved in summary index($SPLUNK_HOME/var/lib/splunk/summarydb). Temporarily search results are stored in $SPLUNK_HOME/var/spool/splunk/_.stash

woodcock
Esteemed Legend

Splunk Report Acceleration:

  • Increases performance ~2-5x.
  • Automatically preforms backfill.
  • Late-arriving events are of NO concern.
  • Drilldown to raw data still works.
  • Speeds up a search by creating additional TSIDX mappings into the raw data.
  • Is incredibly robust and requires no housekeeping.
  • Is insensitive to TimeZone and the user who enabled the acceleration.
  • Has a minor "always-on" impact to Indexers and consumes a small amount of disk space but no license.
  • Does not require any conversion (just click the checkbox and you are done).
  • Replaces this question:

    Is there *possibly* a match for my data in this bucket?
    

    With this question:

    Is there definitely a match for my data in this bucket?

    Splunk Summary Index:

  • Astronomical increases in performance are possible (but, if you do it wrong, things can actually be slower).

  • Admin must manually preform backfill.

  • Late-arriving events are of GREAT concern

  • Drilldown to raw data no longer works (must do Advanced dashboard programming to force it to switch back to raw data).

  • Is somewhat fragile and requires reactive housekeeping (many things can cause SI data outages).

  • Is sensitive to TimeZone and the user who runs the "Populating Search".

  • Has a minor "every-hour" impact to Indexers and consumes a small amount of disk space but no license.

  • Speeds up searches by replacing huge amounts of raw data with much smaller amounts of aggregate ("summarized") data in the Summary Index.

  • Requires an Admin to reverse-engineer the desired end-usage of the data in order to construct an appropriate "Populating Search" to do this conversion.

  • Requires that all existing KO uses of the raw data be manually converted (e.g. saved searches, macros, dashboard panels) to pull data from the SI instead of raw data.

ecambra_splunk
Splunk Employee
Splunk Employee

I would highly recommend using report acceleration, it is self repairing. Summary Indexing is entirely reliant on the scheduled search performing as expected.

Generally I only use summary indexing for summarized information that needs to be stored longer than a year.

Report Acceleration is still available in 6.0, and your search does qualify (uses a transforming command, stats). If you go to Settings > Searches & Reports > Your Search, do you not see the check box?

To calculate the first seen date you can use earliest() in your stats command. It would look like this...

| stats count earliest(eval(strftime(_time, "%H:%M:%S"))) AS "first seen" by domain_name

You can modify the time format to display month, date, year and all that other goodness.

View solution in original post

ecambra_splunk
Splunk Employee
Splunk Employee
0 Karma
Take the 2021 Splunk Career Survey

Help us learn about how Splunk has
impacted your career by taking the 2021 Splunk Career Survey.

Earn $50 in Amazon cash!