How to Re-Populate a Summary Index?

skoelpin
SplunkTrust

I have a summary index that is populated every hour. The forwarder that feeds it was down for a few days. After I restarted the forwarder, it picked up the log files retroactively, but that data never made it into the summary index. How do I get the data for the period when the forwarder was down from index=endeca into index=endeca-summary?

renjith_nair
Legend

Hi @skoelpin,

You can use fill_summary_index.py to backfill the data.

Detailed instructions are available here: http://docs.splunk.com/Documentation/Splunk/6.0.2/Knowledge/Managesummaryindexgapsandoverlaps

Happy Splunking!

chimell
Motivator

Hi,
Here is a search that uses the collect command; perhaps it will help you:

index=endeca | collect index=endeca-summary

For more information about the collect command, see http://docs.splunk.com/Documentation/Splunk/6.3.3/SearchReference/Collect

skoelpin
SplunkTrust

Thanks, this is exactly what I'm trying to do, but I'm running into a problem.

I'm on my Splunk indexer. I opened PowerShell, navigated to Splunk\bin where the Python script is, and entered the command below. After executing it, all I get back is "Microsoft Windows Version 6.3 2013 Microsoft Corp. All rights reserved".

cmd python fill_summary_index.py -index "endeca-summary"  -et 1449730800 -lt 1452898800  -dedup true -auth admin:xxxxxxx

*Summary index = endeca-summary
*Regular index = endeca
*I used epoch time for -et and -lt. I want to test this by running it for 1 hour in the middle of my missing data to confirm that it works before attempting to backfill the whole month.
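
For reference, here is a minimal Python sketch for checking epoch values before passing them to -et and -lt. It decodes the two values from the command above and builds a one-hour window. The test hour (2015-12-20 12:00 UTC) and the helper names are illustrative assumptions, not taken from the post, and the times are treated as UTC because the poster's time zone is not stated.

```python
from datetime import datetime, timedelta, timezone


def describe(epoch):
    """Render epoch seconds as an ISO 8601 UTC timestamp."""
    return datetime.fromtimestamp(epoch, tz=timezone.utc).isoformat()


def one_hour_window(start):
    """Return (earliest, latest) epoch seconds for the hour beginning at `start`.

    `start` must be a timezone-aware datetime.
    """
    end = start + timedelta(hours=1)
    return int(start.timestamp()), int(end.timestamp())


# Values used for -et and -lt in the command above.
posted_et, posted_lt = 1449730800, 1452898800
print(describe(posted_et), "to", describe(posted_lt))
print("span in hours:", (posted_lt - posted_et) / 3600)

# Hypothetical one-hour test window (assumed date, UTC).
et, lt = one_hour_window(datetime(2015, 12, 20, 12, 0, tzinfo=timezone.utc))
print("-et", et, "-lt", lt)
```

Decoded this way, the two posted values are about five weeks apart rather than one hour, so a one-hour test would need an -lt that is exactly 3600 seconds after -et.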

somesoni2
Revered Legend

The summary index backfill script works by re-running the scheduled search that populates the summary index, simulating its historical executions. The correct syntax (as given in the link in the answer) is below.

splunk.exe cmd python fill_summary_index.py -app AppName -name "SearchName" -et epochEarliest -lt epochLatest -j N -dedup true -auth admin:changeme

Here SearchName is the name of the search populating your summary index, AppName is the app containing that search, epochEarliest and epochLatest are the epoch time range of the search schedule to backfill (not the time range used inside the search), and N is the number of searches to run in parallel.

Because the script needs the search name, it has to be run on the search head where that search is available, enabled, and scheduled.
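
As a rough sketch of how the pieces of that command fit together, the Python helper below assembles the argument list from its parts. The helper name, the app name "search", the saved search name "endeca_hourly_summary", and the epoch values are hypothetical placeholders; only the flags themselves come from the syntax above. It only builds and prints the command; it does not run Splunk.

```python
def build_backfill_args(app, search_name, earliest, latest,
                        max_jobs=1, auth="admin:changeme"):
    """Assemble the fill_summary_index.py command line as a list of arguments."""
    return [
        "splunk", "cmd", "python", "fill_summary_index.py",
        "-app", app,              # app that contains the saved search
        "-name", search_name,     # saved search that populates the summary index
        "-et", str(earliest),     # start of the schedule range to backfill
        "-lt", str(latest),       # end of the schedule range to backfill
        "-j", str(max_jobs),      # number of searches to run in parallel
        "-dedup", "true",         # skip runs that are already summarized
        "-auth", auth,            # username:password
    ]


# Hypothetical names and a one-hour window, for illustration only.
args = build_backfill_args("search", "endeca_hourly_summary",
                           1450612800, 1450616400, max_jobs=2)
print(" ".join(args))
```

Keeping each flag next to its value makes it harder to separate -auth from its credentials, which is an easy mistake when typing the command by hand.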
