Summary Index Producing Doubled Results

jchensor
Communicator

I've recently created a saved search to store items into a summary index. It's scheduled to run every 5 minutes and searches with the parameters:

Start time: -5m@m
Finish time: now

Thus, each run should pick up the events from the last 5 minutes of the source index and add them to the summary index. Also, just as an FYI, the source index happens to have only 1 entry every 5 minutes, so I expect that one entry to be found every 5 minutes and put into the summary index.

However, what I'm seeing instead is that my entries in the summary index are doubled! Instead of the 1 entry every 5 minutes I'm expecting, I find 2 entries every 5 minutes: the correct entry, just in there twice.

I've even set the saved search to execute every 15 minutes instead, searching the source index with the time range of "-15m@m" to "now". When I do this, the results are still doubled. This rules out the possibility that an entry is doubled because it falls on the edges of two adjacent time ranges.

Also, someone else on my team has run a similar setup, with a saved search running once every hour and grabbing events from a source index. Within that hour, there are many, many, many entries to be found. But in the summary index, every entry is again in there exactly twice.

Has anyone else experienced this problem? Am I setting this up incorrectly? Thanks!

- James
0 Karma

colinj
Path Finder

Has anyone else seen this behavior? I'm having the same problem. Single search running every five minutes. I get double the number of entries in the summary_index.

kristian_kolb
Ultra Champion

The xxx SPLUNK xxx header in the file is actually metadata you can put into any file. See http://docs.splunk.com/Documentation/Splunk/latest/Data/Assignmetadatatoeventsdynamically#Configure_...

/kristian

0 Karma

pheezy
Explorer

I'm having the same problem; no solution yet.

0 Karma

jchensor
Communicator

Obviously, this is not a solution we want to go with, as modifying those files is definitely NOT the right approach. But I figured it would be interesting to note that this change gave us a working setup, which lets you move forward with any other testing you want to do.

0 Karma

jchensor
Communicator

...were modified to look like this:

#[batch://$SPLUNK_HOME/var/spool/splunk]
#move_policy = sinkhole
#crcSalt =

#[batch://$SPLUNK_HOME/var/spool/splunk/...stash_new]
[monitor://$SPLUNK_HOME/var/spool/splunk/...stash_new]
queue = stashparsing
sourcetype = stash_new
#move_policy = sinkhole
crcSalt =

After that change, everything has actually started WORKING properly! Those weird header lines that start with ---SPLUNK--- ... are no longer there, and events are only displaying once instead of being doubled. In other words, everything looks completely accurate.

0 Karma

jchensor
Communicator

I don't recommend this, but we have found an iffy workaround. Perform at own risk. ^_^

My teammate experimented a bit and modified the inputs.conf in the etc/system/default folder (the path you are never supposed to alter). Interestingly, she tried switching the monitoring of the stash files from "batch" to "monitor". So the sections that read:

[batch://$SPLUNK_HOME/var/spool/splunk]
move_policy = sinkhole
crcSalt =

[batch://$SPLUNK_HOME/var/spool/splunk/...stash_new]
queue = stashparsing
sourcetype = stash_new
move_policy = sinkhole
crcSalt =

(continued in next post)

0 Karma

jchensor
Communicator

Also, in the entry above, the random entry is supposed to have three asterisks (*) before and after the word SPLUNK, but when I do that here, it bold faces and italicizes the word. ^_^ So just pretend those dashes are asterisks.

0 Karma

jchensor
Communicator

Latest development:

My teammate checked the dispatch folder to find the actual results, and in the results.csv.gz files, the results are actually not duplicated! Each result is found only once.

Also, we see these random entries in the summary index:

---SPLUNK--- index="summary-data" source="SEARCH-NAME"

The reason this sticks out to us is that, when we ran these searches on our older 4.2.5 Search Heads, these types of events were nowhere to be found.

So for some reason, Splunk is displaying the entries twice. So this appears to be a viewing problem, not an indexing one.

- James
0 Karma

jchensor
Communicator

An interesting development: I currently have 4 Search Heads in my environment. 2 of them are older, running Splunk 4.2.5, and 2 of them are new, running 4.3.3.

I set up the same saved searches and a local index on a 4.2.5 Search Head machine, and I'm NOT getting any duplicate events. However, my teammate saw the duplicated entries on one of the 4.3.3 machines and I saw the duplicated entries on the OTHER 4.3.3 machine. So either there's a bug in 4.3.3 or I did something wrong when I installed Splunk on the two new machines and have a setting set incorrectly.

James

0 Karma

jchensor
Communicator

Actually, for organization, I'm putting the search info here instead:

Adding my saved search here.

index=fooindex sourcetype=somesourcetype FIELD1="Value1" FIELD2="Value2" | stats avg(FIELD3) as FIELD4 by _time, FIELD1

I then have it scheduled to run every 5 minutes with Summary Indexing enabled, writing to a summary index I'll call "summary-data".

Then, when I search to see my results, all I do is run the search "index=summary-data" and see what pops up. And this is where I see each of the results duplicated.
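
A quick way to confirm the duplication, rather than eyeballing the event list, might be to count identical events. This is only a sketch: it assumes the summary index is named "summary-data" as above, and that true duplicates share exactly the same _time and _raw.

```spl
index=summary-data
| stats count by _time, _raw
| where count > 1
```

If every row comes back with a count of 2, each event really is being returned twice by the search; an empty result would suggest the copies differ in some way.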

Hope that helps a little.

James

0 Karma

jchensor
Communicator

I've added that information to the original post. As for the interval part, I'll definitely switch it to "@m" from now on. That's good advice, thanks. I'll let you know if it causes any changes to the results.

- James
0 Karma

Paolo_Prigione
Builder

What are the searches that you run to collect data in the summary index? And which one do you use to check for double events?
On a side note, it would be safer to set the interval of the saved searches to: from: -5m@m to: @m
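
As a sketch of that snapped window, the same range can be written into the search string itself with time modifiers. The index, sourcetype, and field names are the placeholders from the saved search posted in this thread, not real values:

```spl
index=fooindex sourcetype=somesourcetype FIELD1="Value1" FIELD2="Value2" earliest=-5m@m latest=@m
| stats avg(FIELD3) as FIELD4 by _time, FIELD1
```

With both ends snapped to the minute, consecutive runs cover adjacent windows that do not overlap, as long as each run starts within its scheduled minute. With "now" as the end time, a run that starts a few seconds late also picks up the first seconds of the next window.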

0 Karma