Getting Data In

Why does Splunk selectively ignore duplicate events (not ingest events) from unique sources?

williamcharlton
Path Finder

I'm trying to learn how Splunk works by presenting it with small sets of data and observing the results. The results of my most recent test really surprised me, and I'm not sure what to make of them.

I have a 4-server Splunk scenario:

  1. deployment server
  2. index server
  3. search head server
  4. A deployment client server (w/ a Splunk Universal Forwarder)

I used the deployment server web interface to create a monitor for *.csv files on the deployment client server, using the csv sourcetype. The data is ingested into a single index.

I created 3 CSV files: testdata01.csv, testdata02.csv, and testdata03.csv. Each csv file has a heading row and 30 "event" rows, like this:

"Date","Field1","Field2","Field3","Field4","Field5"
"2019-01-01 00:00:29 ","testData1-86400","testData2-86400","testData3-86400","testData4-86400","testData5-86400"

.
.
.
"2019-01-01 00:00:00 ","testData1-86371","testData2-86371","testData3-86371","testData4-86371","testData5-86371"

For each data row, the Date decrements by one second (:29 down to :00). Likewise, the numeric value that appears in a row's 5 fields decrements by 1 (86400 down to 86371). The three CSV files each have the exact same set of 30 events.

I dropped the three files into the monitored folder and then ran a search from the search head. To my surprise, I see only the 30 events from testdata01.csv. It appears that Splunk ignored the 30 events from testdata02.csv and the 30 events from testdata03.csv.

I expected all 90 events to be ingested because each set of 30 has a unique source.

Why does Splunk selectively ignore events (not ingest events) from multiple CSV files?

1 Solution

gcusello
SplunkTrust

Hi williamcharlton0028,
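Splunk will not reindex the same file, even if it has a different name. By default it recognizes a file by a CRC of its initial bytes (the first 256 by default), not by its file name, so three files with identical content look like one file that has already been indexed.
If you want to index three files with the same content but different names, add the following option to your inputs.conf, in the stanza that monitors the three files: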

crcSalt = <SOURCE>

This tells Splunk to include each file's full source path in the CRC calculation, so files whose names differ from those already indexed are treated as new files and indexed, even when their content is identical.
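
As a minimal sketch, the resulting monitor stanza might look like the one below. The monitored path and the index name are hypothetical placeholders, since the original post does not state them; only the csv sourcetype and the crcSalt setting come from this thread.

```ini
# inputs.conf on the deployment client (Universal Forwarder)
# NOTE: the path and index name below are assumptions for illustration only.
[monitor:///opt/testdata/*.csv]
sourcetype = csv
index = test_index
# Literal string <SOURCE>: adds each file's full path to the CRC calculation,
# so identical content under a different file name counts as a new file.
crcSalt = <SOURCE>
```

One caveat: because the path becomes part of the checksum, a file that is later renamed or rotated is treated as new and indexed again, so this setting suits a test like this one better than rolling log files.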

Bye.
Giuseppe

