Getting Data In

Why does Splunk selectively ignore duplicate events (not ingest events) from unique sources?

williamcharlton
Path Finder

I'm trying to learn how Splunk works by presenting it small sets of data and observing the results. The results of my most recent test really surprise me. I'm not sure what to make of them.

I have a 4-server Splunk scenario:

  1. deployment server
  2. index server
  3. search head server
  4. A deployment client server (w/ a Splunk Universal Forwarder)

I used the deployment server web interface to create a monitor for *.csv files on the deployment client server, using the csv sourcetype. The data is ingested into a single index.

I created 3 CSV files: testdata01.csv, testdata02.csv, and testdata03.csv. Each csv file has a heading row and 30 "event" rows, like this:

"Date","Field1","Field2","Field3","Field4","Field5"
"2019-01-01 00:00:29 ","testData1-86400","testData2-86400","testData3-86400","testData4-86400","testData5-86400"

.
.
.
"2019-01-01 00:00:00 ","testData1-86371","testData2-86371","testData3-86371","testData4-86371","testData5-86371"

For each data row, the Date decrements by one second (:29 down to :00). Likewise, the numeric value that appears in a row's 5 fields decrements by 1 (86400 down to 86371). The three CSV files each have the exact same set of 30 events.

I dropped the three files into the monitored folder and then performed a search from the search head. To my surprise, I see only 30 events from testdata01.csv. It appears that Splunk ignored the 30 events from testdata02.csv and the 30 events from testdata03.csv.

I expected all 90 events to be ingested because each set of 30 has a unique source.

Why does Splunk selectively ignore events (not ingest events) from multiple CSV files?


gcusello
SplunkTrust

Hi williamcharlton0028,
Splunk doesn't reindex a file whose content it has already indexed, even if the file has a different name.
If you want to index three files with the same content and different names, you have to add the following option to your inputs.conf, in the stanza that monitors the three files:

crcSalt = <SOURCE>

This tells Splunk to treat files with different names (source paths) as distinct, even when their content matches that of files already indexed.
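To illustrate why the name alone makes no difference, here is a rough Python sketch of the idea. It assumes the documented default behavior: the file monitor fingerprints a file with a CRC over its first 256 bytes (initCrcLength), and crcSalt = <SOURCE> mixes the full source path into that check. The function names, the /data/ paths, and the use of zlib.crc32 are hypothetical stand-ins for illustration, not Splunk's actual implementation.

```python
import zlib
from typing import Dict, List

INIT_CRC_LENGTH = 256  # assumed default of initCrcLength in inputs.conf


def file_fingerprint(content: bytes, source_path: str, salt_with_source: bool = False) -> int:
    """Fingerprint a file from its first bytes, optionally salted with its path."""
    head = content[:INIT_CRC_LENGTH]
    if salt_with_source:
        # Rough analogue of crcSalt = <SOURCE>: the path becomes part of the checksum input.
        head = source_path.encode("utf-8") + head
    return zlib.crc32(head)


def simulate_monitor(files: Dict[str, bytes], salt_with_source: bool = False) -> List[str]:
    """Return the paths that would be indexed, skipping already-seen fingerprints."""
    seen = set()
    indexed = []
    for path, content in files.items():
        fingerprint = file_fingerprint(content, path, salt_with_source)
        if fingerprint in seen:
            continue  # same fingerprint as an earlier file: treated as already indexed
        seen.add(fingerprint)
        indexed.append(path)
    return indexed


# Three files with identical content and different (hypothetical) paths.
csv_content = (
    b'"Date","Field1","Field2","Field3","Field4","Field5"\n'
    b'"2019-01-01 00:00:29 ","testData1-86400","testData2-86400",'
    b'"testData3-86400","testData4-86400","testData5-86400"\n'
)
files = {"/data/testdata0%d.csv" % i: csv_content for i in (1, 2, 3)}

print(simulate_monitor(files))                         # only the first file
print(simulate_monitor(files, salt_with_source=True))  # all three files
```

Without the salt, the three files produce the same fingerprint, so only the first one is picked up, which matches the 30 events seen in the test. With the path mixed in, each file gets its own fingerprint.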

Bye.
Giuseppe
