Getting Data In

Why is the daily indexed volume for a particular index increasing?

dban2005
New Member

Recently, I added a file share for indexing via a Universal Forwarder on a Windows server, which sends to the receiver/deployment server (Linux). Yesterday, the total volume of raw data on the file share was 20 GB, and today it is 21 GB. The indexed volume for that file share was 7 GB yesterday (daily volume) and 7.123 GB today (again daily volume), which is far larger than the 1 GB increase in raw data. In inputs.conf, I have set the global parameter ignoreOlderThan = 7d. Is Splunk re-indexing the last 7 days every 24 hours, or is something else going on? How can I determine the cause and avoid it? Note: there are no zip files, and .xml files have been excluded from indexing.


lguinn2
Legend

How much of the data was Splunk able to index on the first day? Of the 20 GB, how much should Splunk have indexed, and how much did it actually index? I wonder if Splunk is "catching up."

Setting ignoreOlderThan = 7d will not cause the data to be indexed twice.

I would turn on the Monitoring Console and look at the tabs on indexing for information, as a starting point.
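
As a complement to the Monitoring Console, a search along these lines might show the daily ingested volume for one index. This is only a sketch: `fileshare_index` is a hypothetical index name (substitute your own), and it assumes the search runs where the license manager's `_internal` logs are searchable. The `b` (bytes) and `idx` (index) fields come from the `type=Usage` events in license_usage.log.

```
index=_internal source=*license_usage.log* type=Usage idx="fileshare_index"
| timechart span=1d sum(b) AS bytes
| eval GB = round(bytes / 1024 / 1024 / 1024, 3)
| fields _time GB
```

If the per-day figure stays near 7 GB while the share only grows by about 1 GB per day, that would point toward data being read again rather than Splunk simply catching up.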


dban2005
New Member

Day 1: 7 GB; day 2: 7.531 GB; day 3: 1.45 GB (we disabled the index, changed ignoreOlderThan to 2d, and redeployed); day 4: 6.421 GB.
How can I tell whether the index is duplicating data?
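
One possible way to check for duplicates is to hash each event's raw text and count how often the same hash appears per source. The search below is a rough sketch, not a definitive test: `fileshare_index` is a hypothetical index name, the 7-day window is arbitrary, and files that legitimately contain identical lines will also show up as matches.

```
index="fileshare_index" earliest=-7d@d
| eval event_hash = md5(_raw)
| eval indextime = _indextime
| stats count AS copies, min(indextime) AS first_indexed, max(indextime) AS last_indexed BY source, event_hash
| where copies > 1
| eval first_indexed = strftime(first_indexed, "%Y-%m-%d %H:%M:%S")
| eval last_indexed = strftime(last_indexed, "%Y-%m-%d %H:%M:%S")
| sort - copies
```

If many rows come back with first_indexed and last_indexed roughly a day apart, the same file content was probably read more than once. Hashing every event can be slow on a large index, so it may be worth narrowing the search to a single source first.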
