How to avoid duplicate log files from a universal forwarder when customers compress and export log files on the device every 15 minutes?

strive
Influencer

Hi,

We have the Splunk UF installed on devices. It sends log files to another forwarder, which sends the logs on to the indexer.

Splunk UF on Device --> Forwarder --> Indexer.

The configurations are:

Splunk UF on Device

inputs.conf

[monitor:<Path to the directory>]  
disabled = false
host_regex = _(\d+\.\d+\.\d+\.\d+)
sourcetype = abc_xyz
index = my_index
crcSalt = <SOURCE>

outputs.conf

[tcpout]
disabled = false
defaultGroup = our_lwf

[tcpout:our_lwf]
server = <ip address of downward forwarder>:9998

Forwarder node

[splunktcp://:9998]
compressed=false

The issue: we are getting duplicate log files. The same log files are sent twice. When we look at the _indextime of the log events, we see two different times: the first set was indexed at time X and the second set at X+Y minutes, where Y varies from 11 to 13.

Note: On the devices, the customers compress the log files and export them to another location every 15 minutes. With this compression and export feature turned off, we do not see duplicate logs. With it turned on, we do.

Could you please let me know how to avoid duplicate log files getting into the system?

Thanks

Strive

1 Solution

strive
Influencer

I re-added the blacklist configuration and restarted Splunk, and it is now working.

I enabled DEBUG logging for the TailingProcessor to check. The splunkd.log clearly shows that the .gz files are ignored:

10-29-2014 12:18:19.380 +0000 DEBUG TailingProcessor - File state notification for path='our_path/file.gz' (first time).
10-29-2014 12:18:19.381 +0000 DEBUG TailingProcessor -   Item 'our_path.gz' matches stanza: our_path_*.
10-29-2014 12:18:19.381 +0000 DEBUG TailingProcessor -     Not using stanza for this item (Matched blacklist '\.(gz)$'.).
10-29-2014 12:18:19.381 +0000 DEBUG TailingProcessor -   Entry is associated with 0 configuration(s).
10-29-2014 12:18:19.381 +0000 DEBUG TailingProcessor - No configurations match, will ignore path='our_path/file.gz'.
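
For anyone who wants to reproduce this check: one way to get DEBUG lines like these is to raise the TailingProcessor log level in the forwarder's logging configuration. This is a minimal sketch, assuming your version reads local overrides from $SPLUNK_HOME/etc/log-local.cfg. The file location is an assumption and is not stated in this thread, so verify it against the documentation for your version.

```ini
# $SPLUNK_HOME/etc/log-local.cfg (assumed location; local override of log.cfg)
[splunkd]
# Raise only the file-monitoring component to DEBUG
category.TailingProcessor=DEBUG
```

Restart the forwarder for the setting to take effect, and set the level back to INFO afterwards, since DEBUG output is verbose.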

During the checks I found that Splunk had not been restarted when the changes were made earlier. Not restarting Splunk was the issue 😞
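
For reference, the working setup amounts to the monitor stanza from the question plus the blacklist line. This is a sketch assembled from the settings quoted in this thread, with the path placeholder left exactly as posted:

```ini
# inputs.conf on the device's universal forwarder (sketch, values taken from the question)
[monitor:<Path to the directory>]
disabled = false
host_regex = _(\d+\.\d+\.\d+\.\d+)
sourcetype = abc_xyz
index = my_index
crcSalt = <SOURCE>
# Skip the compressed copies that the export job writes every 15 minutes
blacklist = \.(gz)$
```

After saving the file, restart the forwarder (for example with $SPLUNK_HOME/bin/splunk restart). As noted above, the change was not picked up until Splunk was restarted.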

srisahitya_v
Communicator

You can use the
"followTail = true"
option.
You have to add this line to every stanza in the inputs.conf configuration file.

strive
Influencer

Additional details:

Splunk UF version on device is: 4.3.x

Splunk version on downward forwarder is 5.0.4

We have set compressed = false on the downward forwarder. We also tried adding blacklist = \.(gz)$ to the inputs.conf of the Splunk UF on the device, but it doesn't work. We still get duplicate log files.
