Windows monitored file duplicate data

darrenfuller
Contributor

I have a CSV file that has one new line appended to it once a day.

The input points to a custom sourcetype, [csvtest], which has appropriate settings for the data in the file.

inputs.conf:

 

[monitor://c:\opt\splunk\etc\apps\csvtest\data\csvtest.csv]
index = main 
sourcetype = csvtest

 

 

props.conf:

 

[csvtest]
LINE_BREAKER = ([\r\n]+)
SHOULD_LINEMERGE = false
TIME_PREFIX = ^
TIME_FORMAT = %Y-%m-%d
MAX_TIMESTAMP_LOOKAHEAD = 15
TRUNCATE = 10000

 

 

The data looks like this:

 

2020-01-26,,,1.0,,,,,,,,9,,,,,,,,,,
2020-01-27,,,2.0,,,,,,,,19,,,,,,,,,,
2020-01-28,,,1.0,1.0,,,1.0,,,,11,,,,,,,,,,
2020-01-30,,,0.0,2.0,,,2.0,,,,27,,,,,,,,,,
2020-01-31,,,0.0,2.0,,,2.0,,,,17,,,,,,,,,,
2020-02-03,,,0.0,3.0,,,3.0,,,,29,,,,,,,,,,
2020-02-04,90.0,12.0,0.0,3.0,,,3.0,139.0,,,34,,,,,,,,,,
2020-02-05,96.0,8.0,0.0,3.0,,,3.0,150.0,,,43,,,,,,,,,,
2020-02-06,104.0,0.0,0.0,3.0,,,3.0,169.0,,,62,,,,,,,,,,
2020-02-08,130.0,25.0,0.0,3.0,,,3.0,197.0,,,39,,,,,,,,,,
2020-02-10,167.0,81.0,0.0,3.0,,,3.0,259.0,,,8,,,,,,,,,,
2020-02-11,184.0,79.0,0.0,3.0,,,3.0,285.0,,,19,,,,,,,,,,
2020-02-12,257.0,44.0,0.0,2.0,1.0,,3.0,313.0,,,9,,,,,,,,,,
2020-02-13,306.0,16.0,0.0,2.0,1.0,,3.0,340.0,,,15,,,,,,,,,,
2020-02-14,353.0,0.0,0.0,2.0,1.0,,3.0,364.0,,,8,,,,,,,,,,
2020-02-17,399.0,0.0,0.0,2.0,1.0,,3.0,402.0,,,0,,,,,,,,,,
2020-02-18,418.0,0.0,0.0,2.0,1.0,,3.0,421.0,,,0,,,,,,,,,,
2020-02-19,436.0,0.0,0.0,2.0,1.0,,3.0,456.0,,,17,,,,,,,,,,
2020-02-20,462.0,,0.0,1.0,2.0,,3.0,479.0,,,14,,,,,,,,,,
2020-02-21,483.0,,0.0,0.0,3.0,,3.0,498.0,,,12,,,,,,,,,,
2020-02-22,540.0,,0.0,1.0,3.0,,4.0,553.0,,,9,,,,,,,,,,
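For anyone who wants to sanity-check the props.conf settings against this data outside Splunk, here is a rough Python sketch of what I expect LINE_BREAKER, TIME_PREFIX, TIME_FORMAT and MAX_TIMESTAMP_LOOKAHEAD to do with two of the rows above. It is only an approximation of the behaviour, not Splunk's parser; the function names `split_events` and `event_time` are made up for illustration, and the `\r\n` line endings are an assumption about how the file is written on Windows.

```python
import re
from datetime import datetime

# Two rows copied from the sample data, joined with Windows line endings
# (the line endings are an assumption; the post does not show them).
raw = (
    "2020-01-26,,,1.0,,,,,,,,9,,,,,,,,,,\r\n"
    "2020-01-27,,,2.0,,,,,,,,19,,,,,,,,,,\r\n"
)

def split_events(text):
    # Rough analogue of LINE_BREAKER = ([\r\n]+) with SHOULD_LINEMERGE = false:
    # every run of CR/LF characters ends one event and starts the next.
    return [event for event in re.split(r"[\r\n]+", text) if event]

def event_time(event, lookahead=15):
    # Rough analogue of TIME_PREFIX = ^ and MAX_TIMESTAMP_LOOKAHEAD = 15:
    # only the first `lookahead` characters are searched for a timestamp.
    window = event[:lookahead]
    # TIME_FORMAT = %Y-%m-%d matches a 10-character date such as 2020-01-26.
    return datetime.strptime(window[:10], "%Y-%m-%d")

events = split_events(raw)
timestamps = [event_time(event) for event in events]
```

With these settings each row becomes one event stamped with the date in its first column, so the event breaking itself does not look like the source of the duplicates.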

 

 

Every time a new line is added to the bottom of this file, the whole file is ingested again, causing duplicate data in the index.

I have created the same scenario on Linux, and there the forwarder behaves as expected: using the default initCrcLength, it identifies that it has seen the file before and ingests only the new event.
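To make clear what behaviour I am expecting, here is a simplified Python sketch of the idea behind the initCrcLength check as I understand it: a checksum of the first 256 bytes (the documented default) identifies the file, and a stored offset says where to resume. This is not Splunk's implementation; the real forwarder also validates a checksum at the saved seek position, and `head_crc`, `resume_offset` and the `seen` dictionary are hypothetical names used only for illustration.

```python
import zlib

INIT_CRC_LENGTH = 256  # documented default for initCrcLength, in bytes

def head_crc(data: bytes, length: int = INIT_CRC_LENGTH) -> int:
    # Checksum of only the first `length` bytes of the file content.
    return zlib.crc32(data[:length])

def resume_offset(seen: dict, data: bytes) -> int:
    """Return the offset to start reading from (hypothetical bookkeeping)."""
    key = head_crc(data)
    offset = seen.get(key, 0)   # unknown head checksum: start at offset 0
    if offset > len(data):      # file is shorter than before: treat as new
        offset = 0
    seen[key] = len(data)       # remember how far the file has been read
    return offset

seen = {}
day1 = b"2020-02-21,483.0\r\n" * 20        # more than 256 bytes of content
day2 = day1 + b"2020-02-22,540.0\r\n"      # one line appended at the bottom
rewritten = b"\xef\xbb\xbf" + day2         # same rows, but the head bytes changed

first_read = resume_offset(seen, day1)
after_append = resume_offset(seen, day2)
after_rewrite = resume_offset(seen, rewritten)
```

In this sketch an append resumes at the old end of the file, while anything that alters the first 256 bytes (the byte order mark here is just an example of such a change) sends the reader back to offset 0, which is the kind of restart the WatchedFile message reports.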

In the _internal index, I am seeing events like this:

 

03-23-2021 20:00:02.851 -0400 INFO  WatchedFile - Will begin reading at offset=0 for file='C:\opt\splunk\etc\apps\csvtest\data\csvtest.csv'.

 

 

Is this just a Windows forwarder symptom? Is there an inputs.conf setting I can use to fix this?

Thanks in advance!
