
Why is some data being indexed as separate entries while monitoring csv log files?

chatterjb
Engager

I have a console app that reads data from table storage and writes it out to a CSV file. I monitor each of the output folders, and I checked that the data is being uploaded properly, which it is. However, some of the data in the index is broken apart when I look at all the records. I know this because the records in the CSV file don't match the records in Splunk.

An entry contains 18 fields, and some of the entries are being split in the middle.

Example entry:
12/15/2015, Name, ID, Number, Guid ....

Splunk logs it as 2 separate entries:

12/15/2015,Name,ID,
Number,Guid ...

The console app runs once a day at midnight, and not all entries are malformed, just a few of them. The strange thing is that the data is all still there; some of it is just split apart. Does anyone know of a workaround, or is this a bug?
I was thinking of locking the file until it has finished writing, but I'm not sure how Splunk would react to file-share locking.


lguinn2
Legend

I don't think this is a bug. I would avoid locking the file in general, as it may have unexpected performance impacts.

What are the settings in the inputs.conf stanza that is monitoring the output folder? I would suggest this:

[monitor://yourdirectorypathhere]
index = theindexname
sourcetype = csv
ignoreOlderThan = 30d

Or, you might find a pretrained sourcetype that better fits your data in Splunk's list of pretrained sourcetypes.
If you are not cleaning out the older files (which you should), ignoreOlderThan will help Splunk's performance once the directory fills up with older files that have already been indexed and will never be updated.

If you don't want to use the csv sourcetype, you may need to place a props.conf file on your indexer that explicitly sets the parsing rules for the sourcetype that you choose.
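As a rough illustration of what such explicit parsing rules could look like, here is a minimal props.conf sketch. The stanza name my_csv_sourcetype is hypothetical, and the rules assume that every record starts with a date in MM/DD/YYYY form, as in the example entry above. Whether this resolves the split entries depends on what is actually causing them, so treat it as a starting point rather than a confirmed fix.

```
# props.conf on the indexer (sketch; the sourcetype name is a placeholder)
[my_csv_sourcetype]
# Treat each matched line break as an event boundary; do not re-merge lines
SHOULD_LINEMERGE = false
# Break only where a newline is followed by a date and a comma,
# so a newline in the middle of a record does not start a new event
LINE_BREAKER = ([\r\n]+)(?=\d{1,2}/\d{1,2}/\d{4},)
# Assumption: the timestamp is the first field of every record
TIME_PREFIX = ^
TIME_FORMAT = %m/%d/%Y
MAX_TIMESTAMP_LOOKAHEAD = 10
```

The sourcetype in the inputs.conf stanza would need to be set to the same name for these rules to apply.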
