Getting Data In

Not all CSV files with identical names are ingesting

lbrhyne
Path Finder

Hello,

We are attempting to ingest CSV files from two different applications where the file name and structure are identical. The files are placed into two different directories on a heavy forwarder and contain different sets of data. The problem is that the file sent to the prod directory is ingested, but the file in the dev directory is not, or neither file gets ingested. Since the file names are identical and the files are delivered to each directory at the same time, I suspect this is causing one or both files to be skipped.

Below is how we configured our Props config, and it does seem to work, but not consistently. Any help would be appreciated!

#Production
[batch://C:\Import\Prod\*.csv]
index = test
sourcetype = test
move_policy = sinkhole

#Development
[batch://C:\Import\Dev\*.csv]
index = testb
sourcetype = testb
move_policy = sinkhole
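One documented way to keep Splunk from treating two identically named, identically structured files as the same file is `crcSalt = <SOURCE>`, which mixes the full source path into the file's CRC fingerprint so files in different directories are tracked separately. A sketch of the stanzas above with that setting added (whether it resolves this particular intermittency is an assumption to verify in your environment):

```ini
# Sketch: crcSalt = <SOURCE> adds the full path to the CRC fingerprint,
# so Prod\report.csv and Dev\report.csv are tracked as distinct files
# even when their first bytes are identical.
[batch://C:\Import\Prod\*.csv]
index = test
sourcetype = test
move_policy = sinkhole
crcSalt = <SOURCE>

[batch://C:\Import\Dev\*.csv]
index = testb
sourcetype = testb
move_policy = sinkhole
crcSalt = <SOURCE>
```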


lbrhyne
Path Finder

Thanks for the suggestion, but this did not resolve the issue. I have not pursued it any further because the requirement to ingest the file was canceled.

Thanks again!


codebuilder
Influencer

Your configs look correct, assuming you have them in inputs.conf and not the "Props config" you mentioned.

When using batch mode for files with the same name, something about the new file has to differ for Splunk to pick it up. Generally it uses the timestamp; a different file size can trigger ingestion as well.
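To see why two different files can look identical to a file tracker, here is a small Python sketch of the general idea: Splunk fingerprints a file by a CRC over its initial bytes (256 by default, per `initCrcLength`). Two CSVs that share a long header therefore produce the same fingerprint even though their data rows differ. This is an illustration of the concept, not Splunk's actual implementation:

```python
import tempfile
import zlib
from pathlib import Path

# Build a CSV header longer than 256 bytes so both files share their
# first 256 bytes exactly.
header = ",".join(f"field_{i:03d}" for i in range(40)) + "\n"  # ~400 bytes

def first_256_crc(path):
    """CRC32 over the first 256 bytes, mimicking an initial-CRC fingerprint."""
    with open(path, "rb") as f:
        return zlib.crc32(f.read(256))

tmp = Path(tempfile.mkdtemp())
prod = tmp / "prod_report.csv"
dev = tmp / "dev_report.csv"
prod.write_text(header + "1,2,3\n")   # different data rows...
dev.write_text(header + "9,8,7\n")

# ...but identical fingerprints, so a tracker keyed on this CRC would
# consider the second file already seen.
print(first_256_crc(prod) == first_256_crc(dev))  # True
```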

Unlike "monitor", batch does not consume files that are actively changing, such as system logs. In my experience, if the forwarder is running when you copy the file over, there's a chance Splunk won't pick it up.

A better way to test the scenario you described is to stop the forwarder, copy over your file(s), then start the forwarder back up. Once the forwarder is up and inspects the directory, it should ingest the file(s).

Batch mode is more generally used for ingesting and deleting large numbers of files/logs with different names, timestamps, etc., such as rotated system logs where the rotation time is incorporated into the file name.

That said, it should still work for your use case, but try the testing method I suggested. You may also include a parameter in props.conf that helps Splunk recognize existing files with different content:

CHECK_METHOD = modtime
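For context, `CHECK_METHOD` is a props.conf setting applied under a `[source::...]` stanza matching the monitored path; the stanza spec below is an assumption based on the directories from the question:

```ini
# Sketch: with modtime, Splunk re-reads a file whenever its modification
# time changes, rather than comparing content checksums.
[source::C:\Import\...\*.csv]
CHECK_METHOD = modtime
```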

See the documentation for more details:
https://docs.splunk.com/Documentation/Splunk/8.2.2/Data/Monitorfilesanddirectorieswithinputs.conf#Ba...

----
An upvote would be appreciated and Accept Solution if it helps!