Getting Data In

Indexing a file uploaded with the same filename

smolcj
Builder

Hi,
Scenario: a log uploader application uploads logs to a directory of the form splunkdata/timeofupload/yourid/splogs/filename.
The pattern used to monitor the files looks like
/splunkdata/.../.../...splogs/*
Doubt 1: can a file with a different id but the same filename be indexed in Splunk?
Doubt 2: can a file with the same filename but different content and id be indexed in Splunk?
Doubt 3: can a file with the same id and the same filename but different content be indexed?
What settings do I have to change to get this behavior?
Please help.


Drainy
Champion

As the previous answer says, Splunk builds a CRC of the first 256 bytes of a file. In inputs.conf (http://docs.splunk.com/Documentation/Splunk/latest/Admin/Inputsconf) you can set crcSalt = <SOURCE> to include the source path in the CRC check, which ensures that files with the same beginning in different locations still get indexed. You could also increase initCrcLength so the CRC covers a larger portion of the file, which helps when files start with the same header and only differ further in.
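A minimal sketch of what that could look like in inputs.conf. The monitor path is an assumption based on the layout described in the question, and the initCrcLength value is only an illustrative number, not a recommendation:

```ini
# inputs.conf (sketch; path assumed from the question's directory layout)
[monitor:///splunkdata/.../splogs/*]
# The literal string <SOURCE> tells Splunk to mix each file's full path into
# the CRC, so files with identical beginnings in different directories are
# treated as different files.
crcSalt = <SOURCE>
# Optional: checksum more than the default 256 bytes when files share a
# long common header. 1024 is an arbitrary example value.
initCrcLength = 1024
```

One thing to keep in mind with crcSalt = <SOURCE>: if a file is later renamed or moved (log rotation, for example), its path changes, so Splunk may see it as new and index it again.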

Drainy
Champion

When you set crcSalt to the source path, the full path of the file becomes part of the CRC, so the filename being the same makes no difference; the file will still be read.


stefano_guidoba
Communicator

smolcj stated that the contents are different; what may be the same is the filename (in different folders). So I guess he needs to set neither crcSalt nor initCrcLength.


stefano_guidoba
Communicator

Hi smolcj,

Splunk decides whether it has already seen a file by calculating a CRC over the first 256 bytes of the file. So, if you have files with the same name in different directories, Splunk checks the file contents, and if the first 256 bytes differ, both files will be indexed.
To be sure, configure your input like this:

[monitor:///splunkdata/timeofupload/*/splogs/*]

This way you are telling Splunk to monitor the files in the splogs subfolder of every subfolder under timeofupload. To gain a better understanding, you can check the docs:

http://docs.splunk.com/Documentation/Splunk/5.0/Data/Monitorfilesanddirectories
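If the timeofupload directory name is itself generated per upload rather than fixed, a wildcard can stand in for that level too. A hedged sketch, assuming the layout is exactly /splunkdata/<timeofupload>/<yourid>/splogs/<filename> with both middle directories varying:

```ini
# inputs.conf (sketch; assumes exactly two variable directory levels)
# Each * matches one path segment: the first stands in for the
# timeofupload directory, the second for the yourid directory.
[monitor:///splunkdata/*/*/splogs/*]
```

In monitor paths, * matches within a single path segment, while ... recurses through any number of directory levels, so /splunkdata/.../splogs/* would be the looser alternative if the depth is not fixed.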

Regards,
Stefano


smolcj
Builder

How can we specify timeofupload statically? It is calculated from the time the log is uploaded; a directory is created with that name and the log is saved inside that directory.


Drainy
Champion

This doesn't quite resolve the issue of the CRC, though.
