Getting Data In

Why are not all of the JSON files in the same folder being indexed?


Environment: Heavy forwarder -> Indexer cluster -> SH

On the HWF side:
I fetch logs with a curl command that writes to directory DIR-A, where the following files are created:
These files are downloaded every day at 10:00 am; before that, the script cleans up all old files from both DIR-A and DIR-B.


These files have a header and a footer, which need to be removed before the files are indexed as JSON.

So I have another script, scheduled to run 10 minutes after these files are downloaded to DIR-A.
This script removes the header and footer from these files and copies them to a new directory, DIR-B, as follows:


Up to this point everything works fine.
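
For readers who want to reproduce the setup, the clean-up step described above could look roughly like the following sketch. It assumes the header and the footer are each exactly one line, which the post does not state; the function name `strip_header_footer`, its parameters, and the directory arguments are hypothetical, not the original script.

```python
from pathlib import Path


def strip_header_footer(src_dir, dst_dir, header_lines=1, footer_lines=1):
    """Copy each file from src_dir to dst_dir without its header and footer lines.

    Assumption: header and footer are whole lines (one each by default).
    """
    dst = Path(dst_dir)
    dst.mkdir(parents=True, exist_ok=True)
    written = []
    for src in sorted(Path(src_dir).iterdir()):
        if not src.is_file():
            continue
        lines = src.read_text(encoding="utf-8").splitlines(keepends=True)
        end = len(lines) - footer_lines
        # Guard against files too short to contain a header and a footer.
        body = lines[header_lines:end] if end > header_lines else []
        (dst / src.name).write_text("".join(body), encoding="utf-8")
        written.append(src.name)
    return written
```

Calling `strip_header_footer("DIR-A", "DIR-B")` would return the names of the files it wrote, which makes it easy to confirm that all four files reached DIR-B before Splunk reads them.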

The issue is that only 3 of the 4 files get indexed in Splunk, or sometimes only 2 of the 4.
I don't see any errors in the internal logs for the files that are not indexed.

Here is my inputs.conf:

index = test
crcSalt =
sourcetype = test1
disabled = false

index = test
crcSalt =
sourcetype = test2
disabled = false

index = test
crcSalt =
sourcetype = test3
disabled = false

index = test
crcSalt =
sourcetype = test4
disabled = false

props.conf is the same for all four sourcetypes (test1, test2, test3, test4):

KV_MODE = false
AUTO_KV_JSON = false
category = Structured
disabled = false
pulldown_type = true

Settings on the SH side:

props.conf for sourcetypes test1, test2, test3, test4:

KV_MODE = false
AUTO_KV_JSON = false

The strange part is that if I edit a file that was not indexed, add something like #test at the beginning of it, and restart Splunk, the file gets indexed fine.

Here is the pattern of the file that has the issue:


Should I use batch instead of monitor, or is there any other suggestion?


If your script overwrites these files in DIR-B every time, a batch input will work. Just keep in mind that Splunk deletes each file after it has been indexed. You could do something like the following:

# inputs.conf
index = test
move_policy = sinkhole
whitelist = .*\.json

# props.conf
# Add the following to your props on the forwarder
sourcetype = test1

sourcetype = test2

sourcetype = test3

sourcetype = test4



Are you salting the files on purpose using crcSalt (I could not tell from the conf files)?
Do these files have any form of timestamp in them?
Do they by any chance generate the same hash value (so that Splunk thinks it has already indexed them)?
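
One rough way to check the last question outside Splunk is sketched below. By default the monitor input computes its initial CRC over the first 256 bytes of a file (the `initCrcLength` setting in inputs.conf), so files that start with identical bytes can be mistaken for files already seen. The sketch groups files by a hash of those leading bytes. It uses SHA-256 rather than Splunk's own CRC and ignores any `crcSalt`, so it only tells you whether the leading bytes match; the function name `group_by_leading_bytes` and the directory you pass in are hypothetical.

```python
import hashlib
from collections import defaultdict
from pathlib import Path

# Assumption: 256 bytes, the default initCrcLength for monitor inputs.
INIT_CRC_LENGTH = 256


def group_by_leading_bytes(directory, pattern="*.json", length=INIT_CRC_LENGTH):
    """Return lists of file names whose first `length` bytes are identical."""
    groups = defaultdict(list)
    for path in sorted(Path(directory).glob(pattern)):
        with path.open("rb") as handle:
            head = handle.read(length)
        # SHA-256 stands in for Splunk's CRC; only byte equality matters here.
        groups[hashlib.sha256(head).hexdigest()].append(path.name)
    # Keep only the groups that contain more than one file.
    return [names for names in groups.values() if len(names) > 1]
```

Running `group_by_leading_bytes("DIR-B")` and getting an empty list back would suggest the leading bytes all differ; any non-empty group is a set of files worth a closer look.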




Yes, I am using crcSalt on purpose. This is the setting I have in place:

crcSalt =

Yes, some of the files have a timestamp, but it is in the future, so I am forcing the timestamp to the current time with:

No, all the hash values are different.
