Splunk Search

Splunk Universal Forwarder - How to troubleshoot Intermittent Issue of missing data?

madhav_dholakia
Communicator

Hi All,

We are using Splunk Cloud and have a Universal Forwarder setup on a windows machine - it reads CSV files from a particular folder and sends to indexer. 

inputs.conf:

[monitor://D:\Test\Monitoring\Abc]
disabled=0
index=indexabc
sourcetype = csv
crcSalt = <SOURCE>

 

 

 

props.conf:

[source::D:\Test\Monitoring\Abc\*.csv]
CHECK_METHOD = modtime

Various CSV files are placed under D:\Test\Monitoring\Abc hourly/daily, and this setup works without any issues most of the time for all the CSV files.

However, there are instances where the data from a single file for a particular hour/day is missing from the index "indexabc" - this doesn't happen with one particular file but with various files.

For example, there is a CSV called memory.csv that is updated daily at 23:47. When I checked the data for the previous month (timechart span=1d), it showed no data for 25th March - I have checked the 3rd-party script that sends the data to this Windows server, and it completed successfully.

When a CSV file is read and indexed, I see the entry below in splunkd.log, but no such entry exists for 25th March, for which the data is missing:

03-26-2022 23:47:49.495 +0000 INFO  WatchedFile [6952 tailreader0] - Will begin reading at offset=0 for file='D:\Test\Monitoring\Abc\memory.csv'.

For the period 25th March 23:40 to 23:50, I checked splunkd errors in the _internal index; the results are shown below:

[screenshot: splunkd errors from the _internal index, 25th March 23:40-23:50]

Can you please suggest what could be causing this intermittent issue and what troubleshooting steps I can follow?

Thank you.

Labels (1)
0 Karma
1 Solution

VatsalJagani
Champion

There is a wait parameter (time_before_close) in inputs.conf, but I don't think it will be effective since we are changing CHECK_METHOD.

This parameter is more about telling Splunk to wait until other systems finish writing the file; it is not intended for what you want here.

Also, since the error "Ran out of data while looking for end of header" does not specify which file it was generated for, it's difficult to tell for sure which error is causing this file not to be ingested.

View solution in original post

VatsalJagani
Champion

@madhav_dholakia  - Based on the description, what I can tell is that your files are getting replaced.

If you look at the logs, the issue could be due to many reasons; some of them could be:

  • The file was too small for Splunk to decide whether it was the updated file or one already ingested.
  • The file's header is too large (more than 256 characters), and Splunk assumed it was the original file that had already been ingested.

 

Please take a look at the content of the file (memory.csv) on this occasion. Also, take a look at the following parameters in inputs.conf to see if they help (https://docs.splunk.com/Documentation/Splunk/latest/Admin/Inputsconf):

  • initCrcLength (default 256 characters)
  • followTail
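
For instance, raising initCrcLength in the thread's monitor stanza would look like the sketch below - the value 1024 is purely illustrative, not a recommendation from this thread:

[monitor://D:\Test\Monitoring\Abc]
disabled = 0
index = indexabc
sourcetype = csv
crcSalt = <SOURCE>
# Hash more of the file's start so a long CSV header alone cannot
# make a new file look identical to one already ingested
initCrcLength = 1024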

 

Also, make sure you are not running into any permission-related issues - it is also possible that the script you describe had locked the file, so Splunk was not able to read it.

 

I hope this helps!!!

0 Karma

madhav_dholakia
Communicator

thanks, @VatsalJagani  - yes, the files are getting replaced.

My understanding is that the option below in props.conf will always index the file (if its last-modified time has changed), irrespective of header changes or CRC length - can you please let me know if this understanding is not correct?

CHECK_METHOD = modtime

Also, I do not suspect a permission-level issue with the file, because on the next run it is indexed successfully. Also, this issue occurs for different files at different times, so I am not sure whether any additional logs can help identify the root cause.

Thank you.

0 Karma

isoutamo
SplunkTrust
SplunkTrust

Hi

Yes, "CHECK_METHOD = modtime" in the UF's props.conf means that the file is re-read whenever its modification time has changed.

How is this file generated and moved into place? It could be owned by someone else, with the splunk user having no access to it until the script changes the permissions later on - for that reason, it can be read on the next round. Also, as this is a Windows file, there could be an exclusive lock on it when it is created, which is removed later once the file is ready to be read by other processes.

r. Ismo
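
One common way for the writer side to avoid this lock window is to write the CSV to a temporary file and then rename it into the monitored folder, so the forwarder never opens a half-written or locked file. The thread's 3rd-party script is not shown, so the helper below is a hypothetical sketch of that pattern, not the actual script:

import os
import tempfile

def publish_csv(rows, dest_path):
    """Write rows to a temp file in the destination folder, then
    atomically rename it into place, so a monitoring process never
    sees a partially written file. (Hypothetical helper - the
    thread's 3rd-party script is not shown.)"""
    dest_dir = os.path.dirname(dest_path)
    # Create the temp file on the same volume so the rename is atomic.
    fd, tmp_path = tempfile.mkstemp(dir=dest_dir, suffix=".tmp")
    with os.fdopen(fd, "w", newline="") as f:
        for row in rows:
            f.write(",".join(str(v) for v in row) + "\n")
    os.replace(tmp_path, dest_path)  # atomic replace, also bumps mtime

Because the rename updates the file's modification time, CHECK_METHOD = modtime would still pick up the replaced file.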

madhav_dholakia
Communicator

thanks, @isoutamo - there are scripts running on a 3rd-party tool that place different CSV files in the given location (hourly/daily/weekly).

As per the example I shared, when I check the data for the last month, I can see memory.csv data for all days except the 25th - so it was working both before and after the 25th. There is another file, space.csv, which has data for all days in March except the 20th. So it happens intermittently, and no changes have been made to the scripts in the recent past.

Is there any option that can enable a wait of, say, 120 seconds after the file is updated and before it is indexed, so that in the scenario below the lock is (hopefully) released by the time the file is read and indexed?

Also as this is a Windows file there could be an exclusive lock when it is created, which is removed later once it's ready to be read by other processes

Thank you.

0 Karma

VatsalJagani
Champion

There is a wait parameter (time_before_close) in inputs.conf, but I don't think it will be effective since we are changing CHECK_METHOD.

This parameter is more about telling Splunk to wait until other systems finish writing the file; it is not intended for what you want here.

Also, since the error "Ran out of data while looking for end of header" does not specify which file it was generated for, it's difficult to tell for sure which error is causing this file not to be ingested.
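
Applied to the thread's monitor stanza, the parameter would look like the sketch below - the 120-second value simply mirrors the wait suggested earlier in the thread and is illustrative (Splunk's default is only a few seconds):

[monitor://D:\Test\Monitoring\Abc]
disabled = 0
index = indexabc
sourcetype = csv
crcSalt = <SOURCE>
# Wait this long after the file stops changing before closing it,
# giving the writing process time to finish and release any lock
time_before_close = 120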

madhav_dholakia
Communicator

Since this parameter (time_before_close) was added, we haven't seen the issue for over a month now.

thank you @VatsalJagani & @isoutamo for your help on this.

I have got another question along similar lines here

Thank you.

madhav_dholakia
Communicator

Thanks, @VatsalJagani 

This parameter is more to say Splunk to wait until the other systems finish writing the file, not for the purpose of what you want here.

Yes, in my case the files aren't large, so they might not take much time to be written, but I think I can give it a try in case there is a lock on the file and a wait of n seconds might help.

Thanks for this - I will try it and share an update after monitoring for a couple of days, as the issue is intermittent.

Thank you.

0 Karma