Getting Data In

Splunk forwarder file monitor not detecting new files; error "Bug during applyPendingMetadata, header processor does not own the indexed extractions confs"

jamesar
Explorer

Hi Splunkers,

I am monitoring a folder (/opt/pvlogs/QUT-GP-P10) with a collection of CSV text files, as follows:

....
int_magnetek_151019.txt 
int_magnetek_151020.txt
int_magnetek_151021.txt
int_magnetek_151022.txt
int_magnetek_151023.txt
int_magnetek_151024.txt
int_magnetek_151025.txt
int_magnetek_151026.txt
int_magnetek_151027.txt
....

Log file format example:

[info]
Anlage=QUT P BLOCK LEVEL 10
Datum=151027
[messung]
;s;Adr;WR;MPC;S_GL;S_WR;S_DC1;S_DC2;S_AL;UDC1;IDC1;UDC2;IDC2;UAC;IAC;PAC;FAC;T_WR1;T_WR2;I_LC;R_ISO;E_TOTAL;E_INT;P_LIMIT;COS_PHI
;s;;;%;;;;;;V;A;V;A;V;A;W;Hz;°C;°C;A;MOhm;kWh;Wh;;
[Start]
00:15:00;900;2;PVI-10.0-OUTD;;;;;;;;;;;;;;;;;;;;;;
00:15:00;900;3;PVI-10.0-OUTD;;;;;;;;;;;;;;;;;;;;;;
00:15:00;900;4;PVI-6000-OUTD;;;;;;;;;;;;;;;;;;;;;;
....

The configuration settings:

inputs.conf

[monitor:///opt/pvlogs/QUT-GP-P10/*.txt]
disabled = false
index = test
sourcetype = sec_pv_data
host = QUT-GP-P10
crcSalt = <SOURCE>

props.conf

[sec_pv_data]
SHOULD_LINEMERGE = false
HEADER_FIELD_LINE_NUMBER = 5
HEADER_FIELD_DELIMITER = ;
SEDCMD-null = s/\[Start\]|\[info\]|\[messung\]|Anlage.*|Datum.*|Info.*|;Time.*|;s;.*//
FIELD_DELIMITER = ;
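
As a side note, HEADER_FIELD_LINE_NUMBER, HEADER_FIELD_DELIMITER and FIELD_DELIMITER belong to Splunk's structured-data (indexed extractions) parsing, which runs on the forwarder, and the error in splunkd.log complains about "indexed extractions confs". One thing worth checking is whether the stanza also declares INDEXED_EXTRACTIONS explicitly. A sketch only (adding INDEXED_EXTRACTIONS = CSV here is an assumption about the intended parsing mode, not a confirmed fix):

```ini
# props.conf on the forwarder -- sketch only; INDEXED_EXTRACTIONS = CSV
# is an assumption, not a confirmed fix for this error
[sec_pv_data]
INDEXED_EXTRACTIONS = CSV
FIELD_DELIMITER = ;
HEADER_FIELD_DELIMITER = ;
HEADER_FIELD_LINE_NUMBER = 5
SHOULD_LINEMERGE = false
```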

The log files are created by a solar PV logger that updates the file (located on the logger device itself) with a new entry at 15-minute intervals. At midnight every night, the log file for that day is copied from the logger device to the monitored folder. The forwarder should then detect the new file and forward the data to the indexer.

The file copy is performed by a cron job that uses curl to connect to the logger device's URL and pull the new log file down into the folder monitored by Splunk. The problem is that the forwarder does not detect this new file in the monitored folder. However, if I restart the Splunk service (with no other changes), the forwarder then ingests the previously unindexed log file(s).
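
One general precaution for a curl-into-monitored-folder setup (a sketch under my own assumptions, not taken from the original post) is to download to a temporary name and then rename into place, so the tailer never sees a half-written file:

```shell
#!/bin/sh
# Sketch: download the logger file to a temporary name first, then
# rename it into the monitored folder. mv on the same filesystem is
# atomic, so the Splunk tailer never observes a partially written file.
# The function name and arguments are hypothetical.
fetch_log() {
    url="$1"
    dest="$2"
    tmp="${dest}.part"                 # .part is not matched by the *.txt monitor glob
    if curl -fsS -o "$tmp" "$url"; then
        mv "$tmp" "$dest"              # atomic publish into the monitored dir
    else
        rm -f "$tmp"                   # don't leave half-downloaded files behind
        return 1
    fi
}
```

Called as, e.g., `fetch_log "http://logger/int_magnetek_151027.txt" /opt/pvlogs/QUT-GP-P10/int_magnetek_151027.txt` (URL hypothetical).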

In splunkd.log, there is an ERROR that looks to be related to this issue:

10-27-2015 00:15:04.961 +1000 ERROR WatchedFile - Bug during applyPendingMetadata, header processor does not own the indexed extractions confs.
10-27-2015 00:15:04.962 +1000 ERROR TailReader - Ignoring path="/opt/pvlogs/QUT-GP-Y11/int_magnetek_151027.txt" due to:   Bug during applyPendingMetadata, header processor does not own the indexed extractions confs.

Does anyone have any information about this error, or any advice on why my forwarder does not automatically detect new files that have been copied into a monitored folder (using curl)?

Also, why would restarting the service then allow the forwarder to detect the file?

1 Solution

jamesar
Explorer

For anyone interested in this problem, I have made progress identifying the cause of this issue....

The log files producing the errors are written by embedded PV logger devices. Sometimes, when a logger initially creates a new log file, the file contains erroneous data (e.g. a single line like <h1>File Not Found). The file is then overwritten at some later point (possibly a few hours later) with the correctly structured data (the expected CSV headers etc.).

The bug occurs when the Splunk forwarder first detects the file while it still contains the incorrect data, and tries to ingest it. When the file is later overwritten with the correct data, the forwarder doesn't attempt to re-ingest it. The error message obviously was not very informative about the actual problem.

I haven't been able to find a way to set up the forwarder config so that it ingests each file as a new unique file (i.e. crcSalt = <SOURCE>) and also ingests any logs appended to an existing file, BUT, if a file is completely overwritten (i.e. the situation above), re-ingests that file from the beginning (and then continues to ingest appends to the new file).
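
One setting that comes close is CHECK_METHOD in props.conf, though as far as I can tell it does not solve the append case: with modtime, Splunk re-reads the entire file whenever its modification time changes, so every append would duplicate previously indexed events. A sketch, not a recommendation:

```ini
# props.conf on the forwarder -- sketch only: re-reads the ENTIRE file
# whenever its modification time changes, which catches overwrites but
# duplicates already-indexed events on every append
[source::/opt/pvlogs/QUT-GP-P10/*.txt]
CHECK_METHOD = modtime
```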

I was able to find a workaround: simply restarting the Splunk forwarder service (via a cron job) at a time when the erroneous log files have corrected themselves.
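
For example, the restart could be scheduled for a few hours after midnight, once the loggers have had time to overwrite any bad files (the 06:00 time and the /opt/splunkforwarder install path below are assumptions; adjust to your environment):

```ini
# crontab entry (sketch) -- restart the forwarder daily at 06:00
0 6 * * * /opt/splunkforwarder/bin/splunk restart
```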

Anyway, hopefully this helps someone facing a similar issue, cheers...


