Getting Data In

Indexing XML files from universal forwarder

lmacneil76
Explorer

Hi all, I am attempting to index 44,833 XML files for parsing. I know Splunk needs some configuration changes to work well with XML, depending on your needs. In my case each file is unique.

My problem is that, out of 44,833 XML files, about 300 are marked as duplicates.

Log Error

06-13-2014 12:43:51.734 -0700 ERROR TailingProcessor - File will not be read, seekptr checksum did not match (file=C:\var\xml\new_production_items\2-10240750-Qti.xml).  Last time we saw this initcrc, filename was different.  You may wish to use a CRC salt on this source.  Consult the documentation or file a support case online at http://www.splunk.com/page/submit_issue for more info.

So I have tried crcSalt and initCrcLength, but I am failing on the implementation. I first placed them in inputs.conf on the server, then moved them to the forwarder, to no avail.

These are the inputs.conf settings:

[monitor://c:\var\xml\*.xml]
disabled = 0
followTail = 0
crcSalt = <SOURCE>
initCrcLength = 2048

I have tried higher initCrcLength values, from 1024 to 10500, but nothing seems to take effect.
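
For readers unfamiliar with the check behind this error: Splunk compares a checksum of the beginning of each file (256 bytes by default, adjustable with initCrcLength) against files it has already seen, and crcSalt = <SOURCE> mixes the file's path into that checksum. The Python sketch below only illustrates the idea. zlib.crc32, the initial_crc helper, and the sample file contents are stand-ins chosen for illustration, not Splunk's actual implementation.

```python
import zlib


def initial_crc(data: bytes, init_crc_length: int = 256, salt: str = "") -> int:
    """Checksum of an optional salt plus the first init_crc_length bytes.

    Illustrative only: Splunk's real checksum algorithm is not shown here.
    """
    head = data[:init_crc_length]
    return zlib.crc32(salt.encode("utf-8") + head)


# Two hypothetical XML files that share a long, identical header.
header = b'<?xml version="1.0" encoding="UTF-8"?>\n' + b"<!-- shared boilerplate -->\n" * 20
file_a = header + b"<item id='1'/>"
file_b = header + b"<item id='2'/>"

# With a short window and no salt, both files look the same.
same_without_salt = initial_crc(file_a) == initial_crc(file_b)

# A window long enough to reach the differing bytes tells them apart.
window = len(header) + 32
differs_with_longer_window = initial_crc(file_a, window) != initial_crc(file_b, window)

# Salting with the (hypothetical) source path also tells them apart.
differs_with_salt = (
    initial_crc(file_a, salt=r"C:\var\xml\a.xml")
    != initial_crc(file_b, salt=r"C:\var\xml\b.xml")
)

print(same_without_salt, differs_with_longer_window, differs_with_salt)
```

This is why files with identical leading bytes can be flagged as already seen, and why either setting should help once it is actually applied to the files in question.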

Each time I make a change I run the following commands:

Splunk universal forwarder:

splunk stop
splunk clean all
splunk start

Splunk Server:

splunk stop
splunk clean eventdata
splunk start

Any help would be greatly appreciated!

(Update)
I just noticed that even with the error, the file is still logged in some cases. My final results still indicate that not all files are indexed, but maybe the error above is a red herring!

1 Solution

lmacneil76
Explorer

Found the solution. Each subfolder needs its own stanza.

So for files such as C:\var\xml\new_production_items\2-10240750-Qti.xml, the stanza would look like this:

[monitor://c:\var\xml\new_production_items\*.xml]
disabled = 0
followTail = 0
crcSalt = <SOURCE>
initCrcLength = 2048

This configuration goes in inputs.conf on the forwarder.
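
With many subfolders, writing one stanza per folder by hand gets tedious. Below is a minimal Python sketch that walks a root directory and prints one monitor stanza for each folder containing .xml files, reusing the settings from the stanza above. The monitor_stanzas helper and the SETTINGS dictionary are hypothetical names for this example, not part of Splunk; review the output before pasting it into inputs.conf.

```python
import os

# Settings copied from the stanza in this post.
SETTINGS = {
    "disabled": "0",
    "followTail": "0",
    "crcSalt": "<SOURCE>",
    "initCrcLength": "2048",
}


def monitor_stanzas(root, settings=SETTINGS, suffix=".xml"):
    """Return inputs.conf text with one [monitor://...] stanza per
    directory under root that directly contains a file ending in suffix."""
    stanzas = []
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()  # visit subfolders in a stable order
        if any(name.lower().endswith(suffix) for name in filenames):
            lines = ["[monitor://" + os.path.join(dirpath, "*" + suffix) + "]"]
            lines += [f"{key} = {value}" for key, value in settings.items()]
            stanzas.append("\n".join(lines))
    return "\n\n".join(stanzas) + "\n"


if __name__ == "__main__":
    # Root folder taken from the original post; adjust as needed.
    print(monitor_stanzas(r"c:\var\xml"))
```

Run it on the machine that holds the files so the generated paths match what the forwarder sees.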


briansutherland
Explorer

Thanks. On Windows, a separate entry is required for each directory, and then the 'initCrcLength' stanza error goes away!
