indexing issue with IIS logs (File will not be read, seekptr checksum did not match)

Contributor

I'm supporting a system where we have deployed servers that are uploading their IIS logs to a central location. The indexer is configured to monitor the central location where each deployed server has its own uniquely named folder structure. The deployed servers are configured to upload their IIS logs every 12 hours. The IIS logs are configured to roll every day, but because the servers are uploading the logs twice a day, that means each log should be updated at least once.

So far, we've not had any issues (that I'm aware of) with duplicate events. However, some logs are simply not being indexed, and checking the _internal log today, I noticed a lot of these entries for the "missing" logs:

File will not be read, seekptr checksum did not match (file=\FILESERVER\SHARE\DEPT\UNIQUESVRNAME\admin\iislogs\uex170518.log). Last time we saw this initcrc, filename was different. You may wish to use larger initCrcLen for this sourcetype, or a CRC salt on this source.

And also some of these, which I assume just means the total log length was shorter than the default 256 byte initCrcLength value?

File will not be read, is too small to match seekptr checksum (file=\FILESERVER\SHARE\DEPT\UNIQUESVRNAME\admin\iislogs\uex170515.log). Last time we saw this initcrc, filename was different. You may wish to use larger initCrcLen for this sourcetype, or a CRC salt on this source.

The vast majority of these logs are being indexed just fine. What do I need to do to clean up these outliers? Just set initCrcLength to something longer? I don't want any duplication, but I do want to be sure all of the logs are being indexed. I'm reading the documentation, but I don't really grasp how crcSalt and initCrcLength work well enough to know what to do with them, or whether they would actually solve this problem.

1 Solution

SplunkTrust

You get this error when different files share the same first 256 bytes (the default initCrcLength). One option is to increase initCrcLength for the input so that each day's file gets a unique CRC.
Assuming that your file names contain the date and that the files are only updated in place (either the whole content is replaced or new lines are added to the end of the file), you can instead use crcSalt = <SOURCE> (use that exact string). The CRC is then salted with the file path, so each day's file gets a unique CRC.
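
To make the two options concrete, here is a minimal Python sketch of the idea, not Splunk's actual implementation. It assumes zlib.crc32 as a stand-in checksum and assumes the salt is simply prepended to the bytes being checked; the function name initial_crc, the sample header and the /logs/... paths are all hypothetical.

```python
import zlib

def initial_crc(data: bytes, init_crc_length: int = 256, salt: str = "") -> int:
    """Checksum of the salt plus the first init_crc_length bytes (illustrative only)."""
    head = data[:init_crc_length]
    return zlib.crc32(salt.encode("utf-8") + head)

# Two hypothetical daily logs: identical 300-byte header, different events after it.
header = b"#Software: Microsoft Internet Information Services\r\n".ljust(300, b"#")
log_a = header + b"GET /a 200\r\n"
log_b = header + b"GET /b 200\r\n"

# Default length: only the shared header is checked, so the files look identical.
same_default = initial_crc(log_a) == initial_crc(log_b)

# Longer length: the check now reaches the differing events.
same_longer = initial_crc(log_a, 1024) == initial_crc(log_b, 1024)

# Salting with the (hypothetical) source path separates files even if the heads match.
same_salted = (
    initial_crc(log_a, salt="/logs/svr1/uex170515.log")
    == initial_crc(log_b, salt="/logs/svr1/uex170518.log")
)

print(same_default, same_longer, same_salted)
```

In this sketch the default check cannot tell the two logs apart, while either a longer check length or a per-path salt can. The trade-off with a path salt is that a file that is renamed or moved looks like a new file.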

Contributor

So since my IIS logs have all of this stuff at the top:

#Software: Microsoft Internet Information Services 5.1
#Version: 1.0
#Date: 2004-09-29 00:13:03
#Fields: time c-ip cs-method cs-uri-stem sc-status

Is that included in the initCrcLength calculation, or since my transform is configured to ignore anything beginning with #, does the length calculation start at the actual event that is indexed?

SplunkTrust

Yes, those header lines are included in the CRC. The CRC is calculated at input time on the forwarder, while the transform is applied later on the indexer or heavy forwarder, so ignoring those lines in a transform doesn't affect the CRC calculation.
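
Since the header counts toward the checked bytes, one way to reason about a suitable initCrcLength is to measure how far into two affected logs the contents stay identical. The sketch below is a hypothetical helper (first_difference is not a Splunk tool); the CRLF line endings and the second date are assumptions made for the example.

```python
from typing import Optional

def first_difference(a: bytes, b: bytes) -> Optional[int]:
    """Return the offset of the first differing byte, or None if the
    shorter input is a prefix of the longer one (or both are equal)."""
    for offset, (x, y) in enumerate(zip(a, b)):
        if x != y:
            return offset
    return None

# Header modelled on the one quoted above, with two hypothetical dates.
common = (
    b"#Software: Microsoft Internet Information Services 5.1\r\n"
    b"#Version: 1.0\r\n"
    b"#Date: 2004-09-2"
)
day_one = common + b"9 00:13:03\r\n"
day_two = common + b"8 00:13:03\r\n"

offset = first_difference(day_one, day_two)  # equals len(common)
print(offset)
```

If the offset for a real pair of logs is below 256, the default length already reaches the difference. If it is larger, or None, the two files look identical to the default check, and initCrcLength would need to exceed that offset.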

Contributor

These logs aren't being forwarded; the indexer monitors the files directly. Does that change your statement?

SplunkTrust

It still holds true: the CRC calculation and the transform are applied one after the other, by different components of the Splunk engine. After any change you make, you need to restart Splunk so that it re-enumerates the list of files to monitor and their CRCs.

SplunkTrust

It seems that some of your logs are being identified as duplicates because their initial CRC (cyclic redundancy check) matches that of a file Splunk has already seen. Have you already applied crcSalt = <SOURCE> to your input?
If setting crcSalt to <SOURCE> does not work, you may have to increase initCrcLength. Refer to the documentation: https://docs.splunk.com/Documentation/Splunk/latest/Admin/Inputsconf

Also check out the following answers, which discuss adding a fixed string as the salt to make files unique, instead of the complete source path via <SOURCE>.
https://answers.splunk.com/answers/35210/crcsalt-issue.html
https://answers.splunk.com/answers/186232/how-to-configure-inputsconf-to-apply-crcsalt-for-o.html

Contributor

If I increase the initCrcLength setting, will Splunk automatically re-read the files it skipped or do I have to do something to get it to retry?

SplunkTrust

Manually move the files to a separate location where Splunk will not read them. Once crcSalt = <SOURCE> is in place, copy the files back into the folder being monitored.
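
To work out which files need to be moved and copied back, the paths can be pulled out of the warnings quoted in the question. This is a rough Python sketch that assumes the warning text has been exported to plain lines; the skipped_files helper and the regular expression are illustrative, written against the two messages shown above rather than any documented log format.

```python
import re
from typing import List

# Matches the "(file=...)" part of the warnings quoted in the question.
SKIPPED = re.compile(r"File will not be read, .*?\(file=([^)]+)\)")

def skipped_files(log_lines: List[str]) -> List[str]:
    """Return the unique file paths named in the warnings, in first-seen order."""
    seen = {}
    for line in log_lines:
        match = SKIPPED.search(line)
        if match:
            seen.setdefault(match.group(1), None)
    return list(seen)

example = [
    r"File will not be read, seekptr checksum did not match (file=\FILESERVER\SHARE\DEPT\UNIQUESVRNAME\admin\iislogs\uex170518.log). Last time we saw this initcrc, filename was different.",
    r"File will not be read, is too small to match seekptr checksum (file=\FILESERVER\SHARE\DEPT\UNIQUESVRNAME\admin\iislogs\uex170515.log). Last time we saw this initcrc, filename was different.",
]

paths = skipped_files(example)
for path in paths:
    print(path)
```

The resulting list is only a starting point for the manual move; it says nothing about whether a given file was partially indexed before it was skipped.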