Why is my log file sometimes ignored?

twinspop
Influencer

Self-answered question follows. Perhaps it will help someone else in the same boat.

I have a file called portal-server.log on a log server (an NFS mount written to by many machines) that periodically stops being indexed after a roll. The internal logs show:

09-30-2016 18:26:33.435 -0400 ERROR TailingProcessor - File will not be read, seekptr checksum did not match (file=/var/logs/host1048/portal-server.log).  Last time we saw this initcrc, filename was different.  You may wish to use a CRC salt on this source.  Consult the documentation or file a support case online at http://www.splunk.com/page/submit_issue for more info.

I tried changing initCrcLength, but the problem returned. (And I steered clear of using crcSalt.) I checked the number of files on the log server and the health of the NFS mount. So many avenues, all leading to dead ends.

What is going on? Answer below...


twinspop
Influencer

I asked the user for the first few lines of the files, thinking maybe there was a header that an initCrcLength adjustment would fix. No, it was plain old syslog:

2016-09-30 00:00:00,836 WARN  - [APPID: ] [TXID: ] [UID: ] [ORGOID: ] [AOID: ] [UA_MODE: ] - com.cs.services.ws.handlers.somehandler.handleMessage(): HTTP Header OrgOID is NOT present in the header

Then it hit me. I quickly searched for that exact log line:

index=problem_index earliest=@d latest=@d+1m "2016-09-30 00:00:00,836 WARN" orgoid is not present

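And there it was. But in another source! The error from the internal logs was legit. There was another log with IDENTICAL content, so the checksum of the first bytes of portal-server.log matched a file Splunk had already seen under a different name, and it skipped the file. Turns out someone on the developer team had added another appender to the log4j config.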
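For anyone who wants to check their own files for the same situation, here is a minimal Python sketch of the idea. It is not Splunk's actual implementation: it simply takes a CRC32 of the first 256 bytes of each file (256 being the default initCrcLength) and reports files whose heads collide. The function names head_crc and find_head_collisions are made up for this illustration.

```python
import zlib
from collections import defaultdict


def head_crc(path, length=256):
    """CRC32 of the first `length` bytes of a file.

    Assumption: this only approximates the initial-CRC check; the real
    checksum algorithm and short-file handling may differ.
    """
    with open(path, "rb") as f:
        return zlib.crc32(f.read(length))


def find_head_collisions(paths, length=256):
    """Group files whose first `length` bytes produce the same checksum."""
    groups = defaultdict(list)
    for p in paths:
        groups[head_crc(p, length)].append(str(p))
    # Only groups with more than one file are potential collisions.
    return [sorted(g) for g in groups.values() if len(g) > 1]
```

Pointing find_head_collisions at the monitored files (for example, everything under one host's log directory) should list any pairs that start with identical content. If a pair shows up, the choice is between removing the duplicate log at its origin, as happened here, or making the files distinguishable to the monitor.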


plaid_blanket
Explorer

Nine years later, same problem. You saved me, thanks. /var/log/secure and /var/log/messages were both being monitored, and both had the same log line at the beginning.

