
What is the source of SplitCompression ERROR messages?

Lowell
Super Champion

Can someone explain the normal source of these errors? I've seen these errors in both the search.log (in the dispatch folder) and when exporting a bucket using exporttool. Also, what is the appropriate action after these errors are encountered?

Example 1:

invalid read: addr=4ab60c4
ERROR SplitCompression - invalid separator: path=db_1267889549_1188869197_13/rawdata/2501302080.gz, offset=5623968, separator='l', expected='|'
ERROR SplitCompression - invalid separator: path=db_1267889549_1188869197_13/rawdata/2501302080.gz, offset=5623968, separator='l', expected='|'

Example 2:

invalid read: addr=4ab7148
ERROR SplitCompression - gzip seek failure: file=db_1267889549_1188869197_13/rawdata/2501302080.gz, hit unexpected EOF at 7279072 while trying to seek to 15375232
ERROR SplitCompression - gzip seek failure: file=db_1267889549_1188869197_13/rawdata/2501302080.gz, hit unexpected EOF at 7279072 while trying to seek to 15375232

Example 3:

ERROR databasePartitionPolicy - Could not read event: cd=235:137137517 index=_internal

Example 3 was taken from a search.log file in a different failure situation, but there were a number of errors like that intermixed with the invalid-separator messages.

1 Solution

jrodman
Splunk Employee

A SplitCompression error essentially means that the index files (tsidx) do not agree with the event text files (rawdata). This can result from a defect in Splunk, the operating system, or the hardware, or from a power failure.

Correlate the timestamps of the bucket and its related rawdata files with any crash log files in your var/log/splunk directory, as well as with any errors that landed in splunkd.log around those times. That may help identify a defect, if there is one. If you experienced system crashes or power loss around that time, that is most likely the cause.
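If it helps, here is a rough Python sketch of that correlation. It assumes the usual db_<latestTime>_<earliestTime>_<id> bucket directory naming and a default $SPLUNK_HOME layout; the bucket path below is just the one from the example errors, so adjust all paths for your environment.

# Rough sketch: line up a bucket's time range and file mtimes against crash logs.
# Assumes db_<latestEpoch>_<earliestEpoch>_<id> bucket naming and a default layout;
# the bucket path is only an example taken from the error messages above.
import glob
import os
from datetime import datetime

SPLUNK_HOME = os.environ.get("SPLUNK_HOME", "/opt/splunk")
BUCKET = os.path.join(SPLUNK_HOME, "var", "lib", "splunk", "defaultdb", "db",
                      "db_1267889549_1188869197_13")  # point this at the bucket from the error

def fmt(epoch):
    return datetime.fromtimestamp(int(epoch)).strftime("%Y-%m-%d %H:%M:%S")

# The bucket directory name encodes the newest and oldest event times it covers.
_, latest, earliest, _ = os.path.basename(BUCKET).split("_")
print(f"bucket covers {fmt(earliest)} .. {fmt(latest)}")

# When were the rawdata slices and tsidx files last written?
for path in sorted(glob.glob(os.path.join(BUCKET, "rawdata", "*.gz")) +
                   glob.glob(os.path.join(BUCKET, "*.tsidx"))):
    print(f"{fmt(os.path.getmtime(path))}  {path}")

# Any crash logs written around those times?
for path in sorted(glob.glob(os.path.join(SPLUNK_HOME, "var", "log", "splunk", "crash-*"))):
    print(f"{fmt(os.path.getmtime(path))}  {path}")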

databasePartitionPolicy is in charge of which buckets exist, where they should be located (warm/cold, etc.), and which should be searched. An error emitted by it could mean, for example, that a bucket was frozen during a search.

The fact that the tsidx files do not agree with the rawdata may indicate that you have experienced data loss, but in some cases the events may be written again regardless. This comes down to the index data updates not being synchronized atomically with the rawdata updates. A shutdown and restart might leave the two in a state that is inconsistent with each other even though the rawdata is complete for all events received.

In general, these messages merit investigation. Since some of the messages in splunkd.log are best interpreted in the context of both experience troubleshooting Splunk and the adjoining messages, you would be well served to engage support with a nice juicy diag and a request for root-cause analysis. If your local system administration team can correlate the errors to system events, though, that's probably your answer.


jrodman
Splunk Employee

That could definitely cause it.


Lowell
Super Champion

I suppose that having multiple copies of splunkd running concurrently could cause this problem as well? (That's a really annoying problem.)
