
What is the source of SplitCompression ERROR messages?

Lowell
Super Champion

Can someone explain the normal source of these errors? I've seen these errors in both the search.log (in the dispatch folder) and when exporting a bucket using exporttool. Also, what is the appropriate action after these errors are encountered?

Example 1:

invalid read: addr=4ab60c4
ERROR SplitCompression - invalid separator: path=db_1267889549_1188869197_13/rawdata/2501302080.gz, offset=5623968, separator='l', expected='|'
ERROR SplitCompression - invalid separator: path=db_1267889549_1188869197_13/rawdata/2501302080.gz, offset=5623968, separator='l', expected='|'

Example 2:

invalid read: addr=4ab7148
ERROR SplitCompression - gzip seek failure: file=db_1267889549_1188869197_13/rawdata/2501302080.gz, hit unexpected EOF at 7279072 while trying to seek to 15375232
ERROR SplitCompression - gzip seek failure: file=db_1267889549_1188869197_13/rawdata/2501302080.gz, hit unexpected EOF at 7279072 while trying to seek to 15375232

Example 3:

ERROR databasePartitionPolicy - Could not read event: cd=235:137137517 index=_internal

Example 3 was taken from a search.log file in a different failure situation. But there were a bunch of errors like that intermixed with a bunch of the invalid separator messages.


jrodman
Splunk Employee

A SplitCompression error essentially means that the index files (tsidx) do not agree with the event text files (rawdata). This can result from a defect in Splunk, the operating system, or the hardware, or from a power failure.
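
For orientation, both halves live side by side inside the same bucket directory. The listing below is purely illustrative (the exact file names, and in particular the tsidx name, are examples and vary by Splunk version); it only shows the relationship between the tsidx files and the rawdata slices named in the error messages:

# Illustrative bucket layout; file names are examples, not taken from this thread.
$ ls db_1267889549_1188869197_13
1267889549-1188869197-13.tsidx        <- time-series index consulted by searches
Hosts.data  Sources.data  SourceTypes.data  Strings.data
rawdata/                              <- compressed event text
$ ls db_1267889549_1188869197_13/rawdata
0.gz  2501302080.gz                   <- slices named by byte offset into the raw stream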

Correlate the timestamps of the bucket and its rawdata files with any crash_log files in your var/log/splunk directory, as well as with any errors that landed in splunkd.log around those times. Perhaps this will help identify a defect (if any). If you experienced system crashes or power loss around that time, that's most likely the cause.
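
As a rough sketch of that correlation (the bucket name and slice path are taken from the examples above; the commands are generic GNU/Linux tools rather than a Splunk-specific procedure), something like this can narrow down when the damage happened:

# The two epochs in the bucket name are the newest and oldest event times it covers.
date -d @1267889549    # newest event in the bucket
date -d @1188869197    # oldest event in the bucket

# When were the damaged slice and the tsidx files last written?
ls -l --full-time db_1267889549_1188869197_13/rawdata/2501302080.gz
ls -l --full-time db_1267889549_1188869197_13/*.tsidx

# Any crashes or related splunkd errors around that write time?
ls -l $SPLUNK_HOME/var/log/splunk/crash-*.log
grep -i error $SPLUNK_HOME/var/log/splunk/splunkd.log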

databasePartitionPolicy is in charge of which buckets exist, where they should be located (warm/cold, etc.), and which should be searched. An error emitted by it could mean, for example, that a bucket was frozen while a search was running.

That the tsidx files do not agree with the rawdata may indicate that you have experienced data loss, but in some cases the events may be written again regardless. This comes down to the index updates not being synchronized atomically with the rawdata updates: a shutdown and restart might leave the two in a state that looks inconsistent, yet is still fully consistent for all events received.

In general, these messages merit investigation. Since some of the messages in splunkd.log are best interpreted in the context of both experience troubleshooting Splunk and the adjoining messages, you would be well served to engage Support with a nice juicy diag and a request for root cause analysis. If your local system administration team can correlate the errors to system events, though, that's probably your answer.

Lowell
Super Champion

I suppose that having multiple copies of splunkd running concurrently could cause this problem as well? (That's a really annoying problem.)

jrodman
Splunk Employee

That could definitely cause it.