Splunk Enterprise

Splunk real-time data integrity check

esllorj
New Member

Hi splunkers, 

My client wants to run an integrity check on all of the indexes they collect.

So I added enableDataIntegrityControl=1 to every index stanza in indexes.conf,
and I created a script that runs the command SPLUNK_CMD check-integrity -index "$INDEX" for each index.
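For reference, a minimal sketch of such a wrapper in Python. It assumes Splunk is installed under /opt/splunk; the helper names build_check_command and check_index are hypothetical, and the only Splunk call used is the check-integrity -index command quoted above.

```python
import subprocess

# Assumption: Splunk lives under /opt/splunk; adjust for your install.
SPLUNK_CMD = "/opt/splunk/bin/splunk"


def build_check_command(index, splunk_cmd=SPLUNK_CMD):
    """Return the argument list that checks a single index."""
    return [splunk_cmd, "check-integrity", "-index", index]


def check_index(index, splunk_cmd=SPLUNK_CMD, runner=subprocess.run):
    """Run the check for one index and return (exit code, combined output).

    stderr is merged into stdout so that warnings and results stay in order.
    The runner parameter exists only so the call can be stubbed in tests.
    """
    proc = runner(
        build_check_command(index, splunk_cmd),
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        text=True,
        check=False,  # a failed check should be reported, not raised
    )
    return proc.returncode, proc.stdout
```

Calling check_index("linux_os") for each index name in turn and saving the returned output gives one report per index.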

But that's where the problem comes in. For indexes that are still receiving data in real time (e.g. linux_os logs, window_os logs), the check-integrity command fails.

The output looks like this:
server.conf/[sslConfig]/sslVerifyServerCert is false disabling certificate validation; must be set to "true" for increased security
disableSSLShutdown=0
Setting search process to have long life span: enable_search_process_long_lifespan=1
certificateStatusValidationMethod is not set, defaulting to none.
Splunk is starting with EC-SSC disabled
CMIndexId: New indexName=linux_os inserted, mapping to id=1
Operating on: idx=linux_os bucket='/opt/splunk/var/lib/splunk/linux_os/db/db_1737699472_1737699262_0'
Integrity check error for bucket with path=/opt/splunk/var/lib/splunk/linux_os/db/db_1737699472_1737699262_0, Reason=Journal has no hashes.
Operating on: idx=_audit bucket='/opt/splunk/var/lib/splunk/linux_os/db/hot_v1_1'
Total buckets checked=2, succeeded=1, failed=1
Loaded latency_tracker_log_interval with value=30 from stanza=health_reporter
Loaded aggregate_ingestion_latency_health with value=1 from stanza=health_reporter
aggregate_ingestion_latency_health with value=1 from stanza=health_reporter will enable the aggregation of ingestion latency health reporter.
Loaded ingestion_latency_send_interval_max with value=86400 from stanza=health_reporter
Loaded ingestion_latency_send_interval with value=30 from stanza=health_reporter

Is there a way to solve these problems?


livehybrid
Super Champion

Hi @esllorj 

In short, you cannot run an integrity check against buckets that were created before data integrity control was enabled. See the following community post: https://community.splunk.com/t5/Splunk-Enterprise/enable-integrity-control-on-splunk-6-3/m-p/266889#....

Credit to @dbhagi_splunk for their answer here:

Data Integrity Control feature & the corresponding settings/commands only apply to the data that is indexed after turning on this feature. It won't go ahead & generate hashes (or even check integrity) for pre-existing data.

So if "./splunk check-integrity -index [index_name]" returns the error below, the bucket was not created with the Data Integrity Control feature enabled. Either it was created before you enabled the feature (assuming you have only just turned it on for your index), or you haven't enabled the feature for index=index_name at all.

Error description "journal has no hashes": This indicates that journal is not created with hashes enabled.
Integrity check error for bucket with path=/opt/splunk/var/lib/splunk/index_name/db/db_1429532061_1429531988_278, Reason=Journal has no hashes.

The same applies to "./splunk generate-hash-files -index [index_name]".
You can generate hash files (that is, extract the hashes embedded in the journal) only for buckets that have data integrity control enabled. The command does not compute or create hashes for normal buckets without this feature. Say you enabled the feature and created a few buckets, but then lost the hash files of a particular bucket (someone modified or deleted them on disk). You can run this command so that it extracts the hashes again and writes them to the hash files (l1hashes_id_guid.dat, l2hash_id_guid.dat). Hope I answered all your questions.
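If it helps to tell "bucket predates the feature" apart from other failures, the check-integrity output can be post-processed. Below is a rough sketch in Python, assuming the output lines look like the ones quoted in this thread; the function name summarize and the result keys are made up for illustration and are not part of Splunk.

```python
import re

# Patterns based on the output lines quoted in this thread.
ERROR_RE = re.compile(
    r"Integrity check error for bucket with path=(?P<path>.+?), "
    r"Reason=(?P<reason>.+?)\.?\s*$"
)
TOTAL_RE = re.compile(
    r"Total buckets checked=(\d+), succeeded=(\d+), failed=(\d+)"
)


def summarize(output):
    """Group failed buckets by whether the reason is 'Journal has no hashes'."""
    no_hashes, other, totals = [], [], None
    for line in output.splitlines():
        err = ERROR_RE.search(line)
        if err:
            reason = err.group("reason")
            # Buckets without hashes were written without integrity control.
            bucket_list = no_hashes if reason.lower() == "journal has no hashes" else other
            bucket_list.append((err.group("path"), reason))
            continue
        tot = TOTAL_RE.search(line)
        if tot:
            totals = tuple(int(n) for n in tot.groups())  # (checked, succeeded, failed)
    return {"no_hashes": no_hashes, "other": other, "totals": totals}
```

Buckets listed under "no_hashes" are the ones described above: they were written without hashes, so the check cannot pass for them. Anything under "other" deserves a closer look.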

