Background:
I have a client with a large clustered environment. I recently upgraded it to 9.4.6 and fixed the KV store (wiredTiger / MongoDB 7.0.14) on their indexers. During the remediation work I corrected historical incorrect settings such as the GUI being enabled on the indexer tier. I want to undertake further best-practice remediation and disable the KV store on the indexers in order to save resources and prevent unnecessary services running.
Challenge:
I understand that it is possible / viable to enable a KV store collection on the peers (indexers) in certain use cases where it can add value. I do not see a use case for my client and they are not aware of one. I have checked the peers and each one reports itself as an individual KV store captain:
splunk show kvstore-status --verbose
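If it is easier to sweep every peer remotely rather than running the CLI on each one, the same status information is exposed over REST; a minimal sketch, assuming you substitute real credentials and each indexer's management hostname:
curl -k -u <splunk-user> https://<indexer>:8089/services/kvstore/status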
Checks:
I do not want to disable the KV stores (via the CM and config distribution) until I have verified their contents. I can issue this command via the CLI to list them:
./splunk search '| rest /servicesNS/-/-/data/transforms/lookups splunk_server=local | search type=kvstore | fields title, collection, id'
- This outputs a tidy 3 column list in the CLI.
- Example is the Linux TA:
| title | collection | id |
| --- | --- | --- |
| auditd_host_inventory | auditd_host_inventory | https://127.0.0.1:8089/servicesNS/nobody/TA-linux_auditd/data/transforms/lookups/auditd_host_invento... |
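Since the owning app turns out to matter when reading these collections (the namespace issue below), it can be pulled from the same | rest call; a hedged variation on the command above, assuming the standard eai:acl.app field that | rest returns:
./splunk search '| rest /servicesNS/-/-/data/transforms/lookups splunk_server=local | search type=kvstore | table title, collection, "eai:acl.app"'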
I can then check the linux TA Kvstore via this command:
curl -k -u <splunk-user> https://localhost:8089/services/search/v2/jobs/export -d search=" | inputlookup auditd_host_inventory"
This outputs a list of circa 400 results.
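If confirming that a collection holds data is enough, rather than streaming back the full ~400 rows, the same export call can be reduced to a row count; a sketch (output_mode=csv simply keeps the response compact):
curl -k -u <splunk-user> https://localhost:8089/services/search/v2/jobs/export -d search=" | inputlookup auditd_host_inventory | stats count" -d output_mode=csv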
Guidance:
- There is a list of about 8 other KVstores from my first command.
- I am struggling to issue the same command for the other KV stores.
- I think this is due to the app context / namespace and sharing.
- I am getting myself wrapped up in the correct REST endpoint syntax to simply issue the same command for each of the remaining 8 KV stores, e.g.:
title: snow_sys_user_list_lookup
collection: snow_sys_user_list_kvstore_lookup
Any guidance on this, and on checking or disabling KV store on peers in general, gratefully received. The aim is to protect the client's environment and give them confidence that I have checked before disabling.
Hi @NullZero
You could try adding the `namespace` / app ID to the services/search/v2/jobs/export request such as:
curl -k -u <splunk-user> https://localhost:8089/services/search/v2/jobs/export -d search=" | inputlookup snow_sys_user_list_lookup" -d namespace="Splunk_TA_snow"
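If the same check needs repeating for every collection, a small shell loop along these lines may help; this is illustrative only, and the lookup:app pairs shown are assumptions to be replaced with the output of the | rest command above:
# Illustrative only - substitute the real title:app pairs from your | rest output
for pair in "auditd_host_inventory:TA-linux_auditd" "snow_sys_user_list_lookup:Splunk_TA_snow"; do
  lookup="${pair%%:*}"; app="${pair##*:}"
  curl -k -u <splunk-user> "https://localhost:8089/services/search/v2/jobs/export" \
    -d search=" | inputlookup ${lookup} | stats count" \
    -d namespace="${app}" -d output_mode=csv
done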
The other question here is why KV stores aren't turned off in the PS Base apps on G Drive? Does anybody know who we can contact to improve that?
So the addition of namespace worked @livehybrid, but the results were actually fairly hard to work with. For example, we really only want to return the header row; the namespace, title, app context etc. are all rather fiddly.
I think this demonstrates a proper attempt to investigate, however, and I now have the confidence to inform my client.
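For reference, one way to keep the export output down to just the header row is to request CSV and trim it on the shell side; a sketch (head 1 in the SPL limits the rows, and the first line of the CSV is the header):
curl -s -k -u <splunk-user> https://localhost:8089/services/search/v2/jobs/export -d search=" | inputlookup snow_sys_user_list_lookup | head 1" -d namespace="Splunk_TA_snow" -d output_mode=csv | head -1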
Thanks @VatsalJagani, I appreciate the feedback. I agree, but with a client, is it not appropriate to demonstrate that you've checked and have evidence? It's not a strong answer if something were to go wrong, for example.
@NullZero - Splunk Docs are the proof:
And if, let's say, you have some data from the past inside the KV store lookups which might need to be recovered in the future, you can always re-enable the KV store and get it back.
I hope this helps!!!
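As a belt-and-braces step before disabling, the KV store on each peer can also be backed up first, which makes the re-enable path above even safer; a sketch using the standard backup command (the archive name is an assumption, and the archive typically lands under $SPLUNK_HOME/var/lib/splunk/kvstorebackup):
./splunk backup kvstore -archiveName kvstore_pre_disable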
@VatsalJagani - That is not a viable approach for a customer: turn it off in prod, see if someone shouts, and then just turn it back on and shrug. We need to prove that we have checked it and show that we're operating in the best way possible.
I'm about to try the namespace addition to the REST search; that may well be the fix I was looking for. Thanks @livehybrid.
But as @VatsalJagani said - there is no replication of KVstore between SH and indexer layers. True, if the particular collection is set to replicate, _its contents_ get replicated to indexers but they do that by pushing a csv file with a dump of the collection data. There it gets treated like a normal csv-backed lookup.
There is no clustering (and clustering would be required for replication) of KVstores between SH tier and indexer tier. There is not even a clustering of KVstores at indexer tier if they are enabled there - each indexer runs its own 1-member mongo cluster.
@NullZero you mentioned that auditd_host_inventory on the indexers has a number of records; it looks like the default for the collections in the TA-linux_auditd app is for them to replicate to the indexers:
# collections.conf
[auditd_host_inventory]
replicate = true
[learnt_posix_identities]
replicate = true
Therefore I am not sure whether disabling the KV Store on the indexers, when this is set on the SHC, would cause any issues?
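To confirm which collections actually carry replicate = true on the search head tier, btool can list the merged collections.conf; a sketch, intended to be run on an SHC member:
$SPLUNK_HOME/bin/splunk btool collections list --debug | grep -E '\[|replicate'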
@NullZero - The simple answer is that you do not need the KV store on indexers.
So regardless of what you have on the indexer, it is currently not being utilised, and you are safe to disable it unless you are using the indexers directly in some unconventional way.
I hope this helps!!!
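For the disable step itself, the usual mechanism is a server.conf setting distributed to the peers and followed by a restart; a sketch of the standard stanza (delivering it via an app pushed from the Cluster Manager is an assumption about how you would roll it out):
# server.conf on each indexer, e.g. in an app distributed from the Cluster Manager
[kvstore]
disabled = true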
I can think of one very borderline case (actually a badly engineered environment): when you have an input running on an indexer, and that input uses the KV store to store state. I can't tell off the top of my head which add-ons did that, but there were ones out in the wild that would. But again - this is a badly engineered environment, since you shouldn't run such stuff on an indexer. The right way is to set up a separate HF for this.