Solved: Question about the replication factor in searchhead clustering

munang · 01-30-2024

Hello. I am a Splunk newbie.

I have a question about the replication factor in search head clustering.

The docs say that search artifacts are replicated only for scheduled saved searches.

https://docs.splunk.com/Documentation/Splunk/9.1.2/DistSearch/ChooseSHCreplicationfactor

I'm curious about the reason for, and the advantage of, replicating search artifacts only in this case.

Also, in the case of a real-time search, is it correct that the search artifacts are not replicated and remain only on the local server?

In that case, in a clustered environment, member 2 should not be able to see the search results of member 1.
But I can view them by using the loadjob command on member 2.

Then, wouldn’t it be possible to view real-time search artifacts as well?

Thank you

PickleRick · 01-30-2024

Real-time searches are eeeeeevil and generally should not be used at all. Also, there's not much to replicate in the case of a real-time search, since it runs in real time: if you ran it again, you'd be running it against a different set of data.

But if you meant ad-hoc searches, I think the assumption is that they are used interactively, so you're probably not going to need the results in another session, loaded with loadjob by another person logged in to another SHC member. With scheduled searches it's different, because a quite common way to optimize load is to schedule separate searches asynchronously so that one search uses the results of another search that has already run.

So it's simply that replication makes much more sense in some cases than in others.
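
To make the scheduled-search pattern above concrete, here is a minimal sketch. It assumes a scheduled saved search called "Hourly Error Summary", owned by admin in the search app, that outputs count and host fields. All of those names are hypothetical and not from this thread. A second search can pick up the latest artifact of that scheduled search instead of scanning the raw data again:

```spl
| loadjob savedsearch="admin:search:Hourly Error Summary"
| stats sum(count) AS total_errors BY host
```

Since artifacts of scheduled searches are the ones that get replicated, a search like this should be able to run from whichever SHC member the user happens to land on.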


munang · 01-31-2024

@PickleRick

Sorry for the late reply.

Thank you for the clear explanation. I understand!!!


