Monitoring Splunk

For the Indexer Capacity Planning phase of upgrading our Splunk instance, where can I find what impact running searches will have on indexer performance?

marrette
Path Finder

Indexer Capacity Planning - linking indexing and search performance: how does one affect the other?

I'm attempting to plan an upgrade of our Splunk instance from an ancient 6.4.1 to a brand new 7.2 instance and as part of that I'm trying to work out what sort of capacity I need...

This seems like it should be an easy task, as I'm already ingesting data into a Splunk setup and can get real stats on how many searches are being run and how much data is being ingested.

However, after reading the capacity planning documents, I can't find anything that indicates what impact running searches has on indexing performance. For example, there's the reference host specification, which gives an idea of what indexer performance I can expect if no searches are being made, and there is also a guide on the resources used when searches are run.

But nothing linking the two?

If I'm ingesting 400 GB of data per day and seem to be averaging about 10 concurrent searches per minute (during office hours), how will that impact the indexing rate of a reference-specification host?

1 Solution

FrankVl
Ultra Champion

Have a look at these documentation pages:
Regarding search impact on indexers: http://docs.splunk.com/Documentation/Splunk/latest/Capacity/Accommodatemanysimultaneoussearches

And guidelines regarding the number of instances needed for certain numbers of users vs. data ingestion volumes:
http://docs.splunk.com/Documentation/Splunk/latest/Capacity/Summaryofperformancerecommendations

0 Karma

marrette
Path Finder

Thanks for the information. I had read those pages previously and still find parts of them a bit vague, for example:
An indexer that meets the reference hardware requirements can ingest up to 300GB/day while supporting a search load.
How much 'search load'? It is documented that the reference hardware can support 1.7 TB of ingest if it has no search load, so there must be a sliding scale where search load reduces indexing performance?

The table further down that page also helps, but again it uses the vague figure of 'total users', which I can't find defined anywhere (and a 'user' could be someone running a single query once every 10 minutes or someone with a big dashboard open and updating regularly).
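
Taking the quoted 300 GB/day figure at face value, a first-pass indexer count is just a ceiling division. This is a minimal sketch, not a Splunk formula: the function name and the default are illustrative assumptions, and it deliberately ignores the sliding scale between search load and indexing rate that the docs leave undefined.

```python
import math


def indexers_needed(daily_ingest_gb: float, per_indexer_gb: float = 300.0) -> int:
    """Rough indexer count: daily ingest divided by per-indexer capacity, rounded up.

    per_indexer_gb defaults to the 300 GB/day rule of thumb quoted above
    (an assumption; adjust it for your own hardware and search load).
    """
    if daily_ingest_gb < 0 or per_indexer_gb <= 0:
        raise ValueError("ingest must be >= 0 and per-indexer capacity must be > 0")
    # Always plan for at least one indexer, even for a tiny ingest volume.
    return max(1, math.ceil(daily_ingest_gb / per_indexer_gb))


print(indexers_needed(400))  # 2
```

For my 400 GB/day this gives two indexers, but it is only a starting point to validate against real measurements.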

0 Karma

FrankVl
Ultra Champion

Yeah, there is always a massive "it depends" with all of this. The docs provide some rules of thumb, but it is impossible to give any hard formulas, I guess. As you correctly state, each user is different, but likewise each search is different. Performance also heavily depends on your data distribution, the type of data, the amount of extractions / automated lookups etc. etc.

In the end it comes down to choosing a sensible starting point based on the high-level guidelines and then closely monitoring performance to see if you need to scale out (which fortunately is relatively easy) or tune certain settings.

0 Karma

HiroshiSatoh
Champion

I think it can be judged from the number of CPU cores alone.
Since 10 cores are used for the 10 simultaneous searches and 1.5 cores are used for index processing, about 11.5 cores are needed in total, so delays should not occur on a host with at least 12 cores.
There should be no problem if the hardware specification and the deployment layout (1 search head, 2 indexers) follow the recommended configurations.
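
The core arithmetic above can be sketched as follows. This is only an illustration under the assumptions stated in this post (one core per concurrent search, 1.5 cores for indexing); the function names are hypothetical and real usage will vary with the searches and the data.

```python
def cores_required(concurrent_searches: int,
                   cores_per_search: float = 1.0,
                   indexing_cores: float = 1.5) -> float:
    """Estimated cores in use: search cores plus the indexing budget.

    Both defaults are assumptions taken from the rule of thumb in this post.
    """
    return concurrent_searches * cores_per_search + indexing_cores


def has_headroom(total_cores: int, concurrent_searches: int) -> bool:
    """True if the host has at least as many cores as the estimate needs."""
    return total_cores >= cores_required(concurrent_searches)


print(cores_required(10))    # 11.5
print(has_headroom(12, 10))  # True
```

With 10 concurrent searches the estimate is 11.5 cores, which is why a 12-core host is just enough under these assumptions.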

Also, if the data you import at once is large, tuning is necessary.

0 Karma

FrankVl
Ultra Champion

Then again: your current stats about the number of concurrent searches may be totally irrelevant if your new deployment has faster CPUs or faster storage, or if performance improvements in Splunk itself cause searches to complete much faster (and, as a result, your number of concurrent searches goes down considerably).

0 Karma