Hi Splunk Community,
I'm new to Splunk and working on a deployment where we index large volumes of data (approximately 500GB/day) across multiple sources, including server logs and application metrics. I've noticed that some of our searches are running slowly, especially when querying over longer time ranges (e.g., 7 days or more).
Here’s what I’ve tried so far:
Used summary indexing for some repetitive searches.
Limited the fields returned by searches using the fields command.
Ensured searches are using indexed fields where possible.
However, performance is still not ideal, and I’m looking for advice on:
Best practices for optimizing search performance in Splunk for large datasets.
How to effectively use data models or accelerated reports to improve query speed.
Any configuration settings (e.g., in limits.conf) that could help.
My setup:
Splunk Enterprise 9.2.1
Distributed deployment with 1 search head and 3 indexers
Data is primarily structured logs in JSON format
Any tips, configuration recommendations, or resources would be greatly appreciated! Thanks in advance for your help.
Hi @zaks191 ,
Please consider the points below to improve performance in your environment.
1. Be Specific in Searches: Always use index= and sourcetype= and add unique terms early in your search string to narrow down data quickly.
2. Filter Early, Transform Late: Place filtering commands (like search, where) at the beginning and transforming commands (stats, chart) at the end of your SPL - see the first sketch after this list.
3. Leverage Index-Time Extractions: Ensure critical fields are extracted at index time for faster searching, especially with JSON data.
4. Utilize tstats: For numeric or indexed data, tstats is highly efficient as it operates directly on pre-indexed data (.tsidx files), making it much faster than search | stats - see the tstats sketch after this list.
5. Accelerate Data Models: Define and accelerate data models for frequently accessed structured data. This pre-computes summaries, allowing tstats searches to run extremely fast.
6. Accelerate Reports: For specific, repetitive transforming reports, enable report acceleration to store pre-computed results.
7. Minimize Wildcards and Regex: Avoid leading wildcards (*term) and complex, unanchored regular expressions, as they are resource-intensive.
8. Optimize Lookups: For large lookups, consider KV Store lookups or pre-generate summaries via scheduled searches.
9. Use the Job Inspector: Regularly analyze slow searches with the Job Inspector to pinpoint bottlenecks (e.g., search head vs. indexer processing).
10. Review limits.conf (Carefully): While not a primary fix, review settings like max_mem_usage_mb or max_keymap_rows in limits.conf after monitoring resource usage, but proceed with caution and thorough testing.
11. Set Up Alerts for Expensive Searches: Use internal metrics to detect problematic searches - see the last sketch after this list.
12. Monitor and Limit User Search Concurrency: Users running unbounded or wide time-range ad hoc searches can harm performance.
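To make points 1 and 2 concrete, here is a rough before/after sketch - the index, sourcetype and field names are made up, so adjust them to your own data:

```
index=* "error"
| search status=500 app="payments"
| stats count by host
```

versus pushing everything into the base search:

```
index=web sourcetype=access_json status=500 app="payments" "error"
| stats count by host
```

The second version restricts which indexes and buckets get scanned and lets the indexers discard non-matching events up front, which is what brings scanCount (visible in the Job Inspector) down.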
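For points 4 and 5, a tstats search could look something like the sketches below. The first one only uses fields that are always indexed (index, sourcetype, host); the second assumes an accelerated data model named Web with a root event dataset Web that contains a status field - those names are hypothetical, so substitute your own:

```
| tstats count where index=web by sourcetype, _time span=1h
```

```
| tstats summariesonly=true count
    from datamodel=Web.Web
    where Web.status=500
    by Web.host, _time span=1h
```

Note that summariesonly=true limits results to what has already been summarised, so figures can be incomplete while acceleration is still backfilling.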
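And for point 11, a simple scheduled search over the _audit index can flag expensive searches - the 300-second threshold here is arbitrary, so tune it to your environment:

```
index=_audit action=search info=completed total_run_time>300
| stats count AS searches avg(total_run_time) AS avg_runtime max(total_run_time) AS max_runtime by user
| sort - max_runtime
```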
Happy Splunking
3, 4 and partially 7 - not really.
3. Indexed fields - unless they contain additional metadata not present in the original events - are usually best avoided entirely. There are other ways of achieving the same result.
4. You can't use tstats instead of a stats-based search just because the field is a number. It requires specific types of data. It's true, though, that if you can use tstats instead of normal stats, it's way faster.
7. Wildcards at the beginning of a search term should not just be "avoided" - they should not be used at all unless you have a very, very good reason for using them, know and understand the performance impact, and can significantly limit the set of events searched through by other means. The remark about regexes is generally valid, but this is most often not the main reason for performance problems.
Are you doing indexed extractions on the JSON data? That's not such a good idea, as it can bloat your index with stuff you don't need there.
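Just to illustrate the "other ways" I mentioned: for JSON you normally let Splunk parse the fields at search time instead, which is a one-line props.conf setting (the sourcetype name here is made up):

```
# props.conf - search-time JSON field extraction
[my:json:sourcetype]
KV_MODE = json
```

INDEXED_EXTRACTIONS = json, by contrast, writes every extracted field into the tsidx files, which is exactly the bloat I'm talking about.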
The question is not about "optimising for large datasets", it's more about using the right queries for the data you have, large or small.
I suggest you post some example queries you have, as the community can offer some advice on whether they are good or not so good - use the code block syntax button above <>
For reference, see my post in another thread about performance.
As @PickleRick says, the Job Inspector is your friend (look at scanCount) - reducing that number will improve your searches.
Use subsearches sparingly, and avoid join and transaction - they are almost never necessary (see the sketch below). Summary indexing itself will not necessarily speed up your searches, particularly if the search that creates the summary index is bad and the search that searches the summary index is also bad.
A summary index does not mean faster - it's just another index with data and you can still write bad searches against that.
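To show what I mean about join, here's a rough sketch with made-up index and field names. Something like this:

```
index=web sourcetype=access_json
| join type=inner session_id
    [ search index=app sourcetype=app_json
      | fields session_id, error_code ]
| stats count by error_code
```

can usually be rewritten as a single pass with stats, which also avoids the subsearch result limits that silently truncate join output:

```
(index=web sourcetype=access_json) OR (index=app sourcetype=app_json)
| stats dc(index) AS sources values(error_code) AS error_code by session_id
| where sources=2
| stats count by error_code
```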
Please share some of your worst searches and we can try to help.
1. 500GB/day is not that big 😉
2. There are some general rules of thumb (which @livehybrid already covered) but the search - to be effective - must be well built from scratch. Sometimes it simply can't be "fixed" if you have bad data (not "wrong", just inefficiently formed).
3. And there is no replacement for experience, unfortunately. Learn SPL commands, understand how they work, sometimes rethink your problem to fit better into SPL processing.
4. Use and love job inspector.
Hi @zaks191
Do all your servers meet the minimum recommendations (16GB RAM / 16 CPU cores)? If so, then your indexer configuration should suffice for 500GB/day of ingestion.
It sounds like this is the sort of task that would be better with the support of a Splunk Partner or Splunk Professional Services, but if tackling it yourself then I would start with the following non-exhaustive list of query optimization techniques:
Avoid wildcards in base searches; use specific terms or tags.
🌟 Did this answer help you? If so, please consider leaving feedback - it encourages the volunteers in this community to continue contributing.