<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: How to Optimize Search Performance for Large-Scale Data in Splunk? in Getting Data In</title>
    <link>https://community.splunk.com/t5/Getting-Data-In/How-to-Optimize-Search-Performance-for-Large-Scale-Data-in/m-p/750386#M119224</link>
    <description>&lt;P&gt;3, 4 and partially 7 - not really.&lt;/P&gt;&lt;P&gt;3. Indexed fields - unless they contain additional metadata not present in the original events - are usually best avoided entirely. There are other ways of achieving the same result.&lt;/P&gt;&lt;P&gt;4. You can't use tstats instead of a stats-based search just because the field is a number. It requires specific types of data. It is true, though, that&amp;nbsp;&lt;EM&gt;if you can&lt;/EM&gt; use tstats instead of normal stats, it's way faster.&lt;/P&gt;&lt;P&gt;7. Wildcards at the beginning of a search term should not be "avoided"; they should not be used at all unless you have a very, very good reason for using them, know and understand the performance impact, and can significantly limit the set of events searched using other means. The remark about regexes is generally valid, but this is most often not the main reason for performance problems.&lt;/P&gt;</description>
    <pubDate>Thu, 24 Jul 2025 09:42:54 GMT</pubDate>
    <dc:creator>PickleRick</dc:creator>
    <dc:date>2025-07-24T09:42:54Z</dc:date>
    <item>
      <title>How to Optimize Search Performance for Large-Scale Data in Splunk?</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/How-to-Optimize-Search-Performance-for-Large-Scale-Data-in/m-p/750130#M119182</link>
      <description>&lt;P&gt;Hi Splunk Community,&lt;/P&gt;&lt;P&gt;I'm new to Splunk and working on a deployment where we index large volumes of data (approximately 500GB/day) across multiple sources, including server logs and application metrics. I've noticed that some of our searches are running slowly, especially when querying over longer time ranges (e.g., 7 days or more).&lt;/P&gt;&lt;P&gt;Here’s what I’ve tried so far:&lt;/P&gt;&lt;P&gt;Used summary indexing for some repetitive searches.&lt;/P&gt;&lt;P&gt;Limited the fields in searches using fields command.&lt;/P&gt;&lt;P&gt;Ensured searches are using indexed fields where possible.&lt;/P&gt;&lt;P&gt;However, performance is still not ideal, and I’m looking for advice on:&lt;/P&gt;&lt;P&gt;Best practices for optimizing search performance in Splunk for large datasets.&lt;/P&gt;&lt;P&gt;How to effectively use data models or accelerated reports to improve query speed.&lt;/P&gt;&lt;P&gt;Any configuration settings (e.g., in limits.conf) that could help.&lt;/P&gt;&lt;P&gt;My setup:&lt;/P&gt;&lt;P&gt;Splunk Enterprise 9.2.1&lt;/P&gt;&lt;P&gt;Distributed deployment with 1 search head and 3 indexers&lt;/P&gt;&lt;P&gt;Data is primarily structured logs in JSON format&lt;/P&gt;&lt;P&gt;Any tips, configuration recommendations, or resources would be greatly appreciated! Thanks in advance for your help.&lt;/P&gt;</description>
      <pubDate>Sun, 20 Jul 2025 05:57:12 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/How-to-Optimize-Search-Performance-for-Large-Scale-Data-in/m-p/750130#M119182</guid>
      <dc:creator>zaks191</dc:creator>
      <dc:date>2025-07-20T05:57:12Z</dc:date>
    </item>
    <item>
      <title>Re: How to Optimize Search Performance for Large-Scale Data in Splunk?</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/How-to-Optimize-Search-Performance-for-Large-Scale-Data-in/m-p/750132#M119183</link>
      <description>&lt;P&gt;Hi&amp;nbsp;&lt;a href="https://community.splunk.com/t5/user/viewprofilepage/user-id/311764"&gt;@zaks191&lt;/a&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Do all your servers meet the minimum &lt;A href="https://help.splunk.com/en/splunk-enterprise/get-started/deployment-capacity-manual/9.4/performance-reference/reference-hardware" target="_self"&gt;recommendations&lt;/A&gt; (16GB RAM / 16 CPU cores)? If so, then your indexer configuration should suffice for 500GB/day of ingestion.&lt;/P&gt;&lt;P&gt;It sounds like this is the sort of task that would be better with the support of a Splunk Partner or Splunk Professional Services, but if you are tackling it yourself, then I would start with the following&lt;STRONG&gt;&amp;nbsp;non-exhaustive list of query optimization techniques&lt;/STRONG&gt;:&lt;/P&gt;&lt;UL&gt;&lt;LI&gt;Limit queries to only the time range required&lt;/LI&gt;&lt;LI&gt;Ensure scheduled searches are not running more frequently than necessary&lt;/LI&gt;&lt;LI&gt;Ensure dashboards utilise base searches where possible&lt;/LI&gt;&lt;LI&gt;Ensure dashboards do not refresh/reload faster than necessary&lt;/LI&gt;&lt;LI&gt;Use &lt;A href="https://conf.splunk.com/files/2020/slides/PLA1089C.pdf" target="_self"&gt;techniques&lt;/A&gt; such as tstats queries where possible&lt;/LI&gt;&lt;LI&gt;Add TERM(&amp;lt;string&amp;gt;) values to your searches to help indexers find data faster&lt;/LI&gt;&lt;LI&gt;Avoid wildcards in base searches; use specific terms or tags&lt;/LI&gt;&lt;/UL&gt;&lt;P&gt;&lt;span class="lia-unicode-emoji" title=":glowing_star:"&gt;🌟&lt;/span&gt;&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;STRONG&gt;Did this answer help you?&lt;/STRONG&gt;&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;If so, please consider:&lt;/P&gt;&lt;UL&gt;&lt;LI&gt;Adding karma to show it was useful&lt;/LI&gt;&lt;LI&gt;Marking it as the solution if it resolved your issue&lt;/LI&gt;&lt;LI&gt;Commenting if you need any clarification&lt;/LI&gt;&lt;/UL&gt;&lt;P&gt;Your feedback encourages the volunteers in this community to continue contributing.&lt;/P&gt;</description>
      <pubDate>Sun, 20 Jul 2025 06:39:49 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/How-to-Optimize-Search-Performance-for-Large-Scale-Data-in/m-p/750132#M119183</guid>
      <dc:creator>livehybrid</dc:creator>
      <dc:date>2025-07-20T06:39:49Z</dc:date>
    </item>
    <item>
      <title>Re: How to Optimize Search Performance for Large-Scale Data in Splunk?</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/How-to-Optimize-Search-Performance-for-Large-Scale-Data-in/m-p/750136#M119184</link>
      <description>&lt;P&gt;1. 500GB/day is not that big &lt;span class="lia-unicode-emoji" title=":winking_face:"&gt;😉&lt;/span&gt;&lt;/P&gt;&lt;P&gt;2. There are some general rules of thumb (which&amp;nbsp;&lt;a href="https://community.splunk.com/t5/user/viewprofilepage/user-id/170906"&gt;@livehybrid&lt;/a&gt;&amp;nbsp;already covered), but the search - to be effective - must be well built from scratch. Sometimes it simply can't be "fixed" if you have bad data (not "wrong", just inefficiently formed).&lt;/P&gt;&lt;P&gt;3. And there is no replacement for experience, unfortunately. Learn SPL commands, understand how they work, and sometimes rethink your problem so that it fits better into SPL processing.&lt;/P&gt;&lt;P&gt;4. Use and love the Job Inspector.&lt;/P&gt;</description>
      <pubDate>Sun, 20 Jul 2025 07:26:06 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/How-to-Optimize-Search-Performance-for-Large-Scale-Data-in/m-p/750136#M119184</guid>
      <dc:creator>PickleRick</dc:creator>
      <dc:date>2025-07-20T07:26:06Z</dc:date>
    </item>
    <item>
      <title>Re: How to Optimize Search Performance for Large-Scale Data in Splunk?</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/How-to-Optimize-Search-Performance-for-Large-Scale-Data-in/m-p/750155#M119188</link>
      <description>&lt;P&gt;Are you doing indexed extractions on the JSON data? That's not such a good idea, as it can bloat your index with stuff you don't need there.&lt;/P&gt;&lt;P&gt;The question is not about "&lt;STRONG&gt;optimising for large datasets&lt;/STRONG&gt;"; it's more about using the right queries for the data you have, large or small.&lt;/P&gt;&lt;P&gt;I suggest you post some example queries you have, as the community can offer some advice on whether they are good or not so good - use the code block syntax button above &amp;lt;&amp;gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;See my post in another thread about performance:&lt;/P&gt;&lt;P&gt;&lt;A href="https://community.splunk.com/t5/Splunk-Search/Best-Search-Performance-when-adding-filtering-of-events-to-query/m-p/750038#M242251" target="_blank"&gt;https://community.splunk.com/t5/Splunk-Search/Best-Search-Performance-when-adding-filtering-of-events-to-query/m-p/750038#M242251&lt;/A&gt;&lt;/P&gt;&lt;P&gt;As&amp;nbsp;&lt;a href="https://community.splunk.com/t5/user/viewprofilepage/user-id/231884"&gt;@PickleRick&lt;/a&gt;&amp;nbsp;says, the job inspector is your friend (see scanCount), and reducing that number will improve searches.&lt;/P&gt;&lt;P&gt;Use subsearches sparingly, and avoid joins and transaction - they are almost never necessary. Summary indexing itself will not necessarily speed up your searches, particularly if the search that creates the summary index is bad and the search that searches the summary index is also bad.&lt;/P&gt;&lt;P&gt;A summary index does not mean faster - it's just another index with data, and you can still write bad searches against it.&lt;/P&gt;&lt;P&gt;Please share some of your worst searches and we can try to help.&lt;/P&gt;</description>
      <pubDate>Mon, 21 Jul 2025 01:29:14 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/How-to-Optimize-Search-Performance-for-Large-Scale-Data-in/m-p/750155#M119188</guid>
      <dc:creator>bowesmana</dc:creator>
      <dc:date>2025-07-21T01:29:14Z</dc:date>
    </item>
    <item>
      <title>Re: How to Optimize Search Performance for Large-Scale Data in Splunk?</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/How-to-Optimize-Search-Performance-for-Large-Scale-Data-in/m-p/750381#M119223</link>
      <description>&lt;P&gt;Hi&amp;nbsp;&lt;a href="https://community.splunk.com/t5/user/viewprofilepage/user-id/311764"&gt;@zaks191&lt;/a&gt;,&lt;/P&gt;&lt;P&gt;Please consider the points below for better performance in your environment.&lt;/P&gt;&lt;P&gt;1. &lt;STRONG&gt;Be Specific in Searches&lt;/STRONG&gt;: Always use index= and sourcetype= and add unique terms early in your search string to narrow down data quickly.&lt;/P&gt;&lt;P&gt;2. &lt;STRONG&gt;Filter Early, Transform Late&lt;/STRONG&gt;: Place filtering commands (like where, search) at the beginning and transforming commands (stats, chart) at the end of your SPL.&lt;/P&gt;&lt;P&gt;3. &lt;STRONG&gt;Leverage Index-Time Extractions&lt;/STRONG&gt;: Ensure critical fields are extracted at index time for faster searching, especially with JSON data.&lt;/P&gt;&lt;P&gt;4. &lt;STRONG&gt;Utilize tstats&lt;/STRONG&gt;: For numeric or indexed data, tstats is highly efficient as it operates directly on pre-indexed data (.tsidx files), making it much faster than search | stats.&lt;/P&gt;&lt;P&gt;5. &lt;STRONG&gt;Accelerate Data Models&lt;/STRONG&gt;: Define and accelerate data models for frequently accessed structured data. This pre-computes summaries, allowing tstats searches to run extremely fast.&lt;/P&gt;&lt;P&gt;6. &lt;STRONG&gt;Accelerate Reports&lt;/STRONG&gt;: For specific, repetitive transforming reports, enable report acceleration to store pre-computed results.&lt;/P&gt;&lt;P&gt;7. &lt;STRONG&gt;Minimize Wildcards and Regex&lt;/STRONG&gt;: Avoid leading wildcards (*term) and complex, unanchored regular expressions as they are resource-intensive.&lt;/P&gt;&lt;P&gt;8. &lt;STRONG&gt;Optimize Lookups&lt;/STRONG&gt;: For large lookups, consider KV Store lookups or pre-generate summaries via scheduled searches.&lt;/P&gt;&lt;P&gt;9. &lt;STRONG&gt;Use Job Inspector&lt;/STRONG&gt;: Regularly analyze slow searches with the Job Inspector to pinpoint bottlenecks (e.g., search head vs. indexer processing).&lt;/P&gt;&lt;P&gt;10. &lt;STRONG&gt;Review limits.conf&lt;/STRONG&gt; (Carefully): While not a primary fix, review settings like max_mem_usage_mb or max_keymap_rows in limits.conf after monitoring resource usage, but proceed with caution and thorough testing.&lt;/P&gt;&lt;P&gt;11. &lt;STRONG&gt;Set Up Alerts for Expensive Searches&lt;/STRONG&gt;: Use internal metrics to detect problematic searches.&lt;/P&gt;&lt;P&gt;12. &lt;STRONG&gt;Monitor and Limit User Search Concurrency&lt;/STRONG&gt;: Users running unbounded or wide time-range ad hoc searches can harm performance.&lt;/P&gt;&lt;P&gt;Happy Splunking&lt;/P&gt;</description>
      <pubDate>Thu, 24 Jul 2025 08:46:42 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/How-to-Optimize-Search-Performance-for-Large-Scale-Data-in/m-p/750381#M119223</guid>
      <dc:creator>thahir</dc:creator>
      <dc:date>2025-07-24T08:46:42Z</dc:date>
    </item>
    <item>
      <title>Re: How to Optimize Search Performance for Large-Scale Data in Splunk?</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/How-to-Optimize-Search-Performance-for-Large-Scale-Data-in/m-p/750386#M119224</link>
      <description>&lt;P&gt;3, 4 and partially 7 - not really.&lt;/P&gt;&lt;P&gt;3. Indexed fields - unless they contain additional metadata not present in the original events - are usually best avoided entirely. There are other ways of achieving the same result.&lt;/P&gt;&lt;P&gt;4. You can't use tstats instead of a stats-based search just because the field is a number. It requires specific types of data. It is true, though, that&amp;nbsp;&lt;EM&gt;if you can&lt;/EM&gt; use tstats instead of normal stats, it's way faster.&lt;/P&gt;&lt;P&gt;7. Wildcards at the beginning of a search term should not be "avoided"; they should not be used at all unless you have a very, very good reason for using them, know and understand the performance impact, and can significantly limit the set of events searched using other means. The remark about regexes is generally valid, but this is most often not the main reason for performance problems.&lt;/P&gt;</description>
      <pubDate>Thu, 24 Jul 2025 09:42:54 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/How-to-Optimize-Search-Performance-for-Large-Scale-Data-in/m-p/750386#M119224</guid>
      <dc:creator>PickleRick</dc:creator>
      <dc:date>2025-07-24T09:42:54Z</dc:date>
    </item>
  </channel>
</rss>

