Splunk Search

optimization spl search command

indeed_2000
Motivator

Hi,
I have an SPL command that takes a long time to return results!
The main goal is to find the high durations consumed by each server over time.

This SPL command extracts server and duration; is there any way to optimize it?

index="my-index"
| search duration
| rex field=source "/data/(?<product>\w+)/(?<customer>\w+)/(?<date>\d+)/log\.(?<servername>\w+)."
| rex "duration\[(?<duration>\d+\.\d+)"
 
Scope: over a 2-hour window, more than 20 billion events exist in this log file.

Any idea?
Thanks,


bowesmana
SplunkTrust

If you can also create a field extraction that extracts a duration field, then rather than just searching 'duration' (which looks for the word duration in the raw event text), you can search

index="my-index" duration>1000

which will limit the number of rows you get back in the first place. At the moment you are running the regexes over all of those 20 billion rows.

Do you have any filtering criteria after those rex statements? If you are looking for long durations, is that per event? If so, the goal is to minimise the number of events you need to process, so searching for duration>X would help.
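For the duration>X filter to work before the first pipe, the duration field needs a search-time extraction. A minimal props.conf sketch, reusing the rex pattern from the original search (the sourcetype name my_sourcetype is an assumption):

[my_sourcetype]
# assumed sourcetype; extracts duration at search time so
# index="my-index" duration>1000 can filter early
EXTRACT-duration = duration\[(?<duration>\d+\.\d+)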

What else are you doing after the initial statements? Can you share more of that search or are you saying it takes two hours to just get that data?

 


gcusello
SplunkTrust

Hi @indeed_2000,

first of all, you don't need to put the search condition after the pipe "|"; done that way, the search is slower. So please try this and, using the Job Inspector, see if you get better performance with something like this:

index="my-index" duration
| rex field=source "/data/(?<product>\w+)/(?<customer>\w+)/(?<date>\d+)/log\.(?<servername>\w+)."
| rex "duration\[(?<duration>\d+\.\d+)"

Second thing: if you can extract duration with a single regex, you'll have a quicker search.

But with so many events, no search will have acceptable response times, so you have two solutions for your problem, and I suggest using both:

  • check the resources of your Indexers,
  • use accelerated searches.

First: how many CPUs do your indexers have? Are you sure they are sufficient?

Then, what's the IOPS of your storage?

Storage is the real bottleneck of Splunk; for this reason, Splunk requires at least 800 IOPS (better 1200) from the storage.

About the second solution, see at https://docs.splunk.com/Documentation/SplunkCloud/latest/Knowledge/Aboutsummaryindexing and https://docs.splunk.com/Documentation/Splunk/8.2.2/Report/Acceleratereports and https://docs.splunk.com/Documentation/Splunk/8.2.2/Knowledge/Acceleratedatamodels

In a few words: you should create a scheduled search (e.g. every 15 minutes or every hour) that pre-processes your many events and puts the results in a summary index or a data model; you can then run your searches on those (far fewer events, and much quicker!) instead of on the raw events.
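A minimal savedsearches.conf sketch of such a scheduled summary search (the stanza name, the 15-minute schedule, and the my_summary index are assumptions; the rex patterns come from the original search):

[summarize_durations]
# assumed name; every 15 minutes, writes per-server max duration
# for the last 15 minutes into a summary index
search = index="my-index" duration \
| rex field=source "/data/(?<product>\w+)/(?<customer>\w+)/(?<date>\d+)/log\.(?<servername>\w+)." \
| rex "duration\[(?<duration>\d+\.\d+)" \
| stats max(duration) AS duration BY servername \
| collect index=my_summary
cron_schedule = */15 * * * *
enableSched = 1
dispatch.earliest_time = -15m
dispatch.latest_time = now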

Ciao.

Giuseppe

 


indeed_2000
Motivator

Are these resources enough?

CPU: 26 cores (1 core per socket)
RAM: 64 GB

iostat -d >
Device:            tps        kB_read/s    kB_wrtn/s        kB_read            kB_wrtn
sdc                  34.07      2007.13      3659.99         7109800816     12964706645

 

 


gcusello
SplunkTrust

Hi @indeed_2000,

in general, 26 cores is a very good configuration (12 is the minimum); RAM isn't relevant for searches.

About storage, you should make a check using e.g. Bonnie++ to see the real throughput of your storage.
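A typical Bonnie++ invocation against the Splunk data volume (the directory and user here are assumptions; point it at the filesystem that holds your indexes, with free space of at least twice your RAM):

# assumed paths: benchmark the disk holding $SPLUNK_DB, running as the splunk user
bonnie++ -d /opt/splunk/var -u splunk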

Anyway, I think the problem is the huge number of events, so take into consideration the options I quickly described to accelerate your searches.

Ciao.

Giuseppe


indeed_2000
Motivator

1. How about SPL that uses table? Those searches won't be able to be accelerated. Any other idea?

2. When I put an accelerated report on a dashboard, it loads from scratch!


gcusello
SplunkTrust

Hi @indeed_2000,

the best approach is to use a transforming command (e.g. stats or timechart); if that's not possible, run a search that extracts the fields you need and save the results in a summary index using a search like this:

index="my-index" duration
| rex field=source "/data/(?<product>\w+)/(?<customer>\w+)/(?<date>\d+)/log\.(?<servername>\w+)."
| rex "duration\[(?<duration>\d+\.\d+)"
| table date product customer servername duration
| collect index=my_summary

Then you can run your search on the summary index, which is faster:

index=my_summary
| table date product customer servername duration
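To get at the original goal (high duration per server over time), a report over the summary index might look like this (the 15-minute span is an assumption; adjust to your time range):

index=my_summary
| timechart span=15m max(duration) AS max_duration BY servername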

About the second question, please see at  https://docs.splunk.com/Documentation/Splunk/8.2.2/Report/Embedscheduledreports

Ciao.

Giuseppe


indeed_2000
Motivator

I did the same thing, but the index my_summary is empty after running the collect query. Any idea?

 


bowesmana
SplunkTrust
SplunkTrust

@indeed_2000 The summary index must exist before you can collect to it.
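A minimal indexes.conf sketch for creating that summary index on the indexers (the paths assume the default $SPLUNK_DB layout; alternatively, create it in Splunk Web under Settings > Indexes > New Index):

[my_summary]
# summary index that the collect command writes to
homePath   = $SPLUNK_DB/my_summary/db
coldPath   = $SPLUNK_DB/my_summary/colddb
thawedPath = $SPLUNK_DB/my_summary/thaweddb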


gcusello
SplunkTrust

Hi @indeed_2000,

in this case, you have to check the scheduled search that generates the summary:

  • does it return results?
  • when did it last run?

Ciao.

Giuseppe
