Splunk Search

Speed Search Review

rafazurc
New Member

Hello Everyone.

I'm new to Splunk and I have one search that is taking a bit longer than the others. Are there any suggestions on how to improve this search?

index=mydatasource_* (sourcetype=x_connections OR sourcetype=x_collectors) engine="engine" Src_SubnetName="vpn"
| eval src=if(isnull(src), name, src)
| eval Dates = _time
| eval Src_SubnetName = Src_Sitename
| convert timeformat="%Y-%m-%d" ctime(Dates)
| stats dc(src) by src,Src_SubnetName, Dates

[Screenshot: job inspector results]

0 Karma

sectrainingjk
Explorer

To add to what @efavreau said about identifying words that will end up in every result...

I've had a lot of success using the [Patterns] analysis of search results to identify these words.

Also, [All Fields] and then sorting to see fields with maximum "100% Event Coverage" and "# of Values" can help as well.

0 Karma

to4kawa
Ultra Champion
index=mydatasource_* ((sourcetype = x_connections src=*) OR (sourcetype= x_collectors name=*)) engine="engine" Src_SubnetName = "vpn" 
| eval src= coalesce(src,name) 
| eval Dates = strftime(_time, "%F") 
| stats estdc(src) as distinct_src_count by Src_Sitename, Dates
| rename Src_Sitename as Src_SubnetName

Your query has extra calculations.
How about this?

0 Karma

PavelP
Motivator

Hello @rafazurc ,

Run these searches (use Smart Mode and a short time range, such as the last 60 minutes instead of the last 24 hours) and post their search.log files. Your search.log screenshot is incomplete, so some important information may be missing:

search 1:

index=mydatasource_* (sourcetype=x_connections OR sourcetype=x_collectors) engine="engine" Src_SubnetName="vpn"
| eval src=if(isnull(src), name, src)
| eval Dates = _time
| eval Src_SubnetName = Src_Sitename
| convert timeformat="%Y-%m-%d" ctime(Dates)
| stats dc(src) by src,Src_SubnetName, Dates

search 2:

index=mydatasource_* (sourcetype = x_connections OR sourcetype= x_collectors) engine="engine" Src_SubnetName = "vpn"

search 3:

index=mydatasource_* (sourcetype = x_connections OR sourcetype= x_collectors) 

By comparing the durations of the command.search component, you'll get an idea of whether your search can be [easily] optimized.

search 4:

index=mydatasource_* sourcetype = x_connections

search 5:

index=mydatasource_* sourcetype= x_collectors

Also check the Splunk documentation and try to find out whether this is a dense, sparse, or rare search.

0 Karma

jpolvino
Builder

The dc aggregation function can be very expensive. Did your job inspector give any insight as to where the time is being spent? I'm also curious what you're ultimately trying to achieve. Knowing that may help the community solve your challenge.

See this link for info on dc and how to work around it.
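
The linked page is not reproduced in this thread, so the following is not necessarily the workaround it describes. One commonly used alternative to dc is a two-stage stats: first collapse the events to one row per distinct src, site, and day, then count the rows. This is only a sketch that reuses the field names from the question and the distinct_src_count alias from another reply; whether it is actually faster on this data is an assumption to check in the job inspector.

```spl
index=mydatasource_* (sourcetype=x_connections OR sourcetype=x_collectors) engine="engine" Src_SubnetName="vpn"
| eval src=coalesce(src, name)
| eval Dates=strftime(_time, "%Y-%m-%d")
| stats count BY src, Src_Sitename, Dates
| stats count AS distinct_src_count BY Src_Sitename, Dates
```

The first stats emits one row for each distinct combination of src, Src_Sitename, and Dates, so the second stats simply counts rows per site and day. Note that src is not in the final BY clause: counting distinct src values while also grouping by src would always return 1.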

0 Karma

rafazurc
New Member

Hello @jpolvino. I've added a screenshot of my job inspector results. Here is what I'm trying to achieve: I have two sourcetypes, one for connections and the other for collectors. For the first one, the field I need is src; for the second, it is name. So I check each event and, if src is null, use name instead.

After that, I format _time as a date. SubnetName is a common field for both sourcetypes. The result I need is a list of distinct src values for each network, by day.

I would really like to optimize this search to reduce its cost. I'm going through the link you sent, looking for more hints.

Thanks

0 Karma

efavreau
Motivator

@rafazurc The more specific you can make a search before the first |, the faster it will be. Do you need blank src values in your results? If not, then put in src=* to get rid of the blanks. Do you need all those indexes? Is there any other detail, even a word or two, that will appear in every result? Put all of that up front, before the first pipe. Otherwise, it is what it is. The rest of your SPL isn't expensive.

###

If this reply helps you, an upvote would be appreciated.
0 Karma

rafazurc
New Member

Hello @efavreau. I have two sourcetypes: one has src and the other has name. Would it work to add (src=* OR name=*) before the first pipe? Thanks

0 Karma

efavreau
Motivator

@rafazurc If you need these fields, then adding (src=* OR name=*) is better than not having it.
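
As a rough sketch of what that could look like, using only the field names from the question, the filter goes on the first line so that events with neither field are dropped before any eval runs:

```spl
index=mydatasource_* (sourcetype=x_connections OR sourcetype=x_collectors) engine="engine" Src_SubnetName="vpn" (src=* OR name=*)
| eval src=if(isnull(src), name, src)
```

The rest of the pipeline stays as it was. How much this helps depends on how many events lack both fields, which is not known from the thread, so compare the run time in the job inspector before and after.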

###

If this reply helps you, an upvote would be appreciated.
0 Karma

richgalloway
SplunkTrust

How long is "a bit"? How much data is being searched? Searching more data will take more time.

---
If this reply helps you, Karma would be appreciated.
0 Karma

rafazurc
New Member

Hello @richgalloway .

Searching the last 24 hours (~200M events) takes around 45 minutes and generates ~80k results.

Thanks.

0 Karma