Dashboards & Visualizations

Optimization of a 6minute+ search while using inline sparkline with tstats

wrussell12
Explorer

I have a search that I have almost optimized, the current search is 6+ minutes, and I have it down to about 16 seconds.
Inspecting the job is showing that most of the job is due to the join, which is needed for the spark line.

The idea is that we can monitor bandwidth for 2 days, and then flag if any outliers are found. If an outlier is found, I need to show a sparkline within a field for the last 4 hours.

This search takes about 16 seconds, I wish I could make it faster, but I cannot figure out how to make the sparkline work inside the tstats.

index="cables" source="bandwidth" earliest=-4h latest=now() | stats sparkline(avg(bandwidthMbps)) as "4HR" by cableId
| join cableId [| tstats count from datamodel=prod.cables by cables.bandwidth, cables.name,  cable.Id, _time span=1m 
| rename cable.id as cableId, cable.bandwidth as bandwidth, cable.name as name

| lookup allcables.csv cableId output zipcode,active | where active="true" | search zipcode!=*90210* 
| eventstats avg(bandwidth) as avg, stdev(bandwidth) as stdev by cableId | eval lowerBound=(avg-stdev*exact(2)), upperBound=(avg+stdev*exact(2)) | eval isOutlier=if(bandwidth < lowerBound OR bandwidth > upperBound, 1, 0) | sort - _time | dedup cableId | where isOutlier=1 | where _time > (now() - 360) ] 
| lookup network.csv ipName  | eval IP_lookup=coalesce(ip_name, alt_ip_name) | lookup Regions.csv range as IP_lookup output address | table cableId, avg, "4HR"

This is the original search

 index="cables" source="bandwidth" earliest=-4h latest=now() | stats sparkline(avg(bandwidthMbps)) as "4HR" by cableId
| join cableId [| tstats latest(cables.bandwidth) as bandwidth, cables.name as name from datamodel=prod.cables by cables.id,  _time span=1m 
    | rename cable.id as cableId
    | lookup allcables.csv cableId output zipcode,active | where active="true" | search zipcode!=*90210* 
    | eventstats avg(bandwidth) as avg, stdev(bandwidth) as stdev by cableId | eval lowerBound=(avg-stdev*exact(2)), upperBound=(avg+stdev*exact(2)) | eval isOutlier=if(bandwidth < lowerBound OR bandwidth > upperBound, 1, 0) | sort - _time | dedup cableId | where isOutlier=1 | where _time > (now() - 360) ] 
    | lookup network.csv ipName  | eval IP_lookup=coalesce(ip_name, alt_ip_name) | lookup Regions.csv range as IP_lookup output address | table cableId, avg, "4HR"

Differences:
1. the faster search uses count, instead of latest.
2. I tried using avg(bandwidth) within the tstats, but I was getting different results from the original search.

My two questions:
1. why is tstats count so much faster when I list everything I want within the by clause.
2. how can I get correct sparkline within the tstats without needing to do a double search. The tstats search is a range of 48hours, but the sparkline is the 4 hours.

I have a lot of searches like this, that take 6 minutes, and would love to optimize it all.

0 Karma

Kawtar
Path Finder

Hello,

You can replace join by append, but you should use stats by .

0 Karma

martin_mueller
SplunkTrust
SplunkTrust

In order to see differences in search speeds and their reasons, do run both variants and post the performance graph from the top section of the job inspector.

Pay attention to field naming in the tstats searches, I see cables.id and cable.Id - it's likely that only one of those two is correct.

As for sparklines in tstats, these two searches do the same:

index=_internal | stats sparkline(count)

| tstats count where index=_internal by _time span=1m | stats sparkline(sum(count))

The tstats loads the underlying data, a count by minute. The stats formats the data as a sparkline.

Get Updates on the Splunk Community!

.conf24 | Registration Open!

Hello, hello! I come bearing good news: Registration for .conf24 is now open!   conf is Splunk’s rad annual ...

ICYMI - Check out the latest releases of Splunk Edge Processor

Splunk is pleased to announce the latest enhancements to Splunk Edge Processor.  HEC Receiver authorization ...

Introducing the 2024 SplunkTrust!

Hello, Splunk Community! We are beyond thrilled to announce our newest group of SplunkTrust members!  The ...