How to optimize a search over a huge amount of data?

nilaksh92
Path Finder

Hi Everyone

I have an index that receives new records every 30 seconds.

I also have a lookup with several fields.

The index and the lookup share one common field, and I need to get all rows that match on it.

After getting the data, I need to perform aggregations on a daily basis.

| inputlookup lookupname | join type=inner lookup_field [search index="abc" | rename index_field as "lookup_field"]

This returns only 500 records, and the results are not correct.

index="abc" | join type=inner index_field [| inputlookup lookupname | rename "lookup_field" as "index_field"]

This returns almost 2 lakh (200,000) records, but whenever I use it in my dashboard, it takes a long time to display.

Which one is the correct way of joining? If the second one is correct, how can I optimize it?

The dashboard refreshes every 30 seconds.

Please guide me on this.

Thanks
Nikks

DalJeanis
SplunkTrust

Okay, here's a set of standard efficiency suggestions ...

1) ALWAYS get rid of all the rows you can at the very front. The subsearch here tells Splunk not to return any events whose index_field value is not in the lookup table. The square brackets around the subsearch cause its results to be implicitly formatted as if piped to the format command. For more information, see the manual on the format command.

index="abc" [|inputlookup lookupname | rename "lookup_field" as "index_field" | table "index_field"]

2) ALWAYS get rid of all the fields you can at the very front. The following, as the first line after the initial search, tells Splunk that it does not have to extract any fields other than the ones listed.

| fields index_field myfield1 myfield2 

3) Where not subject to the above, PREFERABLY do any matching, lookup or joining at the latest point that you can... after aggregations if possible. That means that, instead of matching 1000 times for the same value, it aggregates those thousand rows once and then matches once. (If you are just using the lookup as a filter and not adding any data, then you do not need this step.)

| stats count as mycount, sum(myfield1) as myfield1sum, max(myfield2) as myfield2max by index_field
| join index_field [|inputlookup lookupname | rename "lookup_field" as "index_field" | table index_field lookupvaluefield]
| table index_field mycount myfield1sum myfield2max lookupvaluefield
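
As an alternative to the join in step 3, the lookup command can usually attach fields from a lookup table without a subsearch. This is only a sketch: it assumes lookupname can be used as a lookup table name by the lookup command, and lookupvaluefield, myfield1 and myfield2 are the same placeholder names as above, not fields from your real data.

```spl
| stats count as mycount, sum(myfield1) as myfield1sum, max(myfield2) as myfield2max by index_field
| lookup lookupname lookup_field AS index_field OUTPUT lookupvaluefield
| table index_field mycount myfield1sum myfield2max lookupvaluefield
```

Unlike an inner join, lookup keeps rows that have no match and simply leaves lookupvaluefield empty for them. If the filter from step 1 is already in place, every remaining row should have a match anyway.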

If you want any more specific advice, then you will need to post the rest of your query. (You didn't tell us WHAT you were aggregating, so we can't tell you how best to do that.)
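
In the meantime, putting the three suggestions together for a per-day aggregation might look roughly like the sketch below. myfield1 is a placeholder because the fields being aggregated were not posted, and the one-day span is my assumption based on "aggregations on day basis".

```spl
index="abc" [| inputlookup lookupname | rename lookup_field as index_field | fields index_field]
| fields _time index_field myfield1
| bin _time span=1d
| stats count as mycount, sum(myfield1) as myfield1sum by _time index_field
```

Here the lookup is used only as a filter, so no join is needed at all. Keep in mind that subsearches are subject to result-count and runtime limits, so with a very large lookup the implicit filter may be truncated and this approach would need to be revisited.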

DalJeanis
SplunkTrust

Come to think of it, there is a more important rule of efficiency, one that saves 100% of the CPU cycles for certain operations. That rule is,

DON'T DO ANYTHING YOU DON'T ACTUALLY NEED TO DO.

You said you are aggregating on a "day" basis and refreshing every 30 seconds...

WHY?

Look at the role of whoever is looking at this dashboard, and ask yourself if that exact person will need to make a decision in the next 30 seconds based on what just happened. If not, then you are overengineering the dashboard. Back it off to every 5 minutes, and you have saved 90% of the CPU cycles.

Even if they do, consider having one panel with the full day's data, perhaps on a 10-minute refresh, and another panel showing ONLY the last fifteen minutes on 30-second refresh. Your dashboard will run better and be more useful, and you will reserve those CPU cycles for something that actually will benefit the business.
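
As a rough sketch of that split, the two panels could share the same search and differ only in time range. The refresh interval itself is configured on the dashboard panel, not in the search, and myfield1 is again a placeholder for whatever you are aggregating. The full-day panel, on the slow refresh, might be:

```spl
index="abc" earliest=@d latest=now [| inputlookup lookupname | rename lookup_field as index_field | fields index_field]
| fields index_field myfield1
| stats count as mycount, sum(myfield1) as myfield1sum by index_field
```

And the recent-activity panel, on the 30-second refresh, would scan only the last fifteen minutes:

```spl
index="abc" earliest=-15m latest=now [| inputlookup lookupname | rename lookup_field as index_field | fields index_field]
| fields index_field myfield1
| stats count as mycount, sum(myfield1) as myfield1sum by index_field
```

For scale: a 30-second refresh runs a search 120 times an hour, while a 5-minute refresh runs it 12 times, which is where the 90% figure comes from.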

adonio
Ultra Champion

If I understand correctly and you need the results from the lookup based on the values of the field XYZ from the search, try to narrow down your search first: index=abc | fields XYZ (or add other filters such as sourcetype, host, etc.) and then complete your search.
Hope it helps.
