
Should search performance improve using saved search or summary index?

Builder

Hi All,

I am running a search that shows the total used space (storage used) of an application over the last 30 days. The query is below, but the panel takes 40 to 45 seconds to load. I want to improve the performance of this search so that the panel loads faster. I tried creating a saved search and using it in the panel, but it still runs very slowly. Here is the search:

index=app sourcetype="app:users" | dedup user | stats sum(space_used) as total_space | eval total_space=round(total_space/1024/1024/1024/1024,2)."TB"

Please suggest how I can create a summary index for this, as I think a summary index would improve the performance. Any help resolving this is appreciated.

Thanks
PG


Re: Should search performance improve using saved search or summary index?

SplunkTrust

Could you check whether the search below reduces the time taken? I have assumed you want usage by user and only want to track events with valid values. If you don't need user, you can drop it.

index=app sourcetype="app:users" user=* space_used=* | fields space_used, user | stats sum(space_used) as total_space by user | eval total_space=round(total_space/1024/1024/1024/1024,2)."TB"

Depending on the number of events per day in your index, a search over 30 days can take a long time. If the search is still slow, you can create a saved search that runs once a day or a few times a day but only looks at the last day or the last few hours (you would need to adjust earliest and latest to avoid overlap). Please refer to https://docs.splunk.com/Documentation/Splunk/7.2.4/Knowledge/Usesummaryindexing
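
As a rough sketch of such a scheduled search: the version below assumes it runs once a day and covers exactly the previous calendar day by snapping both time bounds to midnight, so consecutive runs neither overlap nor leave gaps. It keeps the latest space_used per user (matching the dedup user intent of the original query) and writes the rows to the default 'summary' index. The source name app_space_daily is a made-up label for illustration, not something from this thread.

```spl
index=app sourcetype="app:users" user=* space_used=* earliest=-1d@d latest=@d
| stats latest(space_used) AS space_used latest(_time) AS _time BY user
| collect index=summary source="app_space_daily"
```

Carrying latest(_time) through as _time is a design choice here, so that each summary row keeps the timestamp of the reading it came from. If the value needs to be closer to real time, the same pattern could run hourly with earliest=-1h@h latest=@h.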


Re: Should search performance improve using saved search or summary index?

Builder

No, I don't want usage by user. I want the total space used by all users in the last 30 days. Your query does not give me that result: it returns space_used per user, and with multiple entries for each user. That is why I use dedup user, so that I get the latest utilization for each user. My query returns the correct value, but it has to run over 30 days, which takes time.

Hence, I need a suggestion on how to use a summary index here, with the output value close to real time. How do I configure a summary index for a panel that aggregates the last 30 days of data and shows a near-real-time value? I hope the question is clear. Please advise.


Re: Should search performance improve using saved search or summary index?

SplunkTrust

When you use stats by user, it shouldn't return multiple entries for the same user. If you change your search to something like the one below, how long does it take to run?

index=app sourcetype="app:users" user=* space_used=* | fields space_used | stats sum(space_used) as total_space | eval total_space=round(total_space/1024/1024/1024/1024,2)."TB"
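
If the goal is the most recent space_used value per user rather than a sum over every event, one stats-only variant that may be worth timing is sketched below. It assumes that dedup user in the original query is meant to keep each user's latest event; latest() picks the chronologically newest value per user, and a second stats adds those values up. Whether it is actually faster than dedup depends on the data volume and deployment, so treat it as something to test rather than a guaranteed fix.

```spl
index=app sourcetype="app:users" user=* space_used=*
| fields user, space_used
| stats latest(space_used) AS space_used BY user
| stats sum(space_used) AS total_space
| eval total_space=round(total_space/1024/1024/1024/1024,2)."TB"
```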

If you go with the summary index approach, you can set up a scheduled search (as per the link sent earlier) and write the results to the 'summary' index. Your dashboard can then use another search that pulls the results from the summary index (the eval total_space step can be moved to the dashboard search as well).
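
A dashboard-side search over the summary index could look roughly like the sketch below. It assumes a scheduled search has already written one row per user with a space_used field into the 'summary' index under a source called app_space_daily; that source name is a placeholder chosen for illustration. The search takes each user's latest summarized value within the selected time range, sums those values, and applies the TB formatting last.

```spl
index=summary source="app_space_daily"
| stats latest(space_used) AS space_used BY user
| stats sum(space_used) AS total_space
| eval total_space=round(total_space/1024/1024/1024/1024,2)."TB"
```

Because this is an ordinary search with no hard-coded earliest or latest, it should follow whatever range the dashboard time picker supplies (30 days, 2 months, and so on), as long as the summary index has been populated that far back.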


Re: Should search performance improve using saved search or summary index?

Builder

The query still takes 112 seconds to execute. If I use stats by user, it sums the values of all events in the last 30 days, which is not correct. Hence I have to use "dedup user", which takes the latest space_used value for each user and adds them up across all users. But that still takes more than 100 seconds, which is very slow.

I have already configured a summary index that is populated by a scheduled search, but that is still slow. So I think the feasible approach is "loadjob", but its time range is the issue, as it does not load data based on the time picker selection.


Re: Should search performance improve using saved search or summary index?

Hello @pgadhari

  1. If you don't want to run the query again and again, you can create a scheduled search for it and use the loadjob command to load the results of the last job, which can make the panel faster.
  2. Otherwise, go for report acceleration, which can also help. You can accelerate the report for 30 days.
  3. For summary indexing, you can create a summary index and use the collect command to send the results to that index.


Re: Should search performance improve using saved search or summary index?

Builder

Can you provide an example of how I can implement point 1?

Of the 3 points you listed, which is the best solution? Please advise.


Re: Should search performance improve using saved search or summary index?

Hello @pgadhari

If you want the results shown for the last 30 days, it is better to go with loadjob. You can schedule the job to run once a day, at the start of the day, and load its results into the panel for the whole day, like this:

 | loadjob savedsearch="admin:search:MySavedSearch"

I always try to find an alternative to summary indexing. Also, with summary indexing, if you change the sourcetype, the usage will count against your license.


Re: Should search performance improve using saved search or summary index?

Builder

The problem with loadjob is that if the user changes the time range from 30 days to the last 2 or 3 months, the panel will still show the value for 30 days, which will be wrong, because the saved search is configured to run over 30 days. How can I resolve that issue? Is there any solution to that problem?
