Should search performance improve using saved search or summary index?

Builder

Hi All,

I am running a search that shows totalusedspace (the storage used) by an application over the last 30 days. The query is below, but the panel takes 40 to 45 seconds to load. I want to improve the performance of this search so that the panel loads faster. I tried creating a saved search and using it in the panel, but it is still very slow. Here is the search:

index=app sourcetype="app:users" | dedup user| stats sum(space_used) as total_space | eval total_space=round(total_space/1024/1024/1024/1024,2)."TB"

Please suggest how I can create a summary index for this, as I think a summary index would improve the performance. Please help me resolve this issue.

Thanks
PG

1 Solution

Motivator

Hello @pgadhari

  1. If you don't want to run the query again and again, you can create a scheduled search and use the loadjob command to load the results of its last run, which makes the panel faster.
  2. Otherwise, go for report acceleration, which can also help. You can create the acceleration for 30 days.
  3. For summary indexing, create a summary index and use the collect command to send the results to that index.

Esteemed Legend

The Scheduled Saved Search option should work just fine for your use case. You are probably using | savedsearch in your panel; try switching to | loadjob and make sure that your saved search is scheduled to run periodically, and your panel will load instantaneously.
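
As a rough illustration of the difference, assuming the saved search is the one named further down this thread (owner admin, app splunkappbox, name boxadvancedusers), a panel written like this re-runs the saved search's SPL every time it loads:

```spl
| savedsearch boxadvancedusers
```

whereas a panel written like this only loads the cached results of the most recent scheduled run:

```spl
| loadjob savedsearch="admin:splunkappbox:boxadvancedusers"
```

The loadjob form only helps if the saved search is actually scheduled, since it needs a finished job to read from.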

Builder

I am trying to use loadjob for another query where I am also facing performance issues, but I think _time is not available in the output after loadjob.

Esteemed Legend

That makes no sense. You just need to pass it the events=true argument.

Builder

Where do I have to pass the events=true argument? Can you share more details, please?

Builder

I tried passing the argument to my loadjob search, but I am getting the following error:

Error in 'SearchOperator:loadjob': There are no events in the artifacts of job_id 'scheduler_cGFua2FqLmdhZGhhcmlAZHViYWlhaXJwb3J0cy5hZQ_c3BsdW5rX2FwcF9ib3g__boxadvancedusers_at_1556514300_13795'.
The search job has failed due to an error. You may be able view the job in the Job Inspector.

Duration (seconds)  Component                                   Invocations  Input count  Output count
0.00                dispatch.check_disk_usage                   1            -            -
0.00                dispatch.createdSearchResultInfrastructure  1            -            -
0.00                dispatch.evaluate                           1            -            -
0.00                dispatch.evaluate.loadjob                   1            -            -
0.00                dispatch.fetch                              1            -            -
0.01                dispatch.optimize.FinalEval                 1            -            -
0.00                dispatch.optimize.matchReportAcceleration   1            -            -
0.03                dispatch.optimize.optimization              1            -            -
0.00                dispatch.optimize.reparse                   1            -            -
0.00                dispatch.optimize.toJson                    1            -            -
0.00                dispatch.optimize.toSpl                     1            -            -
0.00                dispatch.results_combiner                   1            -            -
0.04                dispatch.writeStatus                        7            -            -
0.04                startup.configuration                       1            -            -
0.18                startup.handoff                             1            -            -

Esteemed Legend

Why are you not sharing the SPL command that starts with | loadjob? How are we supposed to help?

Builder

Below is the loadjob query:

| loadjob savedsearch="admin:splunkappbox:boxadvancedusers" events=true

I have also created a separate question for that query, where I have explained my issues in more detail and posted the queries.

https://answers.splunk.com/answers/742744/improving-search-performance-of-search-powered-by.html

Esteemed Legend

If that is your search and you are getting that error, then you need to extend the Time to live setting dispatch.ttl in the advanced edit section of your saved search. The problem is that the search artifacts are being reaped before the next search generates fresh ones. So if you have scheduled your search to run every 4 hours, you need to set your dispatch.ttl to at least 4 hours, so that you don't leave a gap.
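
One way to check the current values before changing them is a REST search along these lines. This is only a sketch: it assumes the saved search title is boxadvancedusers (taken from the loadjob command above) and that your role is allowed to use the rest command.

```spl
| rest /servicesNS/-/-/saved/searches splunk_server=local
| search title="boxadvancedusers"
| table title cron_schedule dispatch.ttl
```

dispatch.ttl is either a number of seconds or a number followed by p, meaning a multiple of the scheduled period (for example 2p). For a search scheduled every 4 hours, a value of at least 14400 seconds would cover the interval between runs.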

Ultra Champion

It's a very good use case for a summary index. The major drawback of a summary index is that it's very common for the Splunk platform to skip searches, and the integrity of the summary index can be compromised when searches are skipped. But in your use case (if I understand it correctly), a few skipped searches won't affect the value of the generated summary index.

Builder

Can you provide an example of how I can implement point no. 1?

Which of the 3 points you have specified is the best solution? Please advise.

Motivator

Hello @pgadhari

If you want the results shown for the last 30 days, then it is better to go with loadjob. You can schedule the job to run once a day, at the start of the day, and load its results into the panel for the whole day, like this:

 | loadjob savedsearch="admin:search:MySavedSearch"

I would always try to find an alternative to summary indexing. Also, with summary indexing, if you change the sourcetype, the summarized data will count against your license usage.

Builder

The problem with loadjob is that if the user changes the time range from 30 days to the last 2 or 3 months, the panel will still show the value for 30 days, which will be wrong, because the saved search is configured to run over 30 days. How can I resolve that issue? Is there any solution to that problem?
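
One pattern sometimes used for this, sketched here under several assumptions and not tested against this data, is to have the saved search cover the widest range users may pick and keep _time in its results, then filter the loaded results by the panel's time range with addinfo. The saved search name MySavedSearch, the 1-day bucketing, and the idea that space_used is a per-user snapshot are all assumptions. The scheduled search, run over for example the last 90 days, might look like:

```spl
index=app sourcetype="app:users" user=* space_used=*
| bin _time span=1d
| dedup _time user
| table _time user space_used
```

and the panel search, which takes its range from the time picker:

```spl
| loadjob savedsearch="admin:search:MySavedSearch"
| addinfo
| where _time >= info_min_time AND _time <= info_max_time
| sort 0 -_time
| dedup user
| stats sum(space_used) as total_space
| eval total_space=round(total_space/1024/1024/1024/1024,2)."TB"
```

Because _time is bucketed to the start of each day, a range that starts mid-day may drop that first day, and an "All time" selection may not give numeric info_min_time and info_max_time values, so both cases are worth checking.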

Motivator
0 Karma

Builder

Ok. Will check that and update shortly. Thanks.

Motivator

@pgadhari

Any update?

Builder

@vishaltaneja07011993 - I tried adding the time picker options using the above link, but that is not working. The search picks up the time from the time picker properly, but no results are found in the selected timeframe. Please advise.

Builder

@vishaltaneja07011993 - I was able to fix the totalusedspace issue using the loadjob command, which shows the value for the last 30 days. That is working fine now.

I have another query, which is powered by a summary index and a data model. That query is also taking a long time to execute, and I am thinking of using the loadjob command there. I think the problem is that time is not returned after the loadjob command, and that's why the starttime and end_time link which you shared earlier does not seem to be working. I will share the query details shortly.

Builder

I will close this question, as loadjob is working very well for this query. I have another query, for which I am opening a new question. Thanks.

SplunkTrust

You could see whether the search below improves the time taken. I have assumed that you may want to know usage by user and only want to track events where there are valid values. If you don't need user, you can take that off.

index=app sourcetype="app:users" user=* space_used=* | fields space_used, user | stats sum(space_used) as total_space by user | eval total_space=round(total_space/1024/1024/1024/1024,2)."TB"

Depending on the number of events per day in your index, a search over 30 days could take a long time. If the search still takes long, you can create a saved search that runs each day, or a few times a day, but only looks at the last 1 day or the last few hours (you would need to adjust earliest and latest to avoid overlap). Please refer to https://docs.splunk.com/Documentation/Splunk/7.2.4/Knowledge/Usesummaryindexing
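
A minimal sketch of that approach, assuming a summary index named app_summary already exists (the name is hypothetical) and that space_used is a per-user snapshot, so the latest value per user is the one to keep. The populating search, scheduled once a day, uses day-snapped earliest and latest so that consecutive runs neither overlap nor leave gaps:

```spl
index=app sourcetype="app:users" user=* space_used=* earliest=-1d@d latest=@d
| fields space_used, user
| stats latest(space_used) as space_used by user
| collect index=app_summary
```

The panel would then read the much smaller summary, over whatever time range is selected, and keep the most recent row per user:

```spl
index=app_summary
| dedup user
| stats sum(space_used) as total_space
| eval total_space=round(total_space/1024/1024/1024/1024,2)."TB"
```

If a scheduled run is skipped, that day is simply missing from the summary; for this total the next day's snapshot would supersede it anyway.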