What am I doing wrong in summary index?

maayan · ‎08-13-2023

Hi,

I'm working with a large amount of data. I have a main report that extracts all data from the previous month and 5 additional small reports that filter by event type and keep only the fields relevant to that event. For example: report 1 for event A, report 2 for event B, and so on.
In order to improve the performance I want to use a summary index.

I read the documentation and I'm doing the following:

Create a report in "Searches, Reports, and Alerts".
The query is:
index=myIndex source=mySource sourcetype=_json
| rename… | table …
| stats values(*) as * by TimeStamp,source
| lookup lookUp_table_toAdd_Fields.csv source AS source
| sistats values(*) as * by TimeStamp,source
Enable summary indexing.
Schedule the report to run daily over a 24-hour range.
Create a new search to extract the data saved in the index:
index="summary" source="SummaryIndex_Main"
| stats values(*) as * by TimeStamp,source | table *

Data range: only 6 days (data between 1.8 and 6.8, only 987,771 events)
Results:
When it runs, it looks like it is collecting the data, but when the run finishes, the Statistics tab contains no results and I get the error: "The following error(s) occurred while the search ran. Therefore, search results might be incomplete."

I don’t have permission to change the config files and I'm not sure what I'm doing wrong.

Please help!!

*Note: I need to extract all the original fields from the main query, which is why I use sistats and stats (and not collect). I have no aggregation; I just need to extract the data and be aware of overlaps.

* Relevant questions that I have posted:
https://community.splunk.com/t5/Reporting/Summary-index-for-non-aggregated-data-How-to-read-only-del...

https://community.splunk.com/t5/Reporting/Why-sistats-doesn-t-work-after-lookup/m-p/653864#M12170

Thanks,

Maayan

ITWhisperer · ‎08-13-2023

What does it show when you click on Show errors?

maayan · ‎08-13-2023

"Events might not be returned in sub-second order due to search memory limits. See search.log for more information. Increase the value of the following limits.conf setting:[search]:max_rawsize_perchunk."

I don't have permission to change the config files. What am I doing wrong in my steps? In this case I extracted only 987,771 events.

ITWhisperer · ‎08-13-2023

Probably trying to index too many events - try reducing the timespan and/or break up your search into multiple smaller chunks.
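
A sketch of what smaller chunks could look like, assuming the retrieval search from the original post; the 7-day window boundaries are illustrative assumptions, not values given in this thread:

```spl
index="summary" source="SummaryIndex_Main" earliest=-7d@d latest=@d
| table *
```

The next chunk would then cover the preceding window, for example `earliest=-14d@d latest=-7d@d`, and so on until the full period is covered.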

maayan · ‎08-13-2023

But my problem is extracting data from the index, not writing to it.
(When I created the index, I scheduled a daily job over a range of 24 hours, so the data of 1.8, 2.8, ..., 6.8 was collected separately.)

Now I need to use the data that was collected in the index and present 30 days of data.

ITWhisperer · ‎08-13-2023

First check to see if you can retrieve events, e.g.

index=summary source="SummaryIndex_Main" | table *

Then see if you can do any stats for this index e.g.

index=summary source="SummaryIndex_Main" | stats count by TimeStamp,source

maayan · ‎08-13-2023

Hi,

First query:
I get the results. The error still appears, but I get the data. The problem is that I need to extract the original fields, and I'm not sure that I get all of them. I will check.
In addition, it takes time to extract, and I thought the summary index was supposed to improve performance.

Second query:
The query stopped and the errors are:
"DAG Execution Exception: Search has been canceled
Search auto-canceled
Events might not be returned in sub-second order due to search memory limits. See search.log for more information. Increase the value of the following limits.conf setting:[search]:max_rawsize_perchunk.
The search job has failed due to an error. You may be able view the job in the"

thanks!

ITWhisperer · ‎08-14-2023

Summary indexes can improve performance when you are performing aggregation functions such as count or avg. When you are just collecting all the values of all the fields (particularly by a field which is likely to have lots of unique values), you are not using summary indexes in the manner they were intended to be used and not gaining very much (if anything) in performance. Your summary index probably has the same amount of data as your original index.
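
As an illustration of the aggregated use described here, a minimal sketch of a summary-populating search. It assumes a hypothetical field named eventType that holds the event type (the thread never names this field) and reuses the index and source names from the original post:

```spl
index=myIndex source=mySource sourcetype=_json
| sistats count by eventType, source
```

With summary indexing enabled on that scheduled report, the matching retrieval search would apply the same aggregation with stats:

```spl
index=summary source="SummaryIndex_Main"
| stats count by eventType, source
```

In this shape the summary index stores partial counts per eventType and source instead of a copy of every event, which is where the performance gain would come from.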

maayan · ‎08-15-2023

OK, I understand. So what do you recommend I do in my case to improve performance? It takes too long to load and run queries.

And how can I extract all the original fields from the index,
in case I run the report without aggregation (without stats) to pull data from the index:

index=summary source="SummaryIndex_Main" | table *

ITWhisperer · ‎08-15-2023

It depends on what you are ultimately trying to do. Just copying the events from the main index to a "summary" index saves you nothing, and indeed seems to make things worse. You could try breaking up your search time frames into smaller chunks.
