Getting Data In

What am I doing wrong with my summary index?

maayan
Path Finder

Hi,

I'm working with a large amount of data. I have a main report that extracts all data from the previous month, and 5 additional small reports that filter by event type and take only the fields that are relevant for that event. For example: report 1 for event A, report 2 for event B, and so on.
In order to improve the performance I want to use a summary index.

I read the documentation and I'm doing the following:

  1. Create a report in "Searches, Reports, and Alerts".
    The query is:
    index=myIndex source=mySource sourcetype=_json
    | rename… | table …
    | stats values(*) as * by TimeStamp,source
    | lookup lookUp_table_toAdd_Fields.csv source AS source
    | sistats values(*) as * by TimeStamp,source

  2. Enable summary index
     
  3. Schedule the report to run daily over a 24-hour range.
     
  4. Create a new search to extract data saved in the index:
    index="summary" source="SummaryIndex_Main"
    | stats values(*) as * by TimeStamp,source | table *

    Date range: only 6 days (data between 1.8 and 6.8; only 987,771 events)

  5. Results:
    When it runs it looks like it is collecting the data, but when the run finishes, the Statistics tab contains no results and I get the error: "The following error(s) occurred while the search ran. Therefore, search results might be incomplete."

maayan_0-1691919023478.png

I don’t have permission to change the config files and I'm not sure what I'm doing wrong.

Please help!!
 
*Note: I need to extract all the original fields from the main query; this is why I use sistats and stats (and not collect). I have no aggregation. I just need to extract the data and be aware of overlaps.
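(For reference, the collect alternative mentioned above would look roughly like this - a sketch only, reusing the index and source names from the steps above:

index=myIndex source=mySource sourcetype=_json
| stats values(*) as * by TimeStamp,source
| collect index=summary source="SummaryIndex_Main"

Here collect writes the search results directly into the summary index without requiring si- commands.)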

* Relevant questions that I have posted:
https://community.splunk.com/t5/Reporting/Summary-index-for-non-aggregated-data-How-to-read-only-del...

https://community.splunk.com/t5/Reporting/Why-sistats-doesn-t-work-after-lookup/m-p/653864#M12170

Thanks,

Maayan


ITWhisperer
SplunkTrust

What does it show when you click on Show errors?


maayan
Path Finder

"Events might not be returned in sub-second order due to search memory limits. See search.log for more information. Increase the value of the following limits.conf setting:[search]:max_rawsize_perchunk."

I don't have permission to change the config files. What am I doing wrong in my steps? In this case I extracted only 987,771 events.


ITWhisperer
SplunkTrust

Probably trying to index too many events - try reducing the timespan and/or breaking up your search into multiple smaller chunks.


maayan
Path Finder

But I have a problem extracting data from the index, not writing to the index.
(When I created the index, I scheduled a daily job over a range of 24 hours (data of 1.8, 2.8, ..., 6.8 separately).)

Now I need to use the data that was collected in the index and present 30 days of data.


ITWhisperer
SplunkTrust

First check to see if you can retrieve events, e.g.

index=summary source="SummaryIndex_Main" | table *

Then see if you can do any stats for this index e.g.

index=summary source="SummaryIndex_Main" | stats count by TimeStamp,source

 


maayan
Path Finder

Hi ,

First query:
I get the results. The error still exists, but I get the data. The problem is that I need to extract the original fields; I'm not sure I'm getting all of them, so I will check.
In addition, it takes time to extract, and I thought the summary index was supposed to improve the performance.

Second query:
The query stopped and the errors are:
"DAG Execution Exception: Search has been canceled 
Search auto-canceled
Events might not be returned in sub-second order due to search memory limits. See search.log for more information. Increase the value of the following limits.conf setting:[search]:max_rawsize_perchunk.The search job has failed due to an error. You may be able view the job in the"


thanks!


ITWhisperer
SplunkTrust

Summary indexes can improve performance when you are performing aggregation functions such as count or avg. When you are just collecting all the values of all the fields (particularly by a field which is likely to have lots of unique values), you are not using summary indexes in the manner they were intended to be used and not gaining very much (if anything) in performance. Your summary index probably has the same amount of data as your original index.
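A sketch of the aggregation-oriented pattern being described (count is just an illustrative aggregation; the index and source names are the ones used earlier in the thread). The scheduled populating search would use an si- command:

index=myIndex source=mySource sourcetype=_json
| sistats count by source

and the reporting search would run the matching non-si command against the summary:

index=summary source="SummaryIndex_Main"
| stats count by source

Because each summary row holds a pre-aggregated count rather than every field value, the summary stays much smaller than the original index - which is where the performance gain comes from.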


maayan
Path Finder

OK, I understand. So what do you recommend I do in my case to improve the performance? It takes too long to load and run queries.

And how can I extract all the original fields from the index, in case I execute the report without aggregation (without stats) to pull data from the index:

index=summary source="SummaryIndex_Main" | table *

 


ITWhisperer
SplunkTrust

It depends on what you are ultimately trying to do. Just copying the events from the main index to a "summary" index saves you nothing, and indeed seems to make things worse. You could try breaking up your search timeframes into smaller chunks.
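A sketch of the chunking idea (the 7-day slice below is illustrative; the index and source names are from the thread) - run the reporting search once per slice instead of once over the whole 30 days:

index=summary source="SummaryIndex_Main" earliest=-30d@d latest=-23d@d
| stats values(*) as * by TimeStamp,source

Repeat with the earliest/latest window shifted forward 7 days until the full 30-day range is covered, then combine the results. Each chunk processes fewer events per run, which helps stay under the search memory limits mentioned in the error.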
