Summary index for non aggregated data: How to read only delta data each time?

maayan — Mon, 07 Aug 2023 18:56:06 GMT

Hi,

I'm working with a large amount of data.
I wrote a main report that extracts all events (let's call them events A, B, C, D) from the last 30 days and does some manipulation of the fields.
Then I wrote 5 reports that filter the main saved report by event type and keep only the relevant fields for each event.
For example, the report for event A contains all fields relevant to event A,
the report for event B contains all fields relevant to event B, and so on.

My dashboard contains 5 tabs, one for each event (tab 1 for report A, tab 2 for report B, ...), and each tab triggers the relevant saved report (report A, B, C, ...).

Problem: all the reports run very slowly.

My questions:
1. How can I read only the delta data each time? I mean, how do I avoid reading all 30 days at once? If the query has already run today and I execute it again, it should read only the new data and reuse the historical data that was already read in the previous run.

2. I read a bit about summary indexes. My reports extract all fields and do not aggregate data. How do I create my 6 reports (main + 5 others) with a summary index? As I said, I use the table command and not functions like top, count, ... in my query (my reports just extract the relevant fields with some renaming).

* If you recommend using a summary index, I would appreciate some example code, because I have 6 reports and I'm not sure how to work with a summary index.

thanks,
Maayan

Re: Summary index for non aggregated data

gcusello — Mon, 07 Aug 2023 09:17:53 GMT

Hi @maayan,

Yes, the solution could be summary indexes or data models.

In both cases, you have to schedule a search, e.g. for report 1:

index=index1 | table _time field1 field2 field3 | collect index=summary1

The frequency depends on your requirements.

Then you can run a search on this index, e.g. to calculate the sum of field2 for each field1:

index=summary1 | stats sum(field2) AS field2_sum BY field1
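
To read only the delta on each run, the populating search can be scheduled over a fixed, non-overlapping time window, so that every run collects just the events from the period since the previous run. A minimal sketch, assuming an hourly schedule; index1, summary1, event_type and the field names are placeholders, not taken from your data:

```spl
index=index1 earliest=-1h@h latest=@h
| table _time event_type field1 field2 field3
| collect index=summary1
```

Each dashboard tab can then search the much smaller summary index over the last 30 days and filter by event type (again, event_type is an assumed field name):

```spl
index=summary1 event_type="A" earliest=-30d@d
| table _time field1 field2
```

Snapping both boundaries to the hour (@h) keeps consecutive runs from overlapping or leaving gaps. How timestamps are assigned to the collected events is worth checking against the collect documentation for your version.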

Ciao.

Giuseppe

Re: Summary index for non aggregated data

maayan — Mon, 07 Aug 2023 10:48:59 GMT

OK, I will try, thanks!

And regarding my first question, is that something I can do in Splunk (read only the delta data)?

And which method do you recommend in my case: data model or summary index?

thanks

Re: Summary index for non aggregated data

gcusello — Mon, 07 Aug 2023 11:03:42 GMT

Hi @maayan,

Yes, you can calculate deltas, global and partial sums, etc.

The main job is building the scheduled search that extracts the requested data.

In my opinion, I'd use a summary index, scheduling the populating search with the frequency you need (e.g. every month or every night).
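
As a rough illustration of the delta and the global and partial sums mentioned above, assuming "delta" means the difference between consecutive values and that the summary index holds field1 and field2 as in the earlier example (all output field names are made up for this sketch):

```spl
index=summary1
| sort 0 _time
| delta field2 AS field2_delta
| streamstats sum(field2) AS field2_running_sum BY field1
| eventstats sum(field2) AS field2_total BY field1
```

Here delta gives the change from the previous result, streamstats gives a partial (running) sum per field1, and eventstats adds the global sum per field1 to every row. Note that delta has no BY clause, so it compares each result with the previous one regardless of field1.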

Ciao.

Giuseppe

Re: Summary index for non aggregated data

maayan — Tue, 08 Aug 2023 09:00:56 GMT

Thanks Gcusello!
Can you explain more about: "you can calculate delta, global and partial sum, etc..." ?
I couldn't find any documentation, and I also asked in other communities, but nobody knew.

Re: Summary index for non aggregated data

maayan — Sun, 13 Aug 2023 09:36:28 GMT

Hi, I tried to implement the summary index as you suggested, but I had a problem extracting the original fields from the main query. I read that I might use stats and stats. I posted a new post; maybe you can help. Thanks.

Re: Summary index for non aggregated data

gcusello — Sun, 13 Aug 2023 14:25:28 GMT

Hi @maayan,

It all depends on the data you have (which I don't know). For example, if field1 is the hostname and field2 is the CPU utilization, the scheduled search can save the min, max and avg CPU utilization day by day:

index=index1 | stats min(CPU) AS min_CPU max(CPU) AS max_CPU avg(CPU) AS avg_CPU BY host | collect index=summary1

Then you can calculate (using normal commands such as stats or timechart) the max, avg and min over a month:

index=summary1 | stats min(min_CPU) AS min_CPU max(max_CPU) AS max_CPU avg(avg_CPU) AS avg_CPU BY host

As I said, it depends on the data that you added to your summary index.
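
One caveat with the two searches above: averaging the daily averages weights every day equally, which can differ from the true monthly average when days have different numbers of events. A hedged sketch of one way around this, assuming the populating search runs once a day over the previous day and that storing a sum and a count is acceptable (sum_CPU and count_CPU are names introduced here for illustration):

```spl
index=index1 earliest=-1d@d latest=@d
| bin _time span=1d
| stats min(CPU) AS min_CPU max(CPU) AS max_CPU sum(CPU) AS sum_CPU count(CPU) AS count_CPU BY _time host
| collect index=summary1
```

The monthly search can then rebuild the average from the stored sums and counts:

```spl
index=summary1 earliest=-30d@d
| stats min(min_CPU) AS min_CPU max(max_CPU) AS max_CPU sum(sum_CPU) AS sum_CPU sum(count_CPU) AS count_CPU BY host
| eval avg_CPU = sum_CPU / count_CPU
```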

Ciao.

Giuseppe