Hi,
I'm working with a large amount of data.
I wrote a main report that extracts all events (let's call them events A, B, C, D) from the last 30 days and does some manipulations on the fields.
Then I wrote 5 reports that filter the main saved report by event type and keep only the fields relevant for each event:
For example, the report for event A contains all the fields relevant for event A,
the report for event B contains all the fields relevant for event B, and so on.
My dashboard contains 5 tabs, one for each event (tab 1 for report A, tab 2 for report B, ...), and each tab triggers the relevant saved search report (report A/B/C, ...).
Problem: all the reports run very slowly.
My questions:
1. How can I read only the delta data each time? I mean, how do I avoid reading all 30 days at once on every run: if the query already ran today and I execute it again, it should read only the new data and reuse the historical data that was already read in the previous run.
2. I read a bit about summary indexes. My reports extract all fields and do not aggregate data. How do I build my 6 reports (main + 5 others) with a summary index? As I said, I use the table command and not functions like top, count, ... in my queries (my reports just extract the relevant fields with some naming manipulations).
* In case you recommend using a summary index, I would appreciate example code, because I have 6 reports and I'm not sure how to work with a summary index.
thanks,
Maayan
Hi @maayan,
yes, the solution could be summary indexes or data models.
In both cases, you have to schedule a search, e.g. for report 1:
index=index1
| table _time field1 field2 field3
| collect index=summary1
the frequency depends on your requirements.
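Regarding the delta question, one common pattern (a sketch, assuming the population search is scheduled to run once per day) is to let it cover only the previous day, so each run reads only the new data and appends it to summary1, while the dashboard searches query the summary index over the full 30 days:
index=index1 earliest=-1d@d latest=@d
| table _time field1 field2 field3
| collect index=summary1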
then you can run a search on this index, e.g. to calculate the sum of field2 for each value of field1:
index=summary1
| stats sum(field2) AS field2_sum BY field1
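And for the five per-event reports (a sketch, assuming the raw events carry a field such as event_type and that this field is kept in the population search; fieldA1 and fieldA2 are placeholder names), each dashboard tab can simply filter the summary index and table the fields it needs, without any aggregation:
index=summary1 event_type=A
| table _time fieldA1 fieldA2
| rename fieldA1 AS field_A1_display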
Ciao.
Giuseppe
OK, I will try, thanks!
And regarding my first question, is that something I can do in Splunk (read only the delta data)?
And which method do you recommend in my case: data model or summary index?
thanks
Hi, I tried to implement the summary index as you suggested, but I had a problem extracting the original fields from the main query. I read that I might need to use stats. I posted a new question; maybe you can help. Thanks.
Hi @maayan,
yes, you can calculate deltas, global and partial sums, etc.
the main job is building the scheduled search to extract the requested data.
in my opinion, I'd use a summary index, scheduling the population search with the frequency you need (e.g. every month or every night).
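For example, deltas and running (partial) sums can be computed on top of the summary index with standard commands like streamstats and delta (a sketch reusing field2 from the earlier example):
index=summary1
| bin _time span=1d
| stats sum(field2) AS daily_sum BY _time
| streamstats sum(daily_sum) AS running_total
| delta daily_sum AS day_over_day_delta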
Ciao.
Giuseppe
Thanks Gcusello!
Can you explain more about: "you can calculate delta, global and partial sum, etc..." ?
I didn't find any documentation, and I also asked in other communities and nobody knows.
Hi @maayan,
it all depends on the data you have (which I don't know). For example, if field1 is the hostname and field2 is the CPU utilization, the scheduled search saves the min, max and avg CPU utilization day by day:
index=index1
| stats min(CPU) AS min_CPU max(CPU) AS max_CPU avg(CPU) AS avg_CPU BY host
| collect index=summary1
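If this population search is scheduled to run once per day over the previous day only, it already produces one row per host per day. If the scheduled window is ever longer than a day, an explicit daily bucket keeps the day-by-day granularity (a sketch of the same search with bin added):
index=index1
| bin _time span=1d
| stats min(CPU) AS min_CPU max(CPU) AS max_CPU avg(CPU) AS avg_CPU BY _time host
| collect index=summary1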
then you can calculate (using normal commands such as stats or timechart) the max, the avg and the min over a month:
index=summary1
| stats min(min_CPU) AS min_CPU max(max_CPU) AS max_CPU avg(avg_CPU) AS avg_CPU BY host
As I said, it depends on the data that you added to your summary index.
Ciao.
Giuseppe