Solved: Re: How to get a peak count while still including ...

dirtebird · ‎09-06-2022

I have this query that gets current CURRENT_OUT counts by DISTRICT

This works to get current counts because the db source is updated every 5 minutes in splunk DB connect. I get a nice table of CURRENT_OUT by DISTRICT.

Is is possible to expand this to add a peak value for CURRENT_OUT over, say, the last 24 hours, while still including the current CURRENT_OUT value in the table as well? I'm looking at the bin command but I can't put it together. Once I expand my timeframe for my query, I'm bringing back way too much data an overinflating the current CURRENT_OUT values for each DISTRICT. Thanks

ITWhisperer · ‎09-06-2022

Does something like this work for you?

index=<my index> sourcetype=oracle:query source=<source> earliest=-1d 
| bin span=5m _time
| stats sum(OUT_CUSTS) as CURRENT_OUT by DISTRICT, _time
| stats max(CURRENT_OUT) as PEAK_OUT latest(CURRENT_OUT) as CURRENT_OUT by DISTRICT
| sort by DISTRICT
| fields DISTRICT, CURRENT_OUT, PEAK_OUT

View solution in original post

ITWhisperer · ‎09-06-2022

Does something like this work for you?

index=<my index> sourcetype=oracle:query source=<source> earliest=-1d 
| bin span=5m _time
| stats sum(OUT_CUSTS) as CURRENT_OUT by DISTRICT, _time
| stats max(CURRENT_OUT) as PEAK_OUT latest(CURRENT_OUT) as CURRENT_OUT by DISTRICT
| sort by DISTRICT
| fields DISTRICT, CURRENT_OUT, PEAK_OUT

dirtebird · ‎09-07-2022

Wow! This is so close to what I needed, thank you! The only issue is with the latest(CURRENT_OUT)

| stats max(CURRENT_OUT) as PEAK_OUT latest(CURRENT_OUT) as CURRENT_OUT by DISTRICT

Many times, a district will not have any data for a given time slice. I would only want to bring back a non-zero value if the timestamp was the "latest" time slice, here's an example where 8:00am is "latest":

_time	DISTRICT	OUT_CUSTS
8:00am	A	15
7:55am	A	10
7:50am	A	23
7:55am	B	1
7:50am	B	3

Desired result:

DISTRICT	CURRENT_OUT	PEAK_OUT
A	15	23
B	0	3

Actual result:

DISTRICT	CURRENT_OUT	PEAK_OUT
A	15	23
B	1	3

Here B has no 8:00am data. latest(CURRENT_OUT) returns the value 1 from the 7:55am time slice, but I want to see 0 because there is no data for district B for 8:00am. Is there a way to accomplish this without changing the way data is being fed into splunk?

ITWhisperer · ‎09-07-2022

Try something like this - note that timechart uses the earliest and latest settings to determine which time period to use.

| timechart span=5m sum(OUT_CUSTS) by DISTRICT
| fillnull value=0
| untable _time DISTRICT CURRENT_OUT
| stats max(CURRENT_OUT) as PEAK_OUT latest(CURRENT_OUT) as CURRENT_OUT by DISTRICT

dirtebird · ‎09-07-2022

This works, however the other wrinkle is that I have another column i want to do a count() on called INCIDENT_ID.

| timechart span=5m count(INCIDENT_ID) as CURRENT_INCIDENTS sum(OUT_CUSTS) as CURRENT_OUT by DISTRICT 
| fillnull value=0 
| untable _time DISTRICT CURRENT_INCIDENTS_OR_OUT

Gives me results that look like

_time	DISTRICT	CURRENT_INCIDENTS_OR_OUT
10:00	CURRENT_INCIDENTS: ABC	5
10:00	CURRENT_INCIDENTS: XYZ	4
10:00	CURRENT_OUT: ABC	1
10:00	CURRENT_OUT: XYZ	1

So I end up with the INCIDENT count and CURRENT_OUT sum in the same column

Am I out of luck? Either way, what you've been able to show me is nothing short of wizardry and I thank you for your help!

ITWhisperer · ‎09-07-2022

It is a bit more complicated, but try this

| timechart span=5m sum(OUT_CUSTS) as OUT_CUSTS count(INCIDENT_ID) as INCIDENTS by DISTRICT
| fillnull value=0
| foreach "INCIDENTS: *"
    [| eval "<<MATCHSEG1>>"='<<FIELD>>']
| foreach "OUT_CUSTS: *"
    [| eval "<<MATCHSEG1>>"='<<MATCHSEG1>>'.":".'<<FIELD>>']
| fields - INCIDENT* OUT_CUSTS*
| untable _time DISTRICT CURRENT_INCIDENT_OUT
| eval CURRENT_OUT=mvindex(split(CURRENT_INCIDENT_OUT,":"),0)
| eval INCIDENTS=mvindex(split(CURRENT_INCIDENT_OUT,":"),1)
| stats max(CURRENT_OUT) as PEAK_OUT latest(CURRENT_OUT) as CURRENT_OUT latest(INCIDENTS) as INCIDENTS by DISTRICT

dirtebird · ‎09-08-2022

This worked perfectly! Amazing to run your search step by step to see how you solved this. You gave me an education in Splunk, thank you very much for your efforts! I'm marking your first answer as the solution since it's simpler and solves the original question that was asked. Thanks again for your help!

thesplunkmonkey · ‎09-06-2022

I generally hate recommending any solution that includes any sort of subsearch due to the performance hit, but here is one potential option for you.

index=<my index> sourcetype=oracle:query source=<source> earliest=-5m
| stats sum(OUT_CUSTS) as CURRENT_OUT by DISTRICT 
| append 
    [ index=<my index> sourcetype=oracle:query source=<source> earliest=-1d 
    | bin span=5m _time
    | stats sum(OUT_CUSTS) as CURRENT_OUT by DISTRICT, _time
    | stats max(OUT_CUSTS) as PEAK_OUT by DISTRICT] 
| stats values(PEAK_OUT) as PEAK_OUT, values(CURRENT_OUT) as CURRENT_OUT by DISTRICT
| sort by DISTRICT
| fields DISTRICT, CURRENT_OUT, PEAK_OUT

dirtebird · ‎09-08-2022

It didn't like the index command inside the search. I didn't spend much time troubleshooting it, since the answer above ended up working for my needs. Thank you for taking the time to reply.

How to get a peak count while still including current counts?

table

Splunk + ThousandEyes: Correlate frontend, app, and network data to troubleshoot ...

Splunk Observability for AI

🔐 Trust at Every Hop: How mTLS in Splunk Enterprise 10.0 Makes Security Simpler

Are you a member of the Splunk Community?

How to get a peak count while still including current counts?

table

Splunk + ThousandEyes: Correlate frontend, app, and network data to troubleshoot ...

Splunk Observability for AI

🔐 Trust at Every Hop: How mTLS in Splunk Enterprise 10.0 Makes Security Simpler