Splunk Search

How to table/chart over a period of time

clintla
Contributor

trying to calculate groupings of VMs capacity growth over time but a chart or table looks to be the best answer if you need to report on 100 VMs.

In a simplified data set per below
Date ,Name,Capacit Used
5/1/2019, VM1,100
5/1/2019, VM2,100
5/1/2019, VM4,450
6/1/2019, VM1,100
6/1/2019, VM2,140
6/1/2019, VM4,450
7/1/2019, VM1,105
7/1/2019, VM2,200
8/1/2019, VM1,110
8/1/2019, VM2,200
9/1/2019, VM1,110
9/1/2019, VM2,200
10/1/2019,VM1,110
10/1/2019,VM2,200
10/1/2019,VM3,100
11/1/2019,VM1,110
11/1/2019,VM2,200
11/1/2019,VM3,200

How can you search it so that if you search for 7/1/2019 through 11/1/2019 that the result would be tabled as

VM1 5GB
VM2 0GB
VM3 200GB

So this almost needs to be like a delta except that its clobbered by VMs that are not in the beginning or ending of the time range.

If a VM is created in the time range then its starting capacity should be 0

0 Karma

woodcock
Esteemed Legend

This definitely does what you need but I am unsure if the all_min_time calculation needs to go before or after the where:

| makeresults 
| eval _raw="Date,Name,Capacity Used
5/1/2019,VM1,100
5/1/2019,VM2,100
5/1/2019,VM4,450
6/1/2019,VM1,100
6/1/2019,VM2,140
6/1/2019,VM4,450
7/1/2019,VM1,105
7/1/2019,VM2,200
8/1/2019,VM1,110
8/1/2019,VM2,200
9/1/2019,VM1,110
9/1/2019,VM2,200
10/1/2019,VM1,110
10/1/2019,VM2,200
10/1/2019,VM3,100
11/1/2019,VM1,110
11/1/2019,VM2,200
11/1/2019,VM3,200" 
| multikv forceheader=1 
| eval _time = strptime(Date, "%m/%d/%Y") 
| sort 0 - _time 

| rename COMMENT AS "Everything above generates sample event data; everything below is your solution" 

| eval _time = strptime(Date, "%m/%d/%Y") 
| where _time >= strptime("7/1/2019", "%m/%d/%Y") AND _time <= strptime("11/1/2019", "%m/%d/%Y") 
| eventstats min(_time) AS all_min_time
| stats count first(all_min_time) AS all_min_time min(_time) AS min_time range(Capacity_Used) AS option1 min(Capacity_Used) AS min max(Capacity_Used) AS max first(Capacity_Used) AS newest last(Capacity_Used) AS oldest BY Name 
| eval first = if(my_min_time==all_min_time, first, 0)
| eval option2 = max - min, option3 = newest - oldest 
| foreach option* [ eval <<FIELD>> = <<FIELD>> . "GB" ]
| table Name option* * 1
0 Karma

clintla
Contributor

I was actually looking at the same thing w/ all_min_time, looks promising.

0 Karma

clintla
Contributor

When I move those times (all_min_time) to my actual search.. they are all the same.

I dont think there is an easy solution. (although it seems there should be).

0 Karma

woodcock
Esteemed Legend

It absolutely does work. The problem is that you are not giving an accurate excerpt of your events.

0 Karma

clintla
Contributor

Almost have this working
I think this has a fatal flaw,
| eventstats min(_time) AS all_min_time
if you search 1 system, the eventstats command clobbers it because the all time minimum will be the earliest of a system even if not the whole time picker range.

I guess the question is how would we get the earliest time from the time picker and be able to use it in a chart?

0 Karma

woodcock
Esteemed Legend

The problem is that your example doesn't have the use-case that you are complaining about. In any case, I have modified my answer to accommodate it as such: if there is only 1 event in the results set for any VM Name, then the value is listed as the growth:

| makeresults 
| eval _raw="Date,Name,Capacity Used
5/1/2019,VM1,100
5/1/2019,VM2,100
5/1/2019,VM4,450
6/1/2019,VM1,100
6/1/2019,VM2,140
6/1/2019,VM4,450
7/1/2019,VM1,105
7/1/2019,VM2,200
8/1/2019,VM1,110
8/1/2019,VM2,200
9/1/2019,VM1,110
9/1/2019,VM2,200
10/1/2019,VM1,110
10/1/2019,VM2,200
10/1/2019,VM3,100
11/1/2019,VM1,110
11/1/2019,VM2,200
11/1/2019,VM3,200" 
| multikv forceheader=1 
| eval _time = strptime(Date, "%m/%d/%Y") 
| sort 0 - _time 

| rename COMMENT AS "Everything above generates sample event data; everything below is your solution" 

| eval _time = strptime(Date, "%m/%d/%Y") 
| where _time >= strptime("7/1/2019", "%m/%d/%Y") AND _time <= strptime("11/1/2019", "%m/%d/%Y") 
| stats count range(Capacity_Used) AS option1 min(Capacity_Used) AS min max(Capacity_Used) AS max first(Capacity_Used) AS newest last(Capacity_Used) AS oldest BY Name 
| eval option2 = max - min, option3 = newest - oldest 
| foreach option* [ eval <<FIELD>> = if(count==1, min, <<FIELD>>) . "GB" ] 
| table Name option* * 1
0 Karma

clintla
Contributor

VM3 is one that is added in the time range & is what the main problem that I've been trying to fix.

Lemme check this out though & see if I can use it.

Thanks for the ongoing answers.. it does help in me learning more & being able to try new things

0 Karma

clintla
Contributor

Still, I guess I am not explaining it well enough or there is just not a straight-forward solution for this.

VM3 is the use case I was describing. It simply comes in during the time range

The more I look at it, the more I think it needs to be solved w/ a _time solution.

If a VM is removed from the selected time range.. it also wont work. (like VM4 if you do all-time)

if you count events, it doesnt work (doesnt tell you to start or end with a 0)

Anyway.. thanks for the efforts.. I will continue to plug away at it.

0 Karma

woodcock
Esteemed Legend

OK, I finally get it. Stand by for one more answer.

0 Karma

woodcock
Esteemed Legend

Like this:

| makeresults
| eval _raw="Date,Name,Capacity Used
5/1/2019,VM1,100
5/1/2019,VM2,100
5/1/2019,VM4,450
6/1/2019,VM1,100
6/1/2019,VM2,140
6/1/2019,VM4,450
7/1/2019,VM1,105
7/1/2019,VM2,200
8/1/2019,VM1,110
8/1/2019,VM2,200
9/1/2019,VM1,110
9/1/2019,VM2,200
10/1/2019,VM1,110
10/1/2019,VM2,200
10/1/2019,VM3,100
11/1/2019,VM1,110
11/1/2019,VM2,200
11/1/2019,VM3,200"
| multikv forceheader=1
| eval _time = strptime(Date, "%m/%d/%Y")
| sort 0 - _time

 | rename COMMENT AS "Everything above generates sample event data; everything below is your solution"

 | eval _time = strptime(Date, "%m/%d/%Y")
 | where _time >= strptime("7/1/2019", "%m/%d/%Y") AND _time <= strptime("11/1/2019", "%m/%d/%Y")
 | stats range(Capacity_Used) AS option1 min(Capacity_Used) AS min max(Capacity_Used) AS max first(Capacity_Used) AS newest last(Capacity_Used) AS oldest BY Name
 | eval option2 = max - min, option3 = newest - oldest
 | foreach option* [ eval <<FIELD>> = <<FIELD>> . "GB" ]
 | table Name option* *
0 Karma

clintla
Contributor

Again for 7/1/2019 through 11/1/2019 that the result would be tabled as

VM1 5GB
VM2 0GB
VM3 200GB

(this is the actual growth)

If the VM doesnt exist at the beginning of a selected time frame.. it should be 0.

your results seem to be tied to min/max or earliest/latest but if the VM is not in the beginning or end of the range.. should start or end at 0 respectively.

Name option1 option2 eval option3 max min newest oldest
VM1 5GB 5GB 5 110 105 110 105
VM2 0GB 0GB 0 200 200 200 200
VM3 100GB 100GB 100 200 100 200 100

0 Karma

clintla
Contributor

should there be a fillnul & then add in a token for TimeRange.earliest=-30d ?

0 Karma

clintla
Contributor

Maybe a better example is just this.

if you had a VM that was created last month at 100GB and you did a search
for year to date.

your growth would be 100GB.

if you do any variation of min/max, earliest/latest, range.. you'll show 0GB growth
because earliest/latest/max/min/range is all going to show the same value

but the true growth is 100GB

0 Karma

woodcock
Esteemed Legend

OK, one more time. Stand by for new answer...

0 Karma

to4kawa
Ultra Champion
| makeresults 
| eval _raw="Date ,Name,Capacit_Used
5/1/2019, VM1,100
5/1/2019, VM2,100
5/1/2019, VM4,450
6/1/2019, VM1,100
6/1/2019, VM2,140
6/1/2019, VM4,450
7/1/2019, VM1,105
7/1/2019, VM2,200
8/1/2019, VM1,110
8/1/2019, VM2,200
9/1/2019, VM1,110
9/1/2019, VM2,200
10/1/2019,VM1,110
10/1/2019,VM2,200
10/1/2019,VM3,100
11/1/2019,VM1,110
11/1/2019,VM2,200
11/1/2019,VM3,200"
| rex mode=sed "s/ //g"
| multikv
| table Date,Name,Capacit_Used
`comment("this is sample data")`
| eval _time=strptime(Date,"%m/%d/%Y") 
`comment("Adjust the period here if necessary")`
| stats earliest(_time) as first_time earliest(Capacit_Used) as first_value latest(_time) as last_time latest(Capacit_Used) as last_value
 range(Capacit_Used) as result by Name
| foreach *_time 
    [eval <<FIELD>> = strftime(<<FIELD>>,"%m/%d/%Y") ]
| eval result = result . "GB"

Hi, @clintla
Please evaluate the previous answer.

0 Karma

clintla
Contributor

This is running into the same problem I was having
Name first_time first_value last_time last_value result
VM1 05/01/2019 100 11/01/2019 110 10GB
VM2 05/01/2019 100 11/01/2019 200 100GB
VM3 10/01/2019 100 11/01/2019 200 100GB
VM4 05/01/2019 450 06/01/2019 450 0GB

Per the stats command-
| stats earliest(_time) as first_time earliest(Capacit_Used) as first_value latest(_time) as last_time latest(Capacit_Used) as last_value
range(Capacit_Used) as result by Name

earliest & latest clobber those VMs that were deleted or added in a time range.
for VM4- it should not show up in July to Nov but if it was in a time selection- it should read -450 (due to it was deleted)

This does give me some ideas to try though so thanks!

0 Karma

woodcock
Esteemed Legend

Like this (not sure about the units, though):

| makeresults | eval _raw="Date,Name,Capacity Used
5/1/2019,VM1,100
5/1/2019,VM2,100
5/1/2019,VM4,450
6/1/2019,VM1,100
6/1/2019,VM2,140
6/1/2019,VM4,450
7/1/2019,VM1,105
7/1/2019,VM2,200
8/1/2019,VM1,110
8/1/2019,VM2,200
9/1/2019,VM1,110
9/1/2019,VM2,200
10/1/2019,VM1,110
10/1/2019,VM2,200
10/1/2019,VM3,100
11/1/2019,VM1,110
11/1/2019,VM2,200
11/1/2019,VM3,200"
| multikv forceheader=1

| rename COMMENT AS "Everything above generates sample event data; everything below is your solution"

| eval _time = strptime(Date, "%m/%d/%Y")
| where _time >= strptime("7/1/2019", "%m/%d/%Y") AND _time <= strptime("11/1/2019", "%m/%d/%Y")
| stats sum(Capacity_Used) AS Capacity_Used BY Name
| eval Capacity_Used = Capacity_Used . "GB"
0 Karma

clintla
Contributor

Output from above
VM1 545GB
VM2 1000GB
VM3 300GB

This looks like its just adding up all the daily capacities reported.

Trying to look over any search & find a start/end time & track the growth. My example below was just from july to nov. havent found a way to make it work yet.

VM1 5GB
VM2 0GB
VM3 200GB

Think there is an easy way to just be able to pick any time frame & then track the growth of groups of VMs?

0 Karma

woodcock
Esteemed Legend

Got it. See my new answer.

0 Karma

yannK
Splunk Employee
Splunk Employee

If you extracted the fields, try to sum the capacity over the period, using a stats

mysearch | stats sum(capacity_used) by Name

If you want to see the usage evolution (not the sum of usage), timechart would be the best too but when using timechart and chart, the top 10 series are displayed only, you need to add a higher limit if you have 100 Vms

But the UI and graphs may not be able to show them.

Example

    mysearch | timechart span=1d max(capacity_used) AS totalUsage by Name limit=100

So I would recommend to use a stats instead to build a table.

  mysearch | bucket _time span=1d | stats max(capacity_used) AS totalUsageby _time Name
0 Karma
Get Updates on the Splunk Community!

Introducing Splunk Enterprise 9.2

WATCH HERE! Watch this Tech Talk to learn about the latest features and enhancements shipped in the new Splunk ...

Adoption of RUM and APM at Splunk

    Unleash the power of Splunk Observability   Watch Now In this can't miss Tech Talk! The Splunk Growth ...

Routing logs with Splunk OTel Collector for Kubernetes

The Splunk Distribution of the OpenTelemetry (OTel) Collector is a product that provides a way to ingest ...