mmedal

Explorer

08-14-2012
03:57 PM

I have a bunch of SAN usage data that I am inputting into Splunk that looks as follows, with each line representing an entry in Splunk:

```
Group: diskdg1 Disks: 21 Disk in use: data04 Capacity: 1%
Group: diskdg2 Disks: 21 Disk in use: data05 Capacity: 1%
Group: diskdg3 Disks: 5 Disk in use: data01 Capacity: 33%
Group: diskdg4 Disks: 34 Disk in use: data08 Capacity: 1%
Group: diskdg5 Disks: 30 Disk in use: data07 Capacity: 1%
Group: diskdg6 Disks: 38 Disk in use: data09 Capacity: 25%
```

What I would like to do is display a table with these fields, plus a new field displaying a "change in capacity" since 7 days ago. In other words, I would like to evaluate the difference between the capacity field now and the capacity field for that entry 7 days ago.

Can anyone assist me with a search?

Thanks so much, Matt

dwaddle

SplunkTrust

08-14-2012
08:07 PM

At first glance, the difference should be pretty easy - you can use the `delta` search command. But, `delta` lacks a `by` clause so you could only do one `Group` at a time - a bit of a limitation. But, I think you can use `streamstats` to roughly create a `delta` per-Group.

Assuming that your data above has field extractions for `Group` and `Capacity` then a search like this should get you close:

```
sourcetype=my_san_data
| streamstats window=7 global=f last(Capacity) as high first(Capacity) as low by Group
| eval delta=high-low
| table _time,Group,Capacity,delta
```

You may need to swap around high vs low just to get it to work out mathematically right. There is an assumption here that you are collecting this data once per day. The way this "should" work is `streamstats` will do a sliding window of 7 events per `Group` and use the first and last values of `Capacity` within each of those sliding windows to calculate a delta.
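To make the windowing assumptions explicit, here is a minimal sketch of the same idea. It assumes the raw events look exactly like the sample lines in the question, that no field extractions exist yet (hence the `rex`), and that events arrive once per day. The sourcetype name, the `DiskInUse` field name, and the `oldest`/`newest` field names are placeholders, not anything from the original setup. Sorting by `_time` ascending first means `first()` is the oldest event in each window and `last()` is the current one, which takes the guesswork out of which way round to subtract.

```
sourcetype=my_san_data
| rex "Group: (?<Group>\S+) Disks: (?<Disks>\d+) Disk in use: (?<DiskInUse>\S+) Capacity: (?<Capacity>\d+)%"
| sort 0 _time
| streamstats window=7 global=f first(Capacity) as oldest last(Capacity) as newest by Group
| eval delta=newest-oldest
| table _time, Group, Capacity, delta
```

The `rex` captures only the digits of `Capacity`, so the `eval` subtraction works on numbers rather than on strings like `33%`. As far as I understand the default `current=t` behaviour, the window includes the current event, so `window=7` covers the current reading plus the six before it; if a full seven-day gap matters, `window=8` may be closer to what you want.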

Obviously a sliding window of 7 events is not necessarily **strictly** 7 days. It depends on you collecting exactly once per day, every day, without missing one. If you are collecting once per hour, then you can adjust `window` to be 168 instead.

There are some more complicated ways of dealing with this, like maintaining state in lookups or using time-oriented subsearches, if you need higher precision than a sliding window. But, unless your accuracy requirements are very very high, this should be "close enough".
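As a rough illustration of the time-oriented subsearch approach, the sketch below compares the most recent reading per group with the reading from roughly seven days earlier. The time ranges, the output field names (`capacity_now`, `capacity_7d_ago`), and the assumption that `Capacity` is already extracted as a plain number are all illustrative choices, not something taken from the original data.

```
sourcetype=my_san_data earliest=-1d latest=now
| stats latest(Capacity) as capacity_now by Group
| join type=left Group
    [ search sourcetype=my_san_data earliest=-8d latest=-7d
    | stats latest(Capacity) as capacity_7d_ago by Group ]
| eval delta=capacity_now-capacity_7d_ago
| table Group, capacity_now, capacity_7d_ago, delta
```

Because this is a left join, a group with no event in the older range still shows up, just with an empty `delta`, which is usually easier to spot than a silently wrong number.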

Re: Data Field Entries Across Different Time Spans per Entry

mmedal

Explorer

08-15-2012
04:34 PM

Thanks for the feedback. Great answer to my question, it certainly is "close enough" haha.