Turn on suggestions

Auto-suggest helps you quickly narrow down your search results by suggesting possible matches as you type.

Showing results for

Splunk Search

Auto-suggest helps you quickly narrow down your search results by suggesting possible matches as you type.

Showing results for

- Community
- :
- Splunk Answers
- :
- Using Splunk
- :
- Splunk Search
- :
- Re: sum(x/y) and sum(x)/sum(y)

- Subscribe to RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark Topic
- Subscribe to Topic
- Mute Topic
- Printer Friendly Page

- Mark as New
- Bookmark Message
- Subscribe to Message
- Mute Message
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content

sum(x/y) and sum(x)/sum(y)

inglisn

Path Finder

09-14-2011
05:06 AM

This has stumped me for too long so I'm opening it up to the experts.

I have some event data of format "timestamp, Category, SubCategory, X, Y". The data has already been processed by the source system so that there are, say, 5000 events per timestamp. I want to filter on the Category, leaving about 500 relevant events to process.

X is a data volume (e.g. through an interface). I can generate the total using sum(X).

Y gives me the number of users. There is a fixed value of Y per SubCategory, so that for a given Category values(Y) would contain 4 or 5 data points. This is not really a problem as I can return the correct value in a subsearch through "join Category,SubCategory [search | stats | mvexpand | stats ]"

I want to calculate the average per-user volume for X for a given category and also for each subCategory within the category.

Using stats gives me:

SubCategory UsersInSubCategory sum(X) sum(X/Y)

A 100 100MB 1MB

B 200 200MB 1MB

Totals 300 300MB 2MB

This is correct when breaking out by SubCategory, but for the whole Category I cannot use sum(X/Y) as what I want is sum(X)/sum(Y). Since the underlying events are in the thousands avg(X/Y) gives me averages per event, not per SubCategory.

I'm beginning to think two queries might be easier.

- Mark as New
- Bookmark Message
- Subscribe to Message
- Mute Message
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content

inglisn

Path Finder

09-16-2011
02:31 AM

OK, so I played around an piping into eval started working. Go figure. Here's my current solution:

search and filter on Category

| stats first(_time) as timestr, values(Y) as y, sum(X) as x by date_year, date_month, date_mday, date_hour, SubCategory

| convert timeformat="%Y-%m-%d %H:00" ctime(timestr) as DateTime

| stats sum(y) as y, sum(x) as x by DateTime, SubCategory

| eval r=x/y

| chart sum(r) by DateTime, SubCategory

The raw data is grouped into 300s clumps comprising several thousand unique events. The first stats command creates the period over which the summaries are built and includes a timestamp which is used by the second stats to build the actual table. Then we eval the average and pipe into chart to produce something that can be put into a graph.

This allows me to keep or drop the SubCategory grouping in the second stats and the chart clauses and calculate the correct average over the Category.

By using stats I've lost the event timestamp so needed to manually re-insert it. I works, but not as nicely as timechart as the X-Axis strings are quite long and a bit ugly when plotting over a week (168 points per series). Changing it to summarise over different time periods would involve changing the 2nd and 3rd lines.

- Mark as New
- Bookmark Message
- Subscribe to Message
- Mute Message
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content

sophy

Splunk Employee

09-14-2011
10:20 AM

hi! have you tried using the eval command to calculate sum(X)/sum(Y)?

```
... | stats sum(X) AS x sum(Y) AS y | eval r=x/y
```

you can read more about eval in the search reference: http://docs.splunk.com/Documentation/Splunk/latest/SearchReference/Eval

- Mark as New
- Bookmark Message
- Subscribe to Message
- Mute Message
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content

acdevlin

Communicator

09-15-2011
09:23 AM

Instead of piping from one stats command into a second stats command, why not try using eventstats to calculate the sums?

```
... | eventstats sum(X) as sumX, sum(Y) as sumY | eval r=sumX/sumY | stats values(r)
```

- Mark as New
- Bookmark Message
- Subscribe to Message
- Mute Message
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content

inglisn

Path Finder

09-15-2011
02:35 AM

Get Updates on the Splunk Community!

Splunk Education believes in the value of training and certification in today’s rapidly-changing data-driven ...

Splunk Lantern is a customer success center that provides advice from Splunk experts on valuable data ...

As security threats and their complexities surge, security analysts deal with increased challenges and ...