Plain histogram of x-axis values over y-axis

mahikrrish

Explorer

06-08-2017
11:13 PM

Hi,

I want to create a plain and simple histogram in Splunk, like everyone used to do in their school days on graph paper. I have selected the "id" and "pr" fields. I want "id" on the x-axis and its corresponding value of "pr" on the y-axis. How should I do that? Splunk isn't allowing me to do that. I don't want to use Sum, Count, Max, Min, Standard Deviation, or Mode.

```
source="HVR_1 PageRank.csv" id="*" pr="*" | chart pr over id
```

Can anyone correct my code? Please!

sideview

SplunkTrust

06-09-2017
12:26 PM

I've had this sort of question come up a lot, and I thought maybe I'd give a different kind of answer, in case it was helpful or complementary.

The questions are more or less: "I want to just chart the raw values as points on a screen, typically as a timechart."

And they come up in two ways:

a) I don't want to bucket the times, and I don't want to think about avg/min/max, because there aren't very many of them! I just want the values on the screen

b) I don't want to bucket the times and/or think about avg/min/max because I want the human eye to see the storm of points as a scatter plot and I think that'll be better than some clever statistic.

And there are a few ways to answer it.

1) OK, you can throw the raw points at the chart, you just have to use no actual transforming command at all!

Here's a good canonical answer

https://answers.splunk.com/answers/211376/how-to-chart-raw-windows-perfmon-values-over-time.html

Con - If your time granularity exceeds (or greatly exceeds) the number of pixels on the screen, you're not going to have a good time, i.e. the "storm of points" may just be a weird fuzzy block of noise.

Con - The charting framework doesn't really like to graph tens or hundreds of thousands of points. Now or down the road, you might get some truncation and error messages about truncation.

2) Sometimes the correct answer is to really come back and use some statistical aggregation, and resign yourself to a particular bucketing of the time values. Here's a good, if verbose question that covers this:

https://answers.splunk.com/answers/386217/displaying-average-from-a-timechart.html

3) And there are sometimes other outlier answers, like this one that uses first() as a shoot-from-the-hip heuristic.

https://answers.splunk.com/answers/6216/how-to-plot-values-without-using-max-avg-count.html

But this seems, IMO, pretty problematic and potentially misleading. Use with caution.
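As a rough sketch of option 1 applied to the search in the original question: the idea is to skip the transforming command and hand the two fields straight to the visualization. This assumes the `id` and `pr` fields are extracted as shown in the question, and that the chart treats the first column as the x-axis and the second as the y-axis. The `sort 0 num(id)` line is an optional addition that orders the ids numerically without the default row limit.

```
source="HVR_1 PageRank.csv" id="*" pr="*"
| table id pr
| sort 0 num(id)
```

After running it, pick a column or scatter chart on the Visualization tab. The cons listed under option 1 still apply if the file has many thousands of rows.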

Kind of sprawling answer. Perhaps not really an "answer" at all and more of a "further reading" post.

DalJeanis

SplunkTrust

06-09-2017
07:39 AM

You are rejecting the methods that work. WHY?

You are focused on creating a histogram, which means that for each value of id, there must be a single unique numeric value of pr that constitutes how tall the bar will be.

What, exactly, does the value of pr mean? It must be a number, for the one-dimensional histogram you are asking for to exist.

If pr is not a number, then COUNT is the only aggregate function that makes sense. Use that. (If there are multiple possible values of pr for each id, you could use distinct count also, or you could abandon the single-dimension histogram in favor of something else.)

If pr is a number, and if there is only one event for each value of pr in each value of id, then SUM, MAX, MIN, AVG will all work and will all get the same answer.

If pr is a number, and there are multiple possible events for each combination of pr and id, then you need to decide exactly what you are trying to graph. Figure out the math for "how do I know how tall the bar needs to be?" and then code that into the chart command (or any other command).

On the other hand, if you want to do an x-y plot of various values, try visualizations that are not bar charts. Specifically, try the bubble chart and other x-y plots to see if they meet your need.
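To make the cases above concrete, here is how each one might look against the search from the question. These are sketches that assume the same source and field names; which one is right depends on what pr actually holds.

If pr is not a number, count the events per id, or count the distinct pr values per id:

```
source="HVR_1 PageRank.csv" id="*" pr="*" | chart count over id
```

```
source="HVR_1 PageRank.csv" id="*" pr="*" | chart dc(pr) over id
```

If pr is a number and each id has exactly one event, any of SUM, MAX, MIN, or AVG should draw the same bars, for example:

```
source="HVR_1 PageRank.csv" id="*" pr="*" | chart max(pr) over id
```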

mahikrrish

Explorer

06-09-2017
09:50 PM

DalJeanis

SplunkTrust

06-11-2017
12:05 PM

That depends on your query. If each value of id has only one value of pr returned by the query, and that value is numerical, then that value is indistinguishable from most aggregate functions: mathematically, it is equal to the max, the min, the mode, the mean, and the average; sequentially, it is the first, the last, the earliest, and the latest; set-wise, it is completely equivalent to the list() and the values(). So, for the case where the id-pr relationship is 1-1, almost any meaningful aggregate function will serve. (Okay, not the stdev, but that wouldn't be meaningful.)

When you `| chart sum(pr) over id`, then **for each id**, Splunk will calculate the sum of the pr values.
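One way to check whether the id-pr relationship really is 1-1 before relying on that is to count events and distinct pr values per id and keep only the ids that have more than one event. This is a sketch that assumes the source and field names from the question; `events` and `distinct_pr` are names made up here for illustration.

```
source="HVR_1 PageRank.csv" id="*" pr="*"
| stats count as events, dc(pr) as distinct_pr by id
| where events > 1
```

If this returns no rows, every id has a single event, so `chart sum(pr) over id` simply plots each pr value as it is. Any rows that do come back are the ids where the choice of aggregate function starts to matter.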

twinspop

Influencer

06-09-2017
07:37 AM

```
source="HVR_1 PageRank.csv" id="*" pr="*" | chart last(pr) as pr over id
```

But as rich7177 points out, this may not be exactly what you want.

Richfez

SplunkTrust

06-09-2017
07:26 AM

So what if there's more than one pr for one id? Which pr value should it use? How would Splunk know that?

Is there a time aspect to this data? Or is it only a "most recent value" type dataset?

Richfez

SplunkTrust

06-09-2017
07:24 AM

Do you have a handful of sample data you could provide?

mahikrrish

Explorer

06-09-2017
09:45 PM

Yes, I can provide it. "pr" is the PageRank of the "id" node. Each node has only 1 "pr". The following is a sample.

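| name | id | pr | count | name2 | id2 |
|---|---|---|---|---|---|
| 148 | 148 | 0.199162542 | 64 | 148 | 148 |
| 243 | 243 | 1.126083355 | 29 | 243 | 243 |
| 31 | 31 | 0.17263125 | 55 | 31 | 31 |
| 85 | 85 | 0.16646875 | 136 | 85 | 85 |
| 137 | 137 | 0.207598883 | 51 | 137 | 137 |
| 251 | 251 | 0.505910879 | 26 | 251 | 251 |
| 65 | 65 | 0.729124137 | 25 | 65 | 65 |
| 53 | 53 | 0.38208409 | 55 | 53 | 53 |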