Solved: Can you help me with my histogram visualization?

skribble5 · ‎12-27-2018

Hi all,

I would like to show my data via 2 different histograms, but I am having trouble figuring it out. Can someone help please.

Here is the summary:

The details are:
My data has a few key fields:

_time: this is the timestamp field for a record
read_category: this has 3 possible values (read, skimmed, glanced)
read_seconds: this is the field that stores number of seconds for a record. The max is 20 seconds.
city: city of the record (geolocation data)

I would like to make 2 visualizations using this data.

Visualization showing on:
x-axis: each hour of day (1 hour block)
y-axis: for each hour of day, how much % of total records for that hour was of a particular "read_category"
I would give splunk 2 inputs to create this every time — first input would be the city (e.g. London) and the other input would be the read_category (e.g. glanced)
For the second visualization:
x-axis: each hour of day (1 hour block)
y-axis: for each hour of day, what was the average duration (i.e. field read_seconds) of total records for that hour. The maximum possible average is 20 seconds; it can't be higher than that.
I would give splunk 1 input to create this every time - which would be the city (e.g. London)

I hope my question and screenshot is clear. This is the code so far, although it has nothing to do with my visualization attempts (which haven't worked for me):

| inputlookup xxx.csv | convert ctime(created_at) as Time timeformat=%y-%m-%dT%h:%m:%s | eval _time = strptime(created_at, "%Y-%m-%dT%H:%M:%S") | eval date_hour=strftime(_time, "%H") | table created_at, _time, date_hour, read_category, read_seconds, city

Appreciate any help you can provide!

p_gurav · ‎12-27-2018

Try this:

1. <your search> |  eventstats count as total_events | stats count values(total_events) as total_events by read_category, hour   | eval per=round(count/total_events*100)  | chart sum(per) by hour, read_category

2. <your_search> | stats avg(read_second) as avg_read_second by hour

View solution in original post

p_gurav · ‎12-27-2018

Try this:

1. <your search> |  eventstats count as total_events | stats count values(total_events) as total_events by read_category, hour   | eval per=round(count/total_events*100)  | chart sum(per) by hour, read_category

2. <your_search> | stats avg(read_second) as avg_read_second by hour

skribble5 · ‎12-28-2018

Thanks p_gurav. Number 2 is now giving me the right data in a table and I should be able to figure out now how to make a graph/histogram out of it.

Number 1 is giving me data in a table but I am expecting the percentages in each row to add up to 100% and currently it is not. I can't add a screenshot but here is the result for 6 am and 7 am:

date_hour | glanced | read | skimmed |
6 am | 0.6 | 1.4 | 0.7 |
7 am | 1.1 | 2.6 | 1.0 |

Each row of 6 am and 7 am should add up to 100%, but it doesn't. I think these are percentages of ALL EVENTS, but we need to show percentage for each read_category for THAT hour. I think we are very close.

The current code is:

| inputlookup xxx.csv | convert ctime(created_at) as Time timeformat=%y-%m-%dT%h:%m:%s | eval _time = strptime(created_at, "%Y-%m-%dT%H:%M:%S") | eval date_hour=strftime(_time, "%H") | eventstats count as total_events | stats count values(total_events) as total_events by read_category, date_hour   | eval per=round((count/total_events)*100,1)  | chart sum(per) by date_hour, read_category

Thanks!

niketn · ‎12-28-2018

@skribble5 , please try the following search for percent of Read Category Per Hour:

| inputlookup xxx.csv 
| eval _time = strptime(created_at, "%Y-%m-%dT%H:%M:%S")
| eval date_hour=strftime(_time, "%H") 
| chart count as Total count(eval(read_category=="glanced")) as glanced count(eval(read_category=="read")) as read count(eval(read_category=="skimmed")) as skimmed by date_hour
| foreach glanced read skimmed [ eval <<FIELD>>_perc=round((<<FIELD>>/Total)*100,2)]
| fields - Total glanced read skimmed

Following is a run anywhere example based on Splunk's _internal index on similar lines as per your question. Instead of read_category as glanced, read, skimmed, internal splunkd log has log_level as INFO WARN and ERROR:

index=_internal sourcetype=splunkd log_level IN ("INFO","ERROR","WARN")
| chart count as Total count(eval(log_level=="INFO")) as INFO count(eval(log_level=="WARN")) as WARN count(eval(log_level=="ERROR")) as ERROR by date_hour
| foreach INFO ERROR WARN [ eval <<FIELD>>_perc=round((<<FIELD>>/Total)*100,2)]
| fields - Total INFO WARN ERROR

PS: I dont think you need convert ctime as strptime should be able to convert string time to epoch as per your need.

____________________________________________
| makeresults | eval message= "Happy Splunking!!!"

skribble5 · ‎12-28-2018

Thanks niketnilay, that has worked beautifully. Really appreciate the assistance!

Thanks to this community for helping novices!

niketn · ‎12-28-2018

@skribble5 we are glad to assist you. Do up vote the answer and comments that helped. Now that you have learnt new techniques, hope you would also be assisting those in need 😉 Welcome to the community!

____________________________________________
| makeresults | eval message= "Happy Splunking!!!"

Can you help me with my histogram visualization?

Introducing the 2024 SplunkTrust!

Introducing the 2024 Splunk MVPs!

Splunk Custom Visualizations App End of Life