I'm looking to timechart errors (I'm using the count of the field 'level' for errors) by host. Since some of my hosts have vastly higher levels of traffic than others, it's more useful to me to see this as a percentage.
I tried the query below, based on another similar question, but it didn't work: https://answers.splunk.com/answers/103041/percentage-timechart.html
index=foo <my search here> | timechart count(where level!="info") as lvl, count as ttl by host| eval pct=lvl/ttl
I'm able to get what I want for a single host with the following query:
index=foo host=x | bucket _time bins=100 | eventstats count as total by _time | stats count first(total) as total by _time, level | eval percent=(count/total)*100 | search NOT level="Info" | timechart first(percent) by level
Timechart should give results like:
_time x y z
1:00 2 0 1 // host x has 2% errors, z 1% errors
1:05 1 1 5 // host x and y have 1% errors, z has 5% errors
etc.
I'm using Splunk Enterprise 6.5.
Thanks much!
| makeresults
| eval host="HostA HostA HostA HostA HostA HostB HostB HostB HostC" | makemv host | mvexpand host
| eval timefan=mvrange(1,1000,10) | mvexpand timefan
| eval _time = now() + timefan
| eval fan=mvrange(1,5+random()%75) | mvexpand fan
| eval rand= (random()%127+random()%153)%27 | eval level=if(rand<25,"Info","Error")
| rename COMMENT as "The above just generates random data for three hosts across roughly 1000 seconds."
| rename COMMENT as "Unit allows us to count what we want by summing it, and create zero records by zeroing it"
| eval unit=1
| rename COMMENT as "Now we add zero-unit Error records for every host and _time unit"
| bin _time bins=100
| appendpipe [|stats values(host) as host values(_time) as times | mvexpand host | mvexpand times | rename times as _time | eval unit=0, level="Error"]
| rename COMMENT as "This part follows your basic logic, but accounts for _time-host combinations that have no errors or no records at all."
| eventstats sum(unit) as total by _time host
| stats sum(unit) as mycount first(total) as total by _time host level
| eval percent=round((mycount/if(total=0,1,total))*100,0)
| rename COMMENT as "Finally, we get rid of records and fields we don't need, and present the rest..."
| search NOT level="Info"
| table _time host percent
| timechart first(percent) by host
This worked for me. Essentially I put everything after line 9 (1-9 in the above just generated some random data for you to work with) after my search. Thanks!
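For anyone who doesn't need explicit zero records for empty buckets, a more compact variant of the same idea might look like the sketch below. It is untested against the original data and makes some assumptions: the 5-minute span is arbitrary, the field names errors and total are just illustrative, and anything whose level is not exactly "Info" is counted as an error. Buckets where a host has no events at all will come out empty rather than 0.

```spl
index=foo <my search here>
| bin _time span=5m
| rename COMMENT as "Count non-Info events and all events per time bucket and host"
| stats count(eval(level!="Info")) as errors, count as total by _time, host
| eval percent=round(errors/total*100, 1)
| rename COMMENT as "One row per bucket and host, so first() just passes the value through"
| timechart span=5m first(percent) by host
```

If the gaps matter, appending | fillnull value=0 after the timechart should fill them in, though that treats "no traffic" the same as "no errors".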