Splunk Search

timechart with WHERE clause not behaving as expected

New Member

I have a fairly straightforward query using timechart to count the top 10 users triggering an event. ( Sanitized )

index=foobar EventCode=1234 | timechart span=1d count(EventCode) BY user WHERE max in top10 usenull=f useother=f

This returns a chart that makes good sense. Now in order to separate noise from things II'm concerned about, I want to only see users who have triggered the event 10 times or more per day. A quick look at the manual provided this :


Chart the eventypes by sourceip
For each minute, count the eventypes by source
ip, where the count is greater than 10.

sshd failed OR failure | timechart span=1m count(eventtype) BY source_ip usenull=f WHERE count>10

(docs.splunk.com/Documentation/Splunk/7.3.0/SearchReference/Timechart)

So I adapted this to my existing query :

index=foobar EventCode=1234 | timechart span =1d count(EventCode) by user usenull=f WHERE count>10

This however did NOT generate what I expected. I expected to see the same chart as before, only with the users with less than 10 events triggered per day being left out. Instead I have a bizarre chart that I received a truncation warning for and that included users that had fewer than 10 per day.

I've been told I need to put spaces in the WHERE clause ( WHERE count >10 , or WHERE count > 10 ) but this made no change.
I've been told that the chart is accurate and that it reflects users who have hit more than 10 at any time ( not per day ) but that did not make sense to me, nor did it jibe with what the manual said.

After many tries to make timechart work, I gave up and explored other options , and discovered this solution in a community article :

index=foobar EventCode=1234 | bucket span=1d _time | stats count by _time user | where count > 10 | xyseries _time,user,count

This did return what I was looking for , but I'm not sure why one works and the other doesn't.

Any suggestions as to why?

0 Karma

Esteemed Legend

The count and user fields get turned into a matrix such that the field count is absorbed. You need to bring it back, filter on it, then, go back to where you were; try this:

index=foobar EventCode=1234
| timechart span=1d count BY user usenull=f
| untable _time user count
| eventstats max(count) AS max_count BY user
| where max_count>10
| timechart span=1d avg(count) AS count BY user usenull=f
0 Karma

New Member

I gave this a try, unfortunately it still displayed users with less than 10 events on a daily basis.

0 Karma

Esteemed Legend

It depends on what you mean; if the original answer isn't it, then maybe you mean this:

index=foobar EventCode=1234
| timechart span=1d count BY user usenull=f
| untable _time user count
| where count>10
| timechart span=1d avg(count) AS count BY user usenull=f
0 Karma

Ultra Champion

Have you tried using a simple count in your timechart command, instead of count(EventCode)? Results should be the same anyway, since you already filter for a certain EventCode in your base search so each event will have that field populated.

Edit: when I read the docs correctly, using where count > 10 like this, actually applies over the whole period, it doesn't apply the filter for each timespan. So if the sum of the event count over all days is > 10, it will include that entry, even if single days are all below 10.

0 Karma

New Member

Thanks for the reply. I tried it with COUNT instead of COUNT(EventCode) and as you indicated the results were the same.

I'm not sure what you meant by " when I read the docs correctly". Which docs are you referring to , and where do they indicate that the results are the sum is over all days and not per span?

What I had read ( under Basic Examples , in the on line docs for 7.3.0 ) was :

"For each minute, count the eventypes by source_ip, where the count is greater than 10."

This satisfies my need for the count to be by each span , and not over all time specified. The only difference is that I used "span=1d" instead of 1 minute.

Are you suggesting that there is an error in the docs, or did I miss something / am I looking in the wrong place?

0 Karma

Ultra Champion

https://docs.splunk.com/Documentation/Splunk/latest/SearchReference/Timechart#Where_clause_examples

"Example 4: Using the where clause with the count function measures the total number of events over the period. This yields results similar to using the sum function."

0 Karma

New Member

OK this makes sense to me now.

So all in all, is TIMECHART the wrong approach to what I'm trying to do?

My goal is to chart all users who have tripped a specific alarm , per day, over 30 days and then a second chart that shows the same information, but only for users who have tripped the alarm more than 10 times in a day.

0 Karma

Ultra Champion

I guess so. Although you could still do | timechart span=1d sum(count) as count by user instead of your xyseries. But you'll have to do some steps before to filter against that threshold indeed.

And that also makes sense if you look at what the output of timechart is. It gets you a _time column and then 1 column for each user (in contrast to a stats count by user, which gets you a count and a user column). What you want to then do is remove individual cells from that result. I guess in theory you could do that by following the timechart with a foreach construct to set any cell below the threshold to 0. But doing a stats first, filter that and then do the charting is probably easier.

0 Karma