Hello,
I have a following query which gives the count of "zero".
index=main item_type=television
| timechart count span=1m as item_count
| fillnull
| eval item_nonexistence = if(item_count <= 0, 1, null())
| stats sum(item_nonexistence) as item_count
| search item_count > 5
It gives me the count if the condition, else no results. I have bunch of items (10) and i would like to get if count of "zero" is greater than 5.
if count of zero is greater than 5 for item1, item2, item5, item8. It should produce me the results in the following way.
item_type count
item1 item1_missing
item2 item2_missing
item5 item5_missing
item8 item8_missing
To rewrite your problem statement: You want to know, for each item_type, whether for any five or more individual minutes across the search timeline, there were no events.
index=main item_type=*
| fields _time item_type
| bin _time span=1m
| timechart span=1m count as item_count by item_type
| fillnull
Up to this point, you had it right. The fields
command we added is just explicitly telling splunk that those are the only fields needed for the remainder of the calculation, and special field _time
is there just for us to see, because the fields command doesn't remove hidden fields like _time
unless we specifically tell it to.
| untable _time item_type Item_count
| where item_count=0
| stats count as minute_count by item_type
| where minute_count > 5
Untable is a special command that will take each record, and split it up into multiple records, each of which has the first named field (in this case _time
), and the name and value of one other field. The name will go in the second parameter, and the value in the third parameter. I could have said untable _time foo bar
and the field foo
would contain the item_type and bar
would contain the count for that time.
So, for example, if your records were these two records...
_time drone radio television typewriter
(time1) 0 3 3 0
(time2) 0 0 1 5
They would become these eight records...
_time item_type Item_count
(time1) drone 0
(time1) radio 3
(time1) television 3
(time1) typewriter 0
(time2) drone 0
(time2) radio 0
(time2) television 1
(time2) typewriter 5
Passing through the where
test, these ones would pass...
_time item_type Item_count
(time1) drone 0
(time1) typewriter 0
(time2) drone 0
(time2) radio 0
Then the stats would put them together like this
item_type minute_count
drone 2
radio 1
typewriter 1
And if any of them had more than five, they would pass the next test and give you your list. You can then add any wording you want with another eval.
I believe that this is the Sentinel Search problem discussed (with solution) here:
https://conf.splunk.com/session/2015/conf2015-LookupTalk.pdf
To rewrite your problem statement: You want to know, for each item_type, whether for any five or more individual minutes across the search timeline, there were no events.
index=main item_type=*
| fields _time item_type
| bin _time span=1m
| timechart span=1m count as item_count by item_type
| fillnull
Up to this point, you had it right. The fields
command we added is just explicitly telling splunk that those are the only fields needed for the remainder of the calculation, and special field _time
is there just for us to see, because the fields command doesn't remove hidden fields like _time
unless we specifically tell it to.
| untable _time item_type Item_count
| where item_count=0
| stats count as minute_count by item_type
| where minute_count > 5
Untable is a special command that will take each record, and split it up into multiple records, each of which has the first named field (in this case _time
), and the name and value of one other field. The name will go in the second parameter, and the value in the third parameter. I could have said untable _time foo bar
and the field foo
would contain the item_type and bar
would contain the count for that time.
So, for example, if your records were these two records...
_time drone radio television typewriter
(time1) 0 3 3 0
(time2) 0 0 1 5
They would become these eight records...
_time item_type Item_count
(time1) drone 0
(time1) radio 3
(time1) television 3
(time1) typewriter 0
(time2) drone 0
(time2) radio 0
(time2) television 1
(time2) typewriter 5
Passing through the where
test, these ones would pass...
_time item_type Item_count
(time1) drone 0
(time1) typewriter 0
(time2) drone 0
(time2) radio 0
Then the stats would put them together like this
item_type minute_count
drone 2
radio 1
typewriter 1
And if any of them had more than five, they would pass the next test and give you your list. You can then add any wording you want with another eval.
@DalJeanis, Awesome!! Thanks a ton! This is exactly I was looking at.
@DalJeanis, I have a problem here, When the Item_count is zero for the selected time-range. The item_type is not being listed in the results.
Though the count of zero's exists with item_type drone its not showing up.
_time item_type Item_count
(time1) drone 0
(time1) typewriter 0
(time1) radio 0
(time2) drone 0
(time2) typewriter 0
(time3) drone 0
(time4) drone 0
The output looks like below:
item_type minute_count
radio 1
typewriter 1
I would like to add the item_type with the Item_count as 1, when the Item_count are all zero's.
Expected output for the above scenario:
item_type minute_count
radio 1
typewriter 2
drone 1
Can you please help with this?
Sure. You need to ensure there is a record for each category for the time period immediately below your time period in question, before the chart command is run, then you throw that record away after the chart command. Let's assume you have a csv file called mylist.csv with the list of item_type values you need. Then something like this should work....
index=main item_type=*
| fields _time item_type
| bin _time span=1m
| append [ | inputcsv mylist.csv
| addinfo
| eval _time = info_min_time - 60
| table _time item_type
]
| timechart span=1m count as item_count by item_type
| fillnull
| eventstats min(_time) as mintime
| where _time > mintime
| fields - mintime
| untable _time item_type Item_count
| where item_count=0
| stats count as minute_count by item_type
| where minute_count > 5
There might be a slightly less complicated way to use the csv, though. Hmm.
index=main item_type=*
| fields _time item_type
| bin _time span=1m
| timechart span=1m count as item_count by item_type
| fillnull
| untable _time item_type Item_count
| appendpipe [
| stats count as present by item_type
| inputcsv append=t mylist.csv
| stats count as dupcount by item_type
| where dupcount=1
| eval flag = "missing"
| table _time item_type
]
| where item_count=0
| stats count as minute_count max(flag) as flag by item_type
| where minute_count > 5 or isnotnull(flag)
@DalJeanis, Thanks for the reply! The query seems to be working only if there are records for each item_type but I would like the item_type to be updated with any of the values( 0 or 1) when there are no records found for the specific item_type.
For example:
Let's assume the count of zero for drone is 3, count of zero for typewriter is 2 and count of zero for radio is 0, i.e., radio has events for each minute for the selected time-range(15 Min) and television has no records for the selected time-range(15 Min) so I'm giving a dummy value as 15.
I should get the following results when the condition is set as where count >= 2
item_type count
drone 3
typewriter 2
television 15
Below it's returning this which is inappropriate.
drone 3
typewriter 2
radio 15
television 15
item_type radio should not be in the results as it does not contain the value 0. Can you help on this?
Query:
index=main item_type=*
| fields _time item_type
| bin _time span=1m
| timechart span=1m count as item_count by item_type useother=f
| fillnull
| untable _time item_type item_count
| where item_count = 0
| stats count by item_type
| append
[| inputlookup Testing_Alerts
| fields + item_type
| stats count by item_type
| eval count = 0
| fields item_type count ]
| dedup item_type
| rename count as missing_item_count
| eval missing_item_count = if(missing_item_count = 0, 15, missing_item_count)
| where missing_item_count >= 2
hmmmm, looks like you are filtering the item_type
in your search: .... item_type = television
@adonio, I have given it for an instance but actually the query does not contain any item_type. I want to sum the count of the events which are zero's and if the sum is greater than 5 then the results should look like below.
item_type event_count
item1 missing events are more than five for the selected time-range
item2 1
item3 2
item4 missing events are more than five for the selected time-range
Can you help me in getting the desired output? Please let me know if you require any additional information
Can you give an example of your input?
Because you would like to get the output divided by item_type but in the query you do not have any "by" condition.
I would like to better understand what you have in input, the output you want is clear.