I have a set of static values (user groups) that I need to loop through, producing results for each group. We then use those results to compute an average for the users within each group and alert on any user who is above that average by a small percentage. I may be making this more difficult than necessary. While I could hard-code the group info, I would prefer not to have 20 searches that are identical except for the group identifier. My averages are also not returning the desired results; it seems the average is not carried through all of the data.
Group, grp1, grp2, grp3, grp4, etc...
User   #access  total  group  groupavg
Usera     20      45            10
Userb     15
Userc      5
Userd      5
In this case, using the variable groupavg, it would only report that Usera was above the average, not Userb. If I hardcode the value 10, it shows both.
index="foo" [|inputlookup input.csv | rename Group as query | fields query] Classification=query
| dedup fooKey
| eval Date=strftime(_time, "%D")
| stats count by User
| appendcols [search index="foo" Classification=query
    | dedup fooKey
    | eval Date=strftime(_time, "%D")
    | stats count as total dc(user) as totuser
    | eval groupavg=round(total/totuser,0)]
| where count > (groupavg*1.5)
| fields - total, totuser
So, to see if I was way off base, I tried a simple search using a lookup file containing:
Log, INFO, WARNING, WARN, ERROR, error,
index="_internal" [|inputlookup testfile.csv | rename Log AS query | fields query] Log_Level=query |stats count by query
This returned nothing from the stats, but it did return events:
12/7/15 2:19:41.176 PM 12-07-2015 14:19:41.176 -0700 INFO StreamedSearch - Streamed search connection terminated: search_id=remote_indexmaster.localdomain_1449523206.47, server=indexmaster.localdomain, active_searches=0, elapsedTime=0.273, search='litsearch index="_internal" ( ( INFO ) OR ( WARNING) OR ( WARN ) OR ( ERROR ) OR ( error ) ) Log_Level=query | addinfo type=count label=prereport_events | fields keepcolorder=t "*" "_bkt" "_cd" "_si" "host" "index" "linecount" "prestats_reserved_*" "psrsvd_*" "query" "source" "sourcetype" "splunk_server" | remotetl nb=300 et=1449508800.000000 lt=1449523206.000000 max_count=1000 max_prefetch=100 | prestats count by query', savedsearch_name=""
So it is returning the fields from the lookup, but in Boolean format:
( INFO ) OR ( WARNING ) OR ( WARN ) OR ( ERROR ) OR ( error )
I need to extract each field and run a search against those results, as described above. So what am I doing wrong? Should I not use an inputlookup to loop through the results?
Ideally, we would like to see that Usera, who is part of grpd, is 1.5 times higher than the rest of the users in that group.
Thank you in advance for any assistance you can provide.
query is a special field name: when a subsearch returns a field named query, Splunk inserts its values into the outer search as bare search terms rather than as field=value pairs.
|inputlookup testfile.csv | rename Log AS query | fields query
⇒ ( INFO ) OR ( WARNING ) OR ( WARN ) OR ( ERROR ) OR ( error )

|inputlookup testfile.csv | rename Log AS LogLevel | fields LogLevel
⇒ ( LogLevel=INFO ) OR ( LogLevel=WARNING ) OR ( LogLevel=WARN ) OR ( LogLevel=ERROR ) OR ( LogLevel=error )
Is this okay?
index="_internal" [|inputlookup testfile.csv | rename Log AS query | fields query] LogLevel=query | stats count by query
⇒ index="_internal" [|inputlookup testfile.csv | rename Log AS LogLevel | fields LogLevel] | stats count by LogLevel
Like HiroshiSatoh posted, though, the key with subsearches is to have the output match fields that will be present in the data from your parent search. One thing you can do (if you don't already know it) is add | format to the end of a search that will become a subsearch. It will show you what will be passed to the parent.
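For instance, running the subsearch on its own with | format appended (a minimal sketch, reusing the testfile.csv lookup from the question) shows the exact string the parent search will receive:

```spl
| inputlookup testfile.csv
| rename Log AS query
| fields query
| format
```

The result is a single field named search containing something like ( ( INFO ) OR ( WARNING ) OR ( WARN ) OR ( ERROR ) OR ( error ) ), which matches the expansion visible in the _internal event above.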
Additionally, I think you will want to be introduced to the eventstats command.
index="foo" [|inputlookup input.csv | rename Group as Classification | fields Classification]
| dedup fooKey
| stats count by User
| eventstats sum(count) as total dc(User) as totuser
| eval groupavg=round(total/totuser,0)
I've left off your eval Date because it comes before the stats command and isn't referenced in it, so it would be dropped anyway. I've also not included the where and fields commands, as you can always add those back in once you've seen the outcome. The big challenge (to me) is that there isn't enough information in what you've provided for us to know whether this query will return the results you are looking for. For example, comparing your hardcoded results to your query, there isn't a field for group, let alone what those numbers mean. What is fooKey, and does it need to be de-duplicated? Perhaps I've not had enough caffeine 😃
OK, easy answer first: the dedup field is a key value that the customer uses, not important for this issue 😉
Combining HiroshiSatoh's answer and yours, I have this (working off of _internal for now):
index="_internal" [|inputlookup testfile.csv | rename Log AS log_level | fields log_level]
| stats count by log_level
| eventstats sum(count) as totlog dc(log_level) as totgrp
This returns the following:
log_level   count  grpavg  totgrp  totlog
ERROR          54   10758       3   32275
INFO        31841   10758       3   32275
WARN          380   10758       3   32275
OK, so this returns information we can work with. When adding in the where clause:
index="_internal" [|inputlookup testfile.csv | rename Log AS log_level | fields log_level]
| stats count by log_level
| eventstats sum(count) as totlog dc(log_level) as totgrp
| eval grpavg=round(totlog/totgrp,0)
| where count > (grpavg*1.5)
INFO is the only result returned, yeah!!! Now to see if this works with the customer data. I'll let you know and accept if so.
Got word back from the customer: while it is returning results, it is combining all of the classifications into one group and computing the average over the total for every classification.
The goal is to read the file for classification "A" gather all information from that classification and if a user in that classification is above the average for the group then we want to know about that user in that group, then move to the next classification and do it all over again.
Here is the actual search we ran:
index="usrindex" [|inputlookup classfile.csv | rename class AS Classification | fields Classification]
| eval Date=strftime(_time, "%D")
| stats count by UserID
| eventstats sum(count) as total dc(UserID) as totuser
| eval groupavg=round(total/totuser,0)
| where count > (groupavg*1.5)
So how would I ensure I am only working within each classification, not across all classifications, for my averages?
Add by Classification to your eventstats. Eventstats works the same as stats in terms of which functions you can use. The main difference is that after a stats command, the only fields still available are the ones created by its functions or included in the 'by' clause, whereas eventstats adds its results to the existing events. For example, your Date field isn't available after the stats command.
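Putting that together, here is a sketch of the full per-classification search (assuming the index, lookup, and field names from your post; adjust to taste). Note that Classification also has to be added to the stats 'by' clause, otherwise it is dropped before eventstats can group on it:

```spl
index="usrindex" [| inputlookup classfile.csv | rename class AS Classification | fields Classification]
| stats count by Classification UserID
| eventstats sum(count) as total dc(UserID) as totuser by Classification
| eval groupavg=round(total/totuser,0)
| where count > (groupavg*1.5)
| fields - total totuser
```

With the by Classification clause, sum(count) and dc(UserID) are computed separately within each classification, so each user is compared only against the average for their own group.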