Greetings All - I have a query that gives me the data I need. However when I tried to add a timechart function to bring it back in 10min blocks i get nothing? I am new to Slunk queries and in the learning process so your expertise is greatly appreciated.
Query 1 (works as desired but only brings back results for the time specified using the "date time range" drop down in the upper right corner)
index=okta host="domain.com"
| top action.objectType
| search action.objectType="core.user_auth.login_failed"
Query 2 (timechart added and no results ??)
index=okta host="domain.com" earliest=-1d latest=now()
| top action.objectType
| search action.objectType="core.user_auth.login_failed" | timechart span=10m count
your expertise and help is greatly appreciated 🙂
Thank you soo much for all of your suggestions. I have been working with all of them and unfortunately haven't achieved the end result just yet. So.. here's what I can tell you.... If I run this query for the last 60 minutes (using the date/time/change dropdown)...
index=okta host="domain.com"
| top action.objectType
| search action.objectType="core.user_auth.login_failed"
I get a failure percentage of 3.08 percent.
If I run the query below: (changing "earliest" to 60min) i get all 0 percentage results?
index=okta host="domain.com" earliest=-60m latest=now() action.objectType=*
| timechart span=10m count(eval(action.objectType=="core.user_auth.login_failed")) as failed_login count as all_logins
| eval failed_login_percent=round((failed_login/all_logins)*100,1)
| fields - failed_login all_logins
NOTE - Query Below seems to capture failed logins correctly:
index=okta host="domain.com" earliest=-60m latest=now() action.objectType="core.user_auth.login_failed"
| timechart span=10m count BY action.objectType limit=30 useother=0
Returns Output:
_time | core.user_auth.login_failed (column headers)
2018-03-14 09:20:00 | 12
2018-03-14 09:30:00 | 52
2018-03-14 09:40:00 | 31
2018-03-14 09:50:00 | 48
2018-03-14 10:00:00 | 40
2018-03-14 10:10:00 | 57
2018-03-14 10:20:00 | 4
Hello and first of all thank you very much for your suggestions. I didn't have time to work on this yesterday but am back it today. here's where things are at. I'm not exactly sure what's getting missed but something is...
My Base Query: this query actually does show the accurate failed logins. However, I have to manually set the time/date range.
Base Query: (ran with time range manually set to last 60 min)
ndex=okta host="domain.com"
| top action.objectType
| search action.objectType="core.user_auth.login_failed"
OUTPUT:
action.objectType | count | percent (column headers)
core.user_auth.login_failed | 276 | 2.362205 (data)
Suggested Query just to see logins:
index=okta host="domain.com" earliest=-60m latest=now() action.objectType=*
| timechart span=10m count(eval(action.objectType=="core.user_auth.login_failed")) as failed_login count as all_logins
*Output: (note that failed _login is always zero *
_time | failed_login | all_logins (column headers)
2018-03-14 09:00:00 | 0 | 93
2018-03-14 09:10:00 | 0 | 1941
2018-03-14 09:20:00 | 0 | 1891
2018-03-14 09:30:00 | 0 | 1757
2018-03-14 09:40:00 | 0 | 2089
2018-03-14 09:50:00 | 0 | 2219
2018-03-14 10:00:00 | 0 | 1385
I'm not exactly sure where to go from here but would love to get this solved 🙂
Thank you soo much for all of your suggestions. I have been working with all of them and unfortunately haven't achieved the end result just yet. So.. here's what I can tell you.... If I run this query for the last 60 minutes (using the date/time/change dropdown)...
index=okta host="domain.com"
| top action.objectType
| search action.objectType="core.user_auth.login_failed"
I get a failure percentage of 3.08 percent.
If I run the query below: (changing "earliest" to 60min) i get all 0 percentage results?
index=okta host="domain.com" earliest=-60m latest=now() action.objectType=*
| timechart span=10m count(eval(action.objectType=="core.user_auth.login_failed")) as failed_login count as all_logins
| eval failed_login_percent=round((failed_login/all_logins)*100,1)
| fields - failed_login all_logins
@ZigZaggin, the first query gives overall percentage of failed logins
for 60 min
as compared with other objectTypes
. The second query breaks errors into 6 parts i.e. additional results/6
so it would come down to 0.5
or less.
Can you try to increase the precision or round()
function from 1 to 4
?
| eval failed_login_percent=round((failed_login/all_logins)*100,4)
Also can you just run the timechart command and see the count of Failed Logins every 10 min as compared to all logins?
index=okta host="domain.com" earliest=-60m latest=now() action.objectType=*
| timechart span=10m count(eval(action.objectType=="core.user_auth.login_failed")) as failed_login count as all_logins
Similarly for my query test out the following and perform a manual failed percent per row using the Total column added in the end.
index=okta host="domain.com" action.objectType=*
| timechart span=10m count by action.objectType limit=0 useother=f usenull=f
| addcoltotals col=f row=t
That did it ! Thank you so much Niketnilay ! I've accepted your answer.
Update: I adjusted the last query as follows and it appears to be grabbing what is needed:
Query:
index=okta host="domain.com" action.objectType=login
| timechart span=10m count by action.objectType limit=0 useother=f usenull=f
| addcoltotals col=f row=t
Output Returned:
_time | app.ldap.login.locked_account | app.ldap.login.success | app.ldap.login.unknown_failure | core.user_auth.login_failed core.user_auth.login_success | Total
2018-03-14 09:30:00 | 0 | 180 | 1 | 37 | 192 | 410
2018-03-14 09:40:00 | 2 | 332 | 0 | 31 | 347 | 712
2018-03-14 09:50:00 | 2 | 337 | 0 | 48 | 358 | 745
2018-03-14 10:00:00 | 0 | 294 | 0 | 40 | 317 | 651
2018-03-14 10:10:00 | 0 | 289 | 0 | 57 | 316 | 662
2018-03-14 10:20:00 | 2 | 263 | 1 | 39 | 289 | 594
2018-03-14 10:30:00 | 0 | 52 | 0 | 3 | 52 | 107
@ZigZaggin, if the above worked and you have fields core.user_auth.login_failed
and Total you can calculate the percent however, since the field has a dot .
you would need to escape the field name in eval
expression using single quotes '
i.e.
| eval failure_percent= round(('core.user_auth.login_failed'/Total)*100,1)
I will be updating my answer, however the same eval should work for the other solution approach as well. Please try out and confirm!
I mean this one did it! Piping to the eval using the single quotes was the key. Thanks again Niketnilay !
[Updated Answer] To handle special character in the field name to be escaped for eval expression
Following can be used as well, since only failed percent is required
index=okta host="domain.com" action.objectType=*
| timechart span=10m count by action.objectType limit=0 useother=f usenull=f
| addcoltotals col=f row=t
| eval failed_percent=round(('core.user_auth.login_failed_perc'/Total)*100,1)]
@ZigZaggin try the following:
index=okta host="domain.com" action.objectType=*
| timechart span=10m count by action.objectType limit=0 useother=f usenull=f
| addcoltotals col=f row=t
| foreach * [eval "<<FIELD>>_perc"=round(('<<FIELD>>'/Total)*100,1)]
| fields - Total*
| table time *_perc
PS:
1) The addition of action.objectType=*
in the base search ensures that only fields with the field action.objectType
are used for stats.
2) limit=0
in the timechart
command should be used to account for all the action.objectType
.
3) The addcoltotals command with row=t
and col=f
give the total of each row i.e. 10m
time span.
4) The foreach command applies a template eval for each field
5) If you just want to have a look at the failed login percent, the final | table
should have core.user_auth.login_failed_perc
instead of *_perc
:
| table _time core.user_auth.login_failed_perc
If you want what the login failure percentage was in 10min increments
index=okta host="domain.com" earliest=-1d latest=now()
| timechart span=10m sum(eval(if(action.objectType=="core.user_auth.login_failed",1,0))) as failure_logins, count as total
| eval percentage=round(100*failure_logins/total,2)
Let me know if it solves your issue
This appears to be really close! So Thank you TIAGOFBMM 🙂 However, when I run the first query (posted in my intial question) with a top command on action.objectType and then do the search against the collection of items returned via the top command, it tells me the percentage of failed logins vs all other action.objectType events. The latest query seems to only take into account failed logins so the percentage is always zero ?
Hmm it should not be zero. Maybe it is zero because the percentage is too small and I rounded it to 2 decimal cases only.
But I did not understand if you just want the percentage of failed logins the percentage splitted for all the values of action.objectType?
@ZigZaggin, @tiagofbmm, it should look like the following;
index=okta host="domain.com" earliest=-1d@d latest=now() action.objectType=*
| timechart span=10m count(eval(action.objectType=="core.user_auth.login_failed")) as failed_login count as all_logins
| eval failed_login_percent=round((failed_login/all_logins)*100,1)
| fields - failed_login all_logins
Hi ZigZaggin,
timechart command needs the presence of _time field, when you use top command you have as result only action.objectType, count and perc, not _time, so you cannot use timechart after top command.
In other words you have to build your search in different way, e.g. using something like this:
index=okta host="domain.com" earliest=-1d latest=now() action.objectType="core.user_auth.login_failed"
| timechart span=10m count BY action.objectType limit=10 useother=0
Bye.
Giuseppe
In the first query I posted I get a field back with the percentage. My overall goal is to implement that query so I can see what the login failure percentage was in 10min increments. I need to do this to establish a baseline / threshold at which to set an alert to fire.
@ZigZaggin you can always use Image Sharing site to upload your image like imgur
, and then include a link to image using image <img>
icon or Ctrl+G to paste the URL to display image.
Ok, thank you for your responses regarding inclusion of the time field. I'll see what I can come up with. The system won't let me post any example screenshots yet (karma points?).
Hi
If you check just the output of this part:
index=okta host="domain.com"
| top action.objectType
You'll see there is no _time field anymore, so timechart could not function without his base _time field.