Active Directory – Failed Login Events - SPL – Which is most efficient and why?

018Porta4 · ‎08-09-2018

Community,

New to Splunk, first post, your patience is appreciated. Also, thank you in advance.

This post is focused in the direction of efficiency, effectiveness, accuracy, and understanding rather than “How to.”

I have three queries, each created by a different entity and I am seeking to understand the difference in Results and to understand the “What’s happening under the hood” of the queries themselves.

Splunk provided consultant’s Report:

Comment: 4 hour window, Run Time: 51 minutes, does not appear to complete fully. It is as if the search continues looking for the newest events and adds those to the results.

In-house created Report:

Comment: 24 hour window, Run Time: less than 30 seconds. However this yields thousands of Event Code: 4625 events per identified user, yet the results do not match the number of user account lockouts.

Research shows that 90%+ of these events are to a server rather than Active Directory. Hence, no account lockouts I’d wager.

Dashboard Panel from Splunk App for Windows Infrastructure:

Comment: 24 hour window, Run Time: 3:39 seconds and yielded roughly 200 events, which appear to match the user account lockout numbers. The Windows Infrastructure version uses the “eventtype.” Where did this come from?

It is clear that there is different methodology, and accuracy, in each of these queries. What I do not understand is what exactly the Windows Infrastructure version is doing.

There is a lot of documentation, examples, webinars, and comments, even within this forum, that recommend using the index and sourcetype to narrow the search criteria.

adonio · ‎08-09-2018

use the job inspector to see the full search, iirc the app for WIN INF has plenty of macros, eventtypes and other knowledge objects
other than that, the lockout event (4740) is different from the failed login event (4625)
also, starting a search with the table command is wrong most of the time: you are piping all the data through without filtering or aggregating it. it is also best practice to filter as far back to the left as possible, and as a function of that to declare index = <index> sourcetype = <sourcetype>
try this instead:

   index = <index> sourcetype = <sourcetype> action=failure user!="*$"
    | stats count as fail_count by user
    | where fail_count > 50
    | sort - fail_count

hope it helps a little

018Porta4 · ‎08-09-2018

Adonio.

As a point of clarification, I am following your logic regarding filtering as far back as possible; I have read that elsewhere as well. I also understand the difference between those two event codes.

What I am seeking to understand is the WIN INF App: how they chose to craft their query versus what the community seems to recommend, which is, as you point out, index = <index> sourcetype = <sourcetype>.

The results on my end, when using the index = <index> sourcetype = <sourcetype> method, show thousands of failed events for one or more random users failing at one or more random servers, but not at Active Directory specifically. I could open a ticket for failed login events nearly all day, most days of the week. Are these false positives, or app-specific failures that do not trigger the AD lockout?

The WIN INF App appears to be more accurate and a more efficient use of resources. My concern: if I stop using the recommended index = <index> sourcetype = <sourcetype> method and switch exclusively to the WIN INF App method, will I miss any legitimate failure events?

Dashboard Panel from Splunk App for Windows Infrastructure:

It is just confusing that so many people seem to have a slightly different flavor of the same search string, while the WIN INF App is in a league of its own. It's as if you have 50 Ford Mustangs, each slightly different from the next, and then someone brings in a semi-truck. Who's leading whom?

Regarding the account lockouts, this was a point of comparison between using the index = <index> sourcetype = <sourcetype> method and the WIN INF method for failed login events.

The index = <index> sourcetype = <sourcetype> method yields, at times, thousands of failed login events for a single user, yet there are no account lockouts, as one would expect if x Failed Login Events = Account Lockout. The results just don't match up.

With the WIN INF method, x Failed Login Events = Account Lockout as one would expect per company policy.

Thank you for your feedback.

adonio · ‎08-09-2018

i think there are 2 questions here:
1. why i get false positives with EventCode=4625 vs the real lockouts as they are seen in the WIN INF app
2. search efficiency

for (1) these are different sets of data, which can also explain why searches against EventCode 4625 take longer. the lockout event (4740) will arrive from the Active Directory server, while the 4625 events will arrive from many windows machines.
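a rough way to check this on your own data is to count both event codes per reporting host. this is only a sketch: it assumes the EventCode and user fields are extracted (as they usually are with the Splunk Add-on for Microsoft Windows), and <index> and <sourcetype> are placeholders for your own values.

```
index=<index> sourcetype=<sourcetype> (EventCode=4625 OR EventCode=4740)
| stats count as events, dc(user) as distinct_users by EventCode, host
| sort - events
```

if the explanation above holds, the 4740 rows should be limited to a small set of hosts, while the 4625 rows should be spread over many member servers and workstations.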
i will answer (2) later as i am short on time at the moment, but, please take the time to inspect the search using the job inspector and see the FULL SEARCH STRING as it appears. in apps as large as WIN INF, there are many macros and eventtypes (and other knowledge objects) to help simplify search writing and help developers.
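besides the job inspector, one way to see where an eventtype comes from is to list the eventtype definitions over REST. a minimal sketch, assuming your role is allowed to run the rest command; <keyword> is a placeholder for part of the eventtype name shown in the dashboard panel's search:

```
| rest /servicesNS/-/-/saved/eventtypes splunk_server=local
| search title="*<keyword>*"
| table title, search, eai:acl.app
```

the search column shows the underlying search string that the eventtype stands for, and eai:acl.app shows which app defines it. the same definitions should also be visible in the UI under Settings > Event types.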

