I've looked around but haven't found the exact same issue I am having. I need to figure out how to fix the following:
Feb 10 07:29:35 authpriv info devbox.domain.com sshd[16296]: pam_unix(sshd:session): session opened for user DOMAIN+jsmith by (uid=0)
host = splunk.domain.com
punct = __::___.._[]:__(:):_____+__(=)
source = /var/log/archive/incoming/2016/02/10/devbox.domain.com/sshd.log
sourcetype = %authlog%
Normally it would just be user jsmith, but since I joined the box to the Windows domain, the domain is now prepended to the user name. In all of the results the user shows up as just DOMAIN. Is there a way, with regex or something else, to get it to show up as DOMAIN+jsmith or just jsmith?
Is the field user already extracted? If yes, update its regex based on the sample below; if not, create a field extraction for it.
your base search | rex field=_raw "for user\s+(?<User>\S+)"
Update
It looks like there can be spaces between DOMAIN and the user name, so try this:
your base search | rex field=_raw "for user\s+(?<User>.+)\sby"
I guess we'd need more sample logs to finalize the regex here. See if you're able to post another comment.
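If it helps to see how the two patterns above differ, here is a minimal sketch that runs them outside Splunk with Python's re module. Two assumptions to flag: Python spells named groups as (?P<name>...) where rex uses (?<name>...), and the "DOMAIN jsmith" variant with a space is a made-up line for illustration, not one taken from the logs.

```python
import re

# Sample event from the question.
line = ("Feb 10 07:29:35 authpriv info devbox.domain.com sshd[16296]: "
        "pam_unix(sshd:session): session opened for user DOMAIN+jsmith by (uid=0)")

# Python writes named groups as (?P<name>...); Splunk's rex uses (?<name>...).
non_space = re.compile(r"for user\s+(?P<User>\S+)")   # stops at the first whitespace
up_to_by = re.compile(r"for user\s+(?P<User>.+)\sby")  # runs up to the " by" that follows

def extract(pattern, text):
    """Return the captured User value, or None when the pattern does not match."""
    m = pattern.search(text)
    return m.group("User") if m else None

# Hypothetical variant with a space between domain and user (not from the logs).
spaced = line.replace("DOMAIN+jsmith", "DOMAIN jsmith")

for label, text in (("plus", line), ("space", spaced)):
    print(label, repr(extract(non_space, text)), repr(extract(up_to_by, text)))
```

On the sample line both patterns capture DOMAIN+jsmith. On the spaced variant, the \S+ version stops at DOMAIN while the ".+ ... by" version keeps the whole "DOMAIN jsmith".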
2/10/16
7:29:35.000 AM
Feb 10 07:29:35 authpriv info devbox.domain.com sshd[16296]: pam_unix(sshd:session): session opened for user jsmith by (uid=0)
host = splunk.domain.com
punct = __::___.._[]:__(:):_____+__(=)
source = /var/log/archive/incoming/2016/02/10/devbox.domain.com/sshd.log
sourcetype = %authlog%
2/10/16
7:29:35.000 AM
Feb 10 07:29:35 authpriv info devbox.domain.com sshd[16294]: pam_unix(sshd:session): session opened for user jsmith by (uid=0)
host = splunk.domain.com
punct = __::___.._[]:__(:):_____+__(=)
source = /var/log/archive/incoming/2016/02/10/devbox.domain.com/sshd.log
sourcetype = %authlog%
1/31/16
3:39:54.000 AM
Jan 31 03:39:54 authpriv info devbox.domain.com sshd[12699]: pam_unix(sshd:session): session opened for user kwhite by (uid=0)
host = splunk.domain.com
punct = __::___.._[]:__(:):_____+__(=)
source = /var/log/archive/incoming/2016/01/31/devbox.domain.com/sshd.log
sourcetype = %authlog%
1/31/16
3:39:54.000 AM
Jan 31 03:39:54 authpriv info devbox.domain.com sshd[12697]: pam_unix(sshd:session): session opened for user kwhite by (uid=0)
host = splunk.domain.com
punct = __::___.._[]:__(:):_____+__(=)
source = /var/log/archive/incoming/2016/01/31/devbox.domain.com/sshd.log
sourcetype = %authlog%
1/31/16
3:39:54.000 AM
Jan 31 03:39:54 authpriv info devbox.domain.com sshd[12693]: pam_unix(sshd:session): session opened for user kwhite by (uid=0)
host = splunk.domain.com
punct = __::___.._[]:__(:):_____+__(=)
source = /var/log/archive/incoming/2016/01/31/devbox.domain.com/sshd.log
sourcetype = %authlog%
1/31/16
3:39:54.000 AM
Jan 31 03:39:54 authpriv info devbox.domain.com sshd[12694]: pam_unix(sshd:session): session opened for user kwhite by (uid=0)
host = splunk.domain.com
punct = __::___.._[]:__(:):_____+__(=)
source = /var/log/archive/incoming/2016/01/31/devbox.domain.com/sshd.log
sourcetype = %authlog%
1/31/16
3:39:19.000 AM
Jan 31 03:39:19 authpriv info devbox.domain.com sshd[10643]: pam_unix(sshd:session): session opened for user kwhite by (uid=0)
host = splunk.domain.com
punct = __::___.._[]:__(:):_____+__(=)
source = /var/log/archive/incoming/2016/01/31/devbox.domain.com/sshd.log
sourcetype = %authlog%
1/31/16
3:39:17.000 AM
Jan 31 03:39:17 authpriv info devbox.domain.com sshd[10321]: pam_unix(sshd:session): session opened for user kwhite by (uid=0)
But when I add the stats count by user, the results are:
DOMAIN 532
So it's like it's doing the count BEFORE the regex.
It looks like some other field extraction is set up and is giving the wrong results. Try something like this:
sourcetype=%authlog% "session opened" NOT user=root (date_hour >= 19 OR date_hour <= 7) | table _raw | rex field=_raw "for user\s+(?<user>.+)\sby" | stats count by user
That works. It finally shows the full domain+user. Do you mind breaking down for me what it's doing? I'm assuming the table _raw is taking the raw data. Does the search do something different without it that would cause it not to take the raw data?
Apart from the regular filters in the base search, the "| table _raw" is just keeping the raw data and removing all other fields. This way, when you do your field extraction for user, only your custom extraction will be applicable.
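As a rough illustration of that idea, here is a toy model in Python. It is not how Splunk works internally. The event dict, the pre-existing user value of "DOMAIN", and the helper names table_raw and rex_user are all assumptions made up for this sketch.

```python
import re
from collections import Counter

RAW = ("Feb 10 07:29:35 authpriv info devbox.domain.com sshd[16296]: "
       "pam_unix(sshd:session): session opened for user DOMAIN+jsmith by (uid=0)")

# Assumption: some existing extraction has already set user to just "DOMAIN".
events = [{"_raw": RAW, "host": "splunk.domain.com", "user": "DOMAIN"}]

# Python writes the named group as (?P<user>...); rex uses (?<user>...).
USER_RE = re.compile(r"for user\s+(?P<user>.+)\sby")

def table_raw(evts):
    """Toy stand-in for `| table _raw`: keep only the _raw field."""
    return [{"_raw": e["_raw"]} for e in evts]

def rex_user(evts):
    """Toy stand-in for the rex step: set user from _raw when the pattern matches."""
    out = []
    for e in evts:
        fields = dict(e)
        m = USER_RE.search(e["_raw"])
        if m:
            fields["user"] = m.group("user")
        out.append(fields)
    return out

stripped = table_raw(events)    # the earlier user value is gone here
extracted = rex_user(stripped)  # user now comes only from the custom regex
counts = Counter(e["user"] for e in extracted if "user" in e)
print(counts)  # Counter({'DOMAIN+jsmith': 1})
```

After the table_raw step the only field left is _raw, so the user value that gets counted can only have come from the custom pattern.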
I just tried:
sourcetype=%authlog% "session opened" NOT user=root (date_hour >= 19 OR date_hour <= 7) | rex field=_raw "for user\s+(?<User>\S+)" | stats count by user | sort - count
and I get just
DOMAIN
I'd be happy with either DOMAIN+jsmith or just jsmith, but after adding the regex the results didn't change. Did I add your bit to it correctly?
Try the updated answer
Couldn't get it to work but it made me think about another idea.
Made some progress:
sourcetype=%authlog% "session opened" NOT user=root (date_hour >= 19 OR date_hour <= 7) | rex mode=sed "s/DOMAIN+(.*)/\1/"
It works, but not fully.
Feb 10 07:29:35 authpriv info devbox.domain.com sshd[16296]: pam_unix(sshd:session): session opened for user jsmith by (uid=0)
If I tweak it just a bit:
sourcetype=%authlog% "session opened" NOT user=root (date_hour >= 19 OR date_hour <= 7) | rex mode=sed "s/DOMAIN+(.*)/\1/" | stats count by user
I still get DOMAIN and not jsmith. If I click DOMAIN and drill down into the events, the domain has been removed from them.
And I guess this will be my last post for the day since I can only post twice in a day! That's horrible.
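One detail about the sed expression above that may be worth checking: in a regex, an unescaped + is a quantifier, so "DOMAIN+" means "DOMAI followed by one or more N" and does not consume a literal plus sign. The sketch below reproduces the substitution with Python's re.sub on the sample line, assuming the separator really is a literal "+". As far as I know, rex mode=sed with no field= rewrites _raw and does not set a user field. That would be consistent with the drill-down showing rewritten events while the count still groups by DOMAIN.

```python
import re

raw = ("Feb 10 07:29:35 authpriv info devbox.domain.com sshd[16296]: "
       "pam_unix(sshd:session): session opened for user DOMAIN+jsmith by (uid=0)")

# Unescaped: "N+" is "one or more N", so the literal plus sign is left over
# and ends up at the start of the captured group.
unescaped = re.sub(r"DOMAIN+(.*)", r"\1", raw)

# Escaped: "\+" matches the literal plus sign, so only the user name and the
# rest of the line are kept.
escaped = re.sub(r"DOMAIN\+(.*)", r"\1", raw)

print(unescaped)
print(escaped)
```

With the unescaped pattern the rewritten line ends in "for user +jsmith by (uid=0)"; with the escaped one it ends in "for user jsmith by (uid=0)". Either way only the event text changes, so a count by user would still need an extraction with a named group, like the rex with (?<user>...) earlier in the thread.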