Hi everyone and thanks in advance.
I'm trying to collate all our SOCKS traffic on our network over the last 90 days.
Our IPs rotate, and as a result I can't run this search over all time; I have to run it one day at a time across the 90 days, which is where I got to:
index=*proxy* SOCKS earliest=-1d latest=-0d
| eval destination=coalesce(dest, dest_port), userid=coalesce(user, username)
| rex field=url mode=sed "s/^SOCKS:\/\/|:\d+$//g"
| eval network=case(match(src_ip,"<REDACTED>"),"user",1=1,"server")
| stats
min(_time) as First_Seen
max(_time) as Last_Seen
count as Event_Count
values(domain) as Domain
values(userid) as Users
values(destination) as Destinations
by url, src_ip, network
| convert ctime(First_Seen) ctime(Last_Seen)
| sort -Event_Count
| join type=left max=0 src_ip [
search index=triangulate earliest=-1d latest=-0d
| stats count by ip, username
| rename username as userid
| rename ip as src_ip
| fields - count
]
| join type=left max=0 src_ip [
search index=windows_events EventID=4624 NOT src_ip="-" NOT user="*$" earliest=-1d latest=-0d
| stats count by IpAddress, user
| rename IpAddress as src_ip
| rename user as win_userid
| fields - count
]
| eval userid=coalesce(userid, win_userid)
| join type=left max=0 userid [
search index="active_directory" earliest=-1d latest=-0d
| stats count by username,fullname,title,division,mail
| rename username as userid
]
Then a colleague suggested I do it slightly differently: run it over the full 90 days but link the data together, which is where we got to:
index=*proxy* SOCKS
| eval destination=coalesce(dest, dest_port)
| rex field=url mode=sed "s/^SOCKS:\/\/|:\d+$//g"
| eval network=case(match(src_ip,"<Redacted>"),"user",1=1,"server")
| eval Proxy_day = strftime(_time, "%d-%m-%y")
| join type=left max=0 src_ip [
search index=windows_events EventID=4624 NOT src_ip="-" NOT user="*$"
| stats count by IpAddress, user
| rename IpAddress as src_ip
| rename user as win_userid
| fields - count
]
| eval userid=coalesce(userid, win_userid)
| join type=left max=0 userid [
search index="active_directory"
| stats count by username, fullname, title, division, mail
| rename username as userid
]
| rename src_ip as "Source IP"
| stats
values(mail) as "Email Address"
values(username) as "User ID"
values(destination) as Destination
values(network) as Network
values(Proxy_day) as Day
values(url) as URL
by "Source IP"
However, the problem I'm running into now is that the data produced can have hundreds of URLs, email addresses, and days associated with a single source IP, which makes the data unactionable and actually starts to break the .csv when exported.
Would anyone be able to help? Ideally I'd just like the top 5 results, for example, but I've had no luck with that or a few other methods I've tried. Even SplunkGPT is failing me. Is it even possible?
There are a number of ways you could tackle this, but join is generally low down the list of options as it can be resource intensive. Instead, you could bring back all the data you need in one search and then use stats on it. You could do this with something like index=network (etc) OR index=active_directory (etc) OR index=etc.etc..
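As a minimal sketch of that first option, reusing the index names, EventID filter, and field names already in this thread (anything beyond those is an assumption you would need to adapt):
earliest=-90d@d latest=now ((index=*proxy* SOCKS) OR (index=windows_events EventID=4624 NOT src_ip="-" NOT user="*$"))
```normalise the IP key and the user field across the two sources```
| eval src_ip=coalesce(src_ip, IpAddress)
| eval userid=coalesce(user, username)
```a single stats pass does the linking that join was doing```
| stats values(url) as url values(userid) as userid dc(url) as url_count by src_ip
Note that index=active_directory can't be stitched in on src_ip, as those events carry no IP; it would need a second stats keyed on userid instead.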
Or you could use append, which adds the additional data to the results before stats then brings it all together. Try the following as a starting point; it might need a little refinement as I don't have your data to test against.
index=*proxy* SOCKS earliest=-90d@d latest=now
| eval destination=coalesce(dest, dest_port), userid=coalesce(user, username), type="proxy"
| rex field=url mode=sed "s/^SOCKS:\/\/|:\d+$//g"
| eval network=case(match(src_ip,"<Redacted>"),"user",1=1,"server"), Proxy_day = strftime(_time, "%d-%m-%y")
| append
[ search index=windows_events EventID=4624 NOT src_ip="-" NOT user="*$" earliest=-90d@d latest=now
| stats count by IpAddress, user
| rename IpAddress as src_ip, user as win_userid
| eval type="windows", userid=win_userid ]
| append
[ search index="active_directory" earliest=-90d@d latest=now
| stats count by username, fullname, title, division, mail
| rename username as userid
| eval type="ad" ]
| stats
values(mail) as mail
values(fullname) as fullname
values(title) as title
values(division) as division
values(userid) as userid
values(win_userid) as win_userid
values(destination) as destination
values(network) as network
values(Proxy_day) as Proxy_day
values(url) as url
dc(url) as url_count
by src_ip, type
| where type="proxy"
| eval userid=coalesce(userid, win_userid)
| stats values(mail) as "Email Address", values(userid) as "User ID", values(destination) as Destination, values(network) as Network, dc(Proxy_day) as Day_Count, dc(url) as URL_Count by src_ip, url, Proxy_day
| stats values("Email Address") as "Email Address", values("User ID") as "User ID", values(Destination) as Destination, values(Network) as Network, values(Day_Count) as Day_Count, dc(Proxy_day) as Total_Days, values(URL_Count) as URL_Count by src_ip, url
This resulted in the same source IP appearing in multiple rows; I need it grouped by source IP.
If you just want to limit your existing search to 5 values, you can use mvindex.
Check and let me know if this is what you are trying to achieve.
E.g.:
| stats values(mail) as "Email Address" values(username) as "User ID" values(destination) as Destination values(network) as Network values(Proxy_day) as Day values(url) as URL by "Source IP"
| eval "Email Address" = mvindex('Email Address', 0, 4)
| eval "User ID" = mvindex('User ID', 0, 4)
| eval Destination = mvindex(Destination, 0, 4)
| eval Day = mvindex(Day, 0, 4)
| eval URL = mvindex(URL, 0, 4)
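Note that mvindex is zero-based and the end index is inclusive, so mvindex(field, 0, 4) keeps the first five values.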
Regards,
Prewin
Splunk Enthusiast | Always happy to help! If this answer helped you, please consider marking it as the solution or giving a Karma. Thanks!
I had already tried that; annoyingly, it looked like this:
index=*proxy* SOCKS
| eval destination=coalesce(dest, dest_port)
| rex field=url mode=sed "s/^SOCKS:\/\/|:\d+$//g"
| eval network=case(match(src_ip,"<REDACTED>"),"user",1=1,"server")
| eval Proxy_day = strftime(_time, "%d-%m-%y")
| join type=left max=0 src_ip [
search index=windows_events EventID=4624 NOT src_ip="-" NOT user="*$"
| stats count by IpAddress, user
| rename IpAddress as src_ip
| rename user as win_userid
| fields - count
]
| eval userid=coalesce(userid, win_userid)
| join type=left max=0 userid [
search index="active_directory"
| stats count by username, fullname, title, division, mail
| rename username as userid
]
| rename src_ip as "Source IP"
| sort -Proxy_day
| stats
values(mail) as "Email Address"
values(username) as "User ID"
values(destination) as Destination
values(network) as Network
values(Proxy_day) as Day
values(url) as URL
by "Source IP"
| eval
"Email Address"=mvindex('Email Address',0,4),
Day=mvindex(Day,0,4),
URL=mvindex(URL,0,4)
It reduced the source IPs to fewer than were visible in an unfiltered search, which meant it was removing results that I needed.