Splunk Search

Challenges Joining tables and returning values from both indexes

lbrhyne
Path Finder

Hello,

Based on some suggested changes by @jawaharas I was able to successfully lookup the value of user from the Vantage sourcetype and match it against the winevetlog and return only the matched data. However, I could only populate the fields from the Vantage index and not the wineventlog. I'm thinking it is due to not having a join in the below code:

index=identities sourcetype="Vantage" (NOT ee_status="R" NOT ee_status="T") earliest=-1d@d latest=-0d@d
| eval user = trim(replace(email, "@domain.com", ""))
| search [ | search index="wineventlog" sourcetype="WinEventLog:Security" ((EventCode=4624 AND Logon_Type=2 OR Logon_Type=10) OR (EventCode=4625 NOT Sub_Status=0xC0000064)) earliest=-5m latest=now()| fields user | format]
| table user,department,ad_fname,ad_last_name,ee_status, Workstation_Name
| eval fullname = toString(ad_fname) + " " + toString(ad_last_name)
| rename user as "Login ID", ee_status as "HR Employment Status" fullname as "Full Name", department as Department, Workstation_Name as "Device"
| table "Login ID" "Full Name" "HR Employment Status" Department, Device  | sort "Full Name"

In my attempt to add a join in, the below query runs with no errors but returns no data.
What i'm attempting to do is return a few fields from both indexes.

sourcetype="Vantage" ee_status=* earliest=-1d@d latest=-0d@d | eval HRuserID = trim(replace(email, "@domain.com", "")) 
| join HRuserID [ | search index="wineventlog" sourcetype="WinEventLog:Security" ((EventCode=4624 AND Logon_Type=2 OR Logon_Type=10) OR (EventCode=4625 NOT Sub_Status=0xC0000064)) earliest=-5m latest=now()]
| where HRuserID=user | stats values(user) values(department) values(Workstation_Name)

Thanks in advance

0 Karma
1 Solution

gcusello
SplunkTrust
SplunkTrust

Hi lbrhyne,
for me, in the subsearch there isn't the HRuserID field, so the join doesn't run.
You should rename in subsearch the user field in HRuserID field.
In addition, remeber that subsearches have the 50,000 results limit, so if you have more results in subsearch, they couldn't match.
For these reasons and because join is a very slow command I suggest to use a different approach:

(index=identities sourcetype="Vantage" ee_status=* earliest=-1d@d latest=-0d@d ) OR (index="wineventlog" sourcetype="WinEventLog:Security" ((EventCode=4624 AND Logon_Type=2 OR Logon_Type=10) OR (EventCode=4625 NOT Sub_Status=0xC0000064)) earliest=-5m latest=now)
| eval HRuserID = trim(replace(email, "@domain.com", "")) , HRuserID=coalesce(HRuserID,user)
| stats values(user) values(department) values(Workstation_Name) BY HRuserID

Bye.
Giuseppe

P.S.: use always index in your searches.

View solution in original post

gcusello
SplunkTrust
SplunkTrust

Hi lbrhyne,
for me, in the subsearch there isn't the HRuserID field, so the join doesn't run.
You should rename in subsearch the user field in HRuserID field.
In addition, remeber that subsearches have the 50,000 results limit, so if you have more results in subsearch, they couldn't match.
For these reasons and because join is a very slow command I suggest to use a different approach:

(index=identities sourcetype="Vantage" ee_status=* earliest=-1d@d latest=-0d@d ) OR (index="wineventlog" sourcetype="WinEventLog:Security" ((EventCode=4624 AND Logon_Type=2 OR Logon_Type=10) OR (EventCode=4625 NOT Sub_Status=0xC0000064)) earliest=-5m latest=now)
| eval HRuserID = trim(replace(email, "@domain.com", "")) , HRuserID=coalesce(HRuserID,user)
| stats values(user) values(department) values(Workstation_Name) BY HRuserID

Bye.
Giuseppe

P.S.: use always index in your searches.

View solution in original post

gcusello
SplunkTrust
SplunkTrust

Hi lbrhyne,
try something like this

index="wineventlog" sourcetype="WinEventLog:Security" ((EventCode=4624 AND Logon_Type=2 OR Logon_Type=10) OR (EventCode=4625 NOT Sub_Status=0xC0000064)) earliest=-5m latest=now [ search index=identities sourcetype="Vantage" ee_status=* earliest=-1d@d latest=-0d@d | eval user= trim(replace(email, "@domain.com", ""))| fields user ]
| table user department Workstation_Name

buth remember that there's a limit of 50,000 on subsearch results.
If results are more than 50,000, try:

 (index=identities sourcetype="Vantage" ee_status=* earliest=-1d@d latest=-0d@d ) OR (index="wineventlog" sourcetype="WinEventLog:Security" ((EventCode=4624 AND Logon_Type=2 OR Logon_Type=10) OR (EventCode=4625 NOT Sub_Status=0xC0000064)) earliest=-5m latest=now)
 | eval HRuserID = trim(replace(email, "@domain.com", "")) , HRuserID=coalesce(HRuserID,user)
 | stats values(user) values(department) values(Workstation_Name) values(index) AS index BY HRuserID
 | search index=wineventlog

Bye.
Giuseppe

0 Karma

lbrhyne
Path Finder

Hi @gcusello, I was finally able to put a little more time into your tip and was able to get it to work! I agree, your solution ins much faster than a join! Thank again!

(index="wineventlog" sourcetype="WinEventLog:Security" (earliest=-5m latest=now()) ((EventCode=4624 AND Logon_Type=2 OR Logon_Type=10) OR (EventCode=4625 NOT Sub_Status=0xC0000064 NOT Logon_Type=3)) )  OR 
(index=identities sourcetype="Vantage" (NOT ee_status="A" NOT ee_status="L") earliest=-1d@d latest=-0d@d) | eval hruser = trim(replace(email, "@domain.com", ""))
| eval user=coalesce(user,hruser) | stats values(*) AS * BY user | search hruser=* user=*
| table user,department,ad_fname,ad_last_name,ee_status, Workstation_Name
| eval fullname = (toString(ad_fname) + " " + toString(ad_last_name)) 
| rename user as "Login ID", ee_status as "HR Employment Status" fullname as "Full Name", department as Department, Workstation_Name as "Device"
| where Device!=""
| table "Login ID" "Full Name" "HR Employment Status" Department, Device
0 Karma

lbrhyne
Path Finder

For those of you that are interested...
Time to complete query's:

Join = 54 seconds
Stats = 19 seconds

Stats WINS by 3X 🙂

gcusello
SplunkTrust
SplunkTrust

Thank you for your feedback, surely it will be interesting for many Splunkers!
Bye and see next time.
Giuseppe

0 Karma

lbrhyne
Path Finder

Thanks again for your help @gcusello! I tried the two suggestions above, but neither would pull back the results I was looking for. I think it may have something to do with OR statement for the two indexes.
However, I was able to resolve the issue by using an inner Join.

index=identities sourcetype="Vantage" (NOT ee_status="T" NOT ee_status="R") earliest=-1d@d latest=-0d@d
| eval user = trim(replace(email, "@domain1.com", ""))
| join type=inner user [ | search index="wineventlog" sourcetype="WinEventLog:Security" ((EventCode=4624 AND Logon_Type=2 OR Logon_Type=10) OR (EventCode=4625 NOT Sub_Status=0xC0000064)) earliest=-5m latest=now()]
| table user,department,ad_fname,ad_last_name,ee_status, Workstation_Name,src_ip
| eval fullname = toString(ad_fname) + " " + toString(ad_last_name)
| rename user as "Login ID", ee_status as "HR Employment Status" fullname as "Full Name", department as Department, Workstation_Name as "Device"
| table "Login ID" "Full Name" "HR Employment Status" Department, Device | sort "Full Name"

Thanks again.

0 Karma

gcusello
SplunkTrust
SplunkTrust

Hi lbrhyne,
I'm sure that you can solve your problem with join, but I suggest to use additional time to try with my hints because join command is a very slow solution!
Bye.
Giuseppe

0 Karma

gcusello
SplunkTrust
SplunkTrust

Hi lbrhyne,
If my answer helped you, please accept and/or upvote it!
Bye.
Giuseppe

0 Karma

lbrhyne
Path Finder

Hi @gcusello, I have not had time to dig deeper into why I can't get your solution to work. I think it has something to do with converting the email address to user | eval user = trim(replace(email, "@domain1.com", "")), when doing the lookup, but not sure. Right now the Join is working, although it is slow. When I get time to look further into your suggestion and I get it to work I will most certainly accept and/or upvote it.

0 Karma

gcusello
SplunkTrust
SplunkTrust

Thank you.
Bye.
Giuseppe

0 Karma