Splunk Search

Correlate events from different indexes with different field names & values

neerajs_81
Contributor

Hi, how can i correlate events from different indexes when both( field names and  values) are different ?
For example:
I have a some app logs  in an index=id1 . There is field called user in this index   which has values like:
SmithJ
JohnK 


Now i want to find out what is the IP address of these users from our firewall index.  But  In the Firewall index,  the user names are in following format:
Field Name:  PanOSSourceUserName
Value:  <Domain>\SmithJ

Field Name:  PanOSSourceUserName
Value:  <Domain>\JohnK

As you can see, the firewall index has names appended by our <domain name>\ while the app index doesn't have domain name in the user field.    There are other fields called src_ip and Country in firewall events.

How can i craft  a search that takes the user field from app index and compares/ correlates that with the PanOSSourceUserName field from Firewall index and accordingly displays the src_ip of the user .  Hope i am clear.

End result:  Table or Stats whatever works with following columns 

_timeuserPanOSSourceUserNamesrc_ipCountry 
      


Thanks in advance 
 

Labels (2)
Tags (2)
0 Karma

gcusello
Legend

Hi @neerajs_81,

to correlate different searches (it isn't relevant that they are in the same or different indexes) you need to have the same field names and the same value format.

In your case you can solve the first issue using the rename command and the second using the rex command.

Something like this:

index=idx1 OR index=firewall
| rex field=PanOSSourceUserName "\\(?<user>.*)"
| stats earliest(_time) AS _time values(src_ip) AS src_ip values(country) AS country values(PanOSSourceUserName) AS PanOSSourceUserName BY user 
| table _time user PanOSSourceUserName src_ip country 

Ciao.

Giuseppe

 

neerajs_81
Contributor

Thanks for responding, Your rex command is throwing the following error

Error in 'rex' command: Encountered the following error while compiling the regex '\(?<user>.*)': Regex: unmatched closing parenthesis.

0 Karma

gcusello
Legend

Hi @neerajs_81,

it's the usual problem with regexes in Splunk, please try this:

index=idx1 OR index=firewall
| rex field=PanOSSourceUserName "\\\(?<user>.*)"
| stats earliest(_time) AS _time values(src_ip) AS src_ip values(country) AS country values(PanOSSourceUserName) AS PanOSSourceUserName BY user 
| table _time user PanOSSourceUserName src_ip country 

Ciao.

Giuseppe

neerajs_81
Contributor

Giuseppe,  I see a minor issue.  Because we are doing

index=idx1 OR index=firewall

 
the search results are only showing me events for users which are in Firewall index.  I want it other way round.   Basically

1. Check user in index=idx1.  

2. Only for those users found above in idx1,  get their corresponding  src_ip from index=Firewall  by matching against PanOSSourceName.
 
The rex command you gave is a good suggestion for comparing the user values between the two indexes.  But i only want to see the results of folks found in idx1 

Hope i am clear.

Tags (2)
0 Karma

gcusello
Legend

Hi @neerajs_81,

ok, alittle adjustement to the search: you have to filter eìresults taking only results in index idx1:

index=idx1 OR index=firewall
| rex field=PanOSSourceUserName "\\\(?<user>.*)"
| stats earliest(_time) AS _time values(src_ip) AS src_ip values(country) AS country values(PanOSSourceUserName) AS PanOSSourceUserName dc(index) AS dc_index BY user 
| where dc_index=2
| table _time user PanOSSourceUserName src_ip country

Ciao.

Giuseppe

 

neerajs_81
Contributor

Still some adjustment required. This new search shows me results from both indexes meaning even those user records which are not in index= idx1 but there in the other index=firewall. 
I have validated they are not in idx1 by running a "index=idx1" for that time range and  it shows about 5 users. 
But now when we run your query,  the results shows more than 5 users.   So somehow It is fetching some users from index=firewall which are not part of idx1 .

Will keep poking around.  Awarding you karma points for your responses so far.



0 Karma

gcusello
Legend

HI @neerajs_81,

the behavior you're reporting it's strange because the check on the indexes number (the where condition) assures that the user value is present in both indexes.

Only for debugging, run this search that displays the indexes in which the user value is present:

index=idx1 OR index=firewall
| rex field=PanOSSourceUserName "\\\(?<user2>.*)"
| eval user=coalesce(user,user2)
| stats earliest(_time) AS _time values(src_ip) AS src_ip values(country) AS country values(PanOSSourceUserName) AS PanOSSourceUserName dc(index) AS dc_index values(index) AS index BY user 
| where dc_index=2
| table _time user PanOSSourceUserName src_ip country index

 Ciao.

Giuseppe

0 Karma

neerajs_81
Contributor

Ok found the culprit, but not the solution.  Problem is with the regex. 

 

rex field=PanOSSourceUserName "\\\(?<user2>.*)"

 


The value of user2 field shows as \johnk .  There is still the backslash \ that is showing in the value.
While the value of user1 field from idx1 is johnk

The PanOSSourceUserName value is   domain\\johnk 

Even if i put four backslashes in the rex 
rex field=PanOSSourceUserName "\\\\(?<user2>.*)" ,  the value of user2 still shows as \johnk .  
That is odd because on regex101.com  it shows \\\\ is a 100% match for user2.
Any thoughts

0 Karma

gcusello
Legend

Hi @neerajs_81,

please try this regex

^[^\\]+\\(?<user>.*)

or

^[^\\]+\\\(?<user>.*)

if it continues to not run, you can add, after the rex command and before the stats command:

| eval user=if(index="firewall",substr(user,2,50),user)

But the regex should be correct.

Ciao.

Giuseppe

neerajs_81
Contributor

Tried your updated search and it shows "No results found".  I saw some posts where folks have used | join=INNER .  Maybe a join would for this use case.
Checking.

0 Karma

gcusello
Legend

Hi @neerajs_81,

it's alway more strange!

I use join only when there isn't any additional solution to an issue because it's a very slow and resource expensive command

anyway debug the data in this way:

run the following search

(index=idx1 user=*) OR (index=firewall PanOSSourceUserName =*)
| rex field=PanOSSourceUserName "\\\(?<user2>.*)"
| eval user=coalesce(user,user2)
| table _time user user2 PanOSSourceUserName index

the field "user" should have the 100% of occurrences.

in addition check if user2 is correctly extracted and the values correspond to the ones in user

If not check the regex, if yes, run the following search:

(index=idx1 user=*) OR (index=firewall PanOSSourceUserName=*)
| rex field=PanOSSourceUserName "\\\(?<user2>.*)"
| eval user=coalesce(user,user2)
| stats earliest(_time) AS _time values(src_ip) AS src_ip values(country) AS country values(PanOSSourceUserName) AS PanOSSourceUserName dc(index) AS dc_index values(index) AS index BY user 
| where dc_index=2
| table _time user PanOSSourceUserName src_ip country index

in the index column, you should have always two values.

Ciao.

Giuseppe

Get Updates on the Splunk Community!

Improve Your Security Posture

Watch NowImprove Your Security PostureCustomers are at the center of everything we do at Splunk and security ...

Maximize the Value from Microsoft Defender with Splunk

 Watch NowJoin Splunk and Sens Consulting for this Security Edition Tech TalkWho should attend:  Security ...

This Week's Community Digest - Splunk Community Happenings [6.27.22]

Get the latest news and updates from the Splunk Community here! News From Splunk Answers ✍️ Splunk Answers is ...