Splunk Search

Join fields from two log entries in the same index

ajacobi
Path Finder

Hi All,

I am trying to join fields from two separate log entries in the same index.

There is a common field called sessionid in both log entries. I wanted to get the groupname from one log entry and the username from a different log entry that correlated to the sessionid. The username can also differ between short name and UPN in the logs so I wanted to ensure that the username displayed in the output would be UPN (user@domain). I want a simple table output showing time, sessionid, groupname, username

I've tried using transaction, stats and join but have not been able to get the search working correctly. I think I'm missing something obvious but it's eluding me at present.

Thanks,
Andy

Labels (3)
0 Karma
1 Solution

bowesmana
Champion

@ajacobi 

I'm a little confused from your message. You say you want User_Name and UPN, but not short_name. In your example, User_Name="shortname" in the second log entry.

I am working on the assumption that the third log message is the data you want to get User_Name from and that is shown in your log as CN=groupname.

See this example with your data

| makeresults
| eval x=split("Feb 18 08:00:33 #src_ip# hostname=\"hostname\",errdefs_msgno=\"989898989: 5:\",partition_name=\"ptnname\",session_id=\"00000000\",Access_Profile=\"profile_name\",Partition=\"partition_name\",Session_Id=\"00000000\",User_Name=\"user@domain\"!Feb 18 08:00:33 #src_ip# hostname=\"hostname\",errdefs_msgno=\"989898989: 5:\",partition_name=\"ptnname\",session_id=\"00000000\",Access_Profile=\"profile_name\",Partition=\"partition_name\",Session_Id=\"00000000\",User_Name=\"shortname\"!Feb 18 08:00:33 #src_ip# hostname=\"hostname\",errdefs_msgno=\"01490113: 5:\",partition_name=\"partition_name\",session_id=\"00000000\",Access_Profile=\"profile_name\",Partition=\"partition_name\",Session_Id=\"00000000\",Session_Variable_Name=\"session.ad.last.attr.memberOf\",Session_Variable_Value=\"| | CN=groupname,OU=OU3,OU=OU2,OU=OU1,DC=domain,DC=com,DC=au |\"", "!")
| mvexpand x
| rename x as _raw
| extract
| rex field=User_Name "(?<UPN>[^@]*@.*)"
| eval User_Name=if(isnull(UPN), User_Name, null())
| eval _time=strptime(_raw, "%b %d %H:%M:%S")
| stats values(CN) as GroupName values(UPN) as UPN by _time, Session_Id

View solution in original post

0 Karma

bowesmana
Champion

@ajacobi 

Avoid transaction and join where possible - both has undesirable side effects, this should be doable with stats

main search from both log entries
| eval username=coalesce(UPN, username)
| stats values(groupname) as groupname, values(username) as username by time sessionid

 perhaps you could post what you've previously tried, but from your description this should do the combine.

However, if the log times are different, then you would put values(_time) as _time instead of split by.

how you handle time will depend on what you want to represent in terms of time.

 

0 Karma

ajacobi
Path Finder

Hi,

Thanks for your response and sorry for the delay getting back. I don't actually want to get the shortname for the username. Just the UPN. The problem is there are two identical log entries created in this process. One includes the shortname and one includes the UPN. They both have the same field name. But I only want to return the UPN format field value.

The following are the two main log entries that I have to get the inform out of. The first (and second) is just the connection log with either the UPN or shortname as the User_Name value.  The last  is based on a AD query that returns their group memberships.

Feb 18 08:00:33 #src_ip# hostname="hostname",errdefs_msgno="989898989: 5:",partition_name="ptnname",session_id="00000000",Access_Profile="profile_name",Partition="partition_name",Session_Id="00000000",User_Name="user@domain"

Feb 18 08:00:33 #src_ip# hostname="hostname",errdefs_msgno="989898989: 5:",partition_name="ptnname",session_id="00000000",Access_Profile="profile_name",Partition="partition_name",Session_Id="00000000",User_Name="shortname"

Feb 18 08:00:33 #src_ip# hostname="hostname",errdefs_msgno="01490113: 5:",partition_name="partition_name",session_id="00000000",Access_Profile="profile_name",Partition="partition_name",Session_Id="00000000",Session_Variable_Name="session.ad.last.attr.memberOf",Session_Variable_Value="| | CN=groupname,OU=OU3,OU=OU2,OU=OU1,DC=domain,DC=com,DC=au |"

 

I just wanted to create a table output showing the time the user connected, session_id, User_Name and UPN. My earlier queries were fairly basic and didn't address the username issue

Thanks

0 Karma

bowesmana
Champion

@ajacobi 

I'm a little confused from your message. You say you want User_Name and UPN, but not short_name. In your example, User_Name="shortname" in the second log entry.

I am working on the assumption that the third log message is the data you want to get User_Name from and that is shown in your log as CN=groupname.

See this example with your data

| makeresults
| eval x=split("Feb 18 08:00:33 #src_ip# hostname=\"hostname\",errdefs_msgno=\"989898989: 5:\",partition_name=\"ptnname\",session_id=\"00000000\",Access_Profile=\"profile_name\",Partition=\"partition_name\",Session_Id=\"00000000\",User_Name=\"user@domain\"!Feb 18 08:00:33 #src_ip# hostname=\"hostname\",errdefs_msgno=\"989898989: 5:\",partition_name=\"ptnname\",session_id=\"00000000\",Access_Profile=\"profile_name\",Partition=\"partition_name\",Session_Id=\"00000000\",User_Name=\"shortname\"!Feb 18 08:00:33 #src_ip# hostname=\"hostname\",errdefs_msgno=\"01490113: 5:\",partition_name=\"partition_name\",session_id=\"00000000\",Access_Profile=\"profile_name\",Partition=\"partition_name\",Session_Id=\"00000000\",Session_Variable_Name=\"session.ad.last.attr.memberOf\",Session_Variable_Value=\"| | CN=groupname,OU=OU3,OU=OU2,OU=OU1,DC=domain,DC=com,DC=au |\"", "!")
| mvexpand x
| rename x as _raw
| extract
| rex field=User_Name "(?<UPN>[^@]*@.*)"
| eval User_Name=if(isnull(UPN), User_Name, null())
| eval _time=strptime(_raw, "%b %d %H:%M:%S")
| stats values(CN) as GroupName values(UPN) as UPN by _time, Session_Id

View solution in original post

0 Karma

ajacobi
Path Finder

Thanks for your response again and sorry for the confusion. My sanitising of logs obviously hasn't helped.

The UPN and shortname are the two options for values in the User_Name field. The infrastructure writes both versions (to pass on both credential options). Other than that the two log entries are identical. 

I need to create a report for someone analysing the data and they need it to include only the UPN format value and ignore the shortname. The problem being that they are values in the same field in virtually identical log entries.

And I should have mentioned that the groupname is pulled out via field extraction from the list of AD groups so is a field in itself.


So from the log entries:

I need to take the time, session_id and User_Name from the following log entry.

Feb 18 08:00:33 #src_ip# hostname="hostname",errdefs_msgno="989898989: 5:",partition_name="ptnname",session_id="00000000",Access_Profile="profile_name",Partition="partition_name",Session_Id="00000000",User_Name="user@domain"


I can ignore this entry. It was included just to show the fact that the two log entries are identical except for the field value for User_Name

Feb 18 08:00:33 #src_ip# hostname="hostname",errdefs_msgno="989898989: 5:",partition_name="ptnname",session_id="00000000",Access_Profile="profile_name",Partition="partition_name",Session_Id="00000000",User_Name="shortname"


I need to correlate the session_id from log entry one and use that to correlate the group_name from this log entry (It is a field extraction so literally group_name=#returned_value#)

Feb 18 08:00:33 #src_ip# hostname="hostname",errdefs_msgno="01490113: 5:",partition_name="partition_name",session_id="00000000",Access_Profile="profile_name",Partition="partition_name",Session_Id="00000000",Session_Variable_Name="session.ad.last.attr.memberOf",Session_Variable_Value="| | CN=groupname,OU=OU3,OU=OU2,OU=OU1,DC=domain,DC=com,DC=au |"


Which hopefully will end up with a table that has time, session_id, groupname and USer_Name (in UPN format only)

0 Karma

bowesmana
Champion

Did you look at the search I supplied. The last 4 lines do what you want I believe - let me know if that's not correct

 

0 Karma

ajacobi
Path Finder

Thank you. I've checked the search and it's almost what I'm after. It's looking very close. It has definitely sorted out the UPN issue.

It does produce the fields but over two lines in the output. I'm hoping for all fields to be returned in one table line. That way it can be sent as a csv output with the data already correlated. 

At present one line contains:

time, session_id, group_name 

The next line returns :

time, session_id, User_Name(UPN)

Ideally I would like to get a one line output showing time, session_id, group_name, User_Name

I would also like to avoid having lines in the table that only contain session_id and time but no group_name 

0 Karma

bowesmana
Champion

Can you give an example of the real data that is causing this - sanitised as necessary. Your example data works with the query given, so clearly there's a difference with the data you actually have and the example.

 

0 Karma

ajacobi
Path Finder

You are correct. I simplified my initial search. I was using two searches to try to pull the relevant fields from the different entries when really I just needed a broader initial search because all data is in the same index. That was a hang over from trying to use join and transaction.

This has meant that when the table is output I now get all values on one row. I also get a lot of rows that have no group name or UPN so I'm just trying to make it so they don't appear but I do now have the information required in the appropriate format. 

Really appreciate the help you've given.

Thanks

0 Karma