Splunk Search

Grouping data by multiple attribute values

alphadog00
Splunk Employee
Splunk Employee

I have basic web logs with username and jsessionid. I want to group (assume a single index, with one set of data). So thousands of events. 

I want to group by jsessionid and username - creating supergroups. Example:

username:jsessionid

tom:1234

frank:1234

bob:1234

bob:5467

sally:5467

sally:9012

amy:9012

harry:4709

tony:4709

I would wind up with 2 groups - a small group with just harry and tony, and a larger group with tom, frank, bob, sally, and amy due to shared jsessionid.

I would like my output to contain some kind of group ID or Group Name. I would have no knowledge of username or jsessionid - I just want to be able to loop through the data and assign users/jsessionids to groups where they exist. 

My first thought is to sort by jsessionid, but I can't figure out how to loop through the data and create dynamic group names. 

Thanks for any ideas, not an SPL expert. 

Labels (2)
0 Karma
1 Solution

ITWhisperer
SplunkTrust
SplunkTrust

This sort of works - the issue is knowing how many iterations to do to resolve all the indirections

| makeresults | eval _raw="tom:1234

frank:1234

bob:1234

bob:5467

sally:5467

sally:9012

amy:9012

harry:4709

tony:4709"
| multikv noheader=t 
| rex "(?<username>[^\:]+)\:(?<jsessionid>.+)"
| fields - _raw _time
| fields username jsessionid
| eventstats values(jsessionid) as jsessionids by username
| eventstats values(username) as usernames by jsessionid
| eventstats values(jsessionids) as jsessionidss by usernames
| eventstats values(usernames) as usernamess by jsessionids
| eventstats values(jsessionidss) as jsessionidsss by usernamess
| eventstats values(usernamess) as usernamesss by jsessionidss
| eventstats values(jsessionidsss) as jsessionidssss by usernamesss
| eventstats values(usernamesss) as usernamessss by jsessionidsss
| eventstats values(jsessionidssss) as jsessionidsssss by usernamessss
| eventstats values(usernamessss) as usernamesssss by jsessionidssss
| eval supergroup=mvjoin(usernamesssss,",")
| dedup supergroup
| table supergroup

View solution in original post

0 Karma

ITWhisperer
SplunkTrust
SplunkTrust

This sort of works - the issue is knowing how many iterations to do to resolve all the indirections

| makeresults | eval _raw="tom:1234

frank:1234

bob:1234

bob:5467

sally:5467

sally:9012

amy:9012

harry:4709

tony:4709"
| multikv noheader=t 
| rex "(?<username>[^\:]+)\:(?<jsessionid>.+)"
| fields - _raw _time
| fields username jsessionid
| eventstats values(jsessionid) as jsessionids by username
| eventstats values(username) as usernames by jsessionid
| eventstats values(jsessionids) as jsessionidss by usernames
| eventstats values(usernames) as usernamess by jsessionids
| eventstats values(jsessionidss) as jsessionidsss by usernamess
| eventstats values(usernamess) as usernamesss by jsessionidss
| eventstats values(jsessionidsss) as jsessionidssss by usernamesss
| eventstats values(usernamesss) as usernamessss by jsessionidsss
| eventstats values(jsessionidssss) as jsessionidsssss by usernamessss
| eventstats values(usernamessss) as usernamesssss by jsessionidssss
| eval supergroup=mvjoin(usernamesssss,",")
| dedup supergroup
| table supergroup
0 Karma

alphadog00
Splunk Employee
Splunk Employee

thanks, the data is actually a little cleaner - i did username:sessionid for this post, really they are already separate fields of data in the index.  But i will try what you posted. I had looked at eventstats, but not gone to this level

0 Karma

alphadog00
Splunk Employee
Splunk Employee

To clarify, I could have thousands of sessionIDs and since Tom shares different jsessionids with multiple different people, and they share jsessionids with others, this would be one big group.

0 Karma

alphadog00
Splunk Employee
Splunk Employee

thanks, but if I have no idea of username or group names - i need to create groups there could "n" groups, and the case statement would have to be built with unknown usernames - that is part of my sticking point.

I also have a "X" usernames and "Y" jsessionids

0 Karma

isoutamo
SplunkTrust
SplunkTrust
How you could group those without knowing which names belongs to which group?
0 Karma

alphadog00
Splunk Employee
Splunk Employee

If you look at this data:

tom:1234

frank:1234

bob:1234

bob:5467

sally:5467

sally:9012

amy:9012

- tom, frank, bob are connected via one JSESSIONID, bob and sally by another, so sally has an indirect or transitive relationship to tom and frank. Amy is connected to Sally - so in the end, all of these above are part of one big group due some direct and indirect relationships

0 Karma

isoutamo
SplunkTrust
SplunkTrust

Hi

you could try with next 

....
| eval group = case (username == "harry" OR username == "tom", "grp1", true(), "grp2")
| stats values(*) as * by group, jsessiond
| ....

please check the syntax as I haven’t splunk in my hands to check it.

r. Ismo 

0 Karma
Get Updates on the Splunk Community!

What’s New in Splunk App for PCI Compliance 5.3.1?

The Splunk App for PCI Compliance allows customers to extend the power of their existing Splunk solution with ...

Extending Observability Content to Splunk Cloud

Register to join us !   In this Extending Observability Content to Splunk Cloud Tech Talk, you'll see how to ...

What's new in Splunk Cloud Platform 9.1.2312?

Hi Splunky people! We are excited to share the newest updates in Splunk Cloud Platform 9.1.2312! Analysts can ...