Splunk Search

Grouping data by multiple attribute values

alphadog00
Splunk Employee

I have basic web logs with username and jsessionid fields. Assume a single index with one set of data, containing thousands of events.

I want to group by jsessionid and username - creating supergroups. Example:

username:jsessionid

tom:1234

frank:1234

bob:1234

bob:5467

sally:5467

sally:9012

amy:9012

harry:4709

tony:4709

I would wind up with 2 groups - a small group with just harry and tony, and a larger group with tom, frank, bob, sally, and amy due to shared jsessionid.

I would like my output to contain some kind of group ID or Group Name. I would have no knowledge of username or jsessionid - I just want to be able to loop through the data and assign users/jsessionids to groups where they exist. 

My first thought is to sort by jsessionid, but I can't figure out how to loop through the data and create dynamic group names. 

Thanks for any ideas; I'm not an SPL expert.


ITWhisperer
SplunkTrust

This sort of works. The issue is knowing how many iterations are needed to resolve all the indirections:

| makeresults | eval _raw="tom:1234
frank:1234
bob:1234
bob:5467
sally:5467
sally:9012
amy:9012
harry:4709
tony:4709"
| multikv noheader=t 
| rex "(?<username>[^\:]+)\:(?<jsessionid>.+)"
| fields - _raw _time
| fields username jsessionid
| eventstats values(jsessionid) as jsessionids by username
| eventstats values(username) as usernames by jsessionid
| eventstats values(jsessionids) as jsessionidss by usernames
| eventstats values(usernames) as usernamess by jsessionids
| eventstats values(jsessionidss) as jsessionidsss by usernamess
| eventstats values(usernamess) as usernamesss by jsessionidss
| eventstats values(jsessionidsss) as jsessionidssss by usernamesss
| eventstats values(usernamesss) as usernamessss by jsessionidsss
| eventstats values(jsessionidssss) as jsessionidsssss by usernamessss
| eventstats values(usernamessss) as usernamesssss by jsessionidssss
| eval supergroup=mvjoin(usernamesssss,",")
| dedup supergroup
| table supergroup
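
The open question in this answer, how many iterations are enough, can be illustrated outside SPL. The following is a minimal Python sketch, not the SPL above: it assumes the events are available as (username, jsessionid) pairs, and the function name propagate_groups is hypothetical. It repeats a "group by jsessionid, then group by username" pass, similar in spirit to the paired eventstats lines, and stops once a full pass changes nothing, so the pass count does not have to be guessed up front.

```python
def propagate_groups(pairs):
    """Label each (username, jsessionid) event; repeat until labels stop changing."""
    # Assumption: every event starts out labelled with its own username.
    labels = [user for user, _ in pairs]
    passes = 0
    while True:
        passes += 1
        before = list(labels)
        # First share the smallest label per jsessionid (index 1),
        # then the smallest label per username (index 0).
        for key_index in (1, 0):
            best = {}
            for pair, label in zip(pairs, labels):
                key = pair[key_index]
                best[key] = min(best.get(key, label), label)
            labels = [best[pair[key_index]] for pair in pairs]
        if labels == before:
            # A full pass changed nothing, so every indirection is resolved.
            return labels, passes


events = [
    ("tom", "1234"), ("frank", "1234"), ("bob", "1234"),
    ("bob", "5467"), ("sally", "5467"), ("sally", "9012"),
    ("amy", "9012"), ("harry", "4709"), ("tony", "4709"),
]
labels, passes = propagate_groups(events)
for (user, session), label in zip(events, labels):
    print(user, session, label)
```

On the sample data every event in the larger group ends up with one shared label and the harry/tony events with another. Longer chains of shared jsessionids need more passes, which is why a fixed number of eventstats pairs may stop short on real data.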


alphadog00
Splunk Employee

Thanks. The data is actually a little cleaner: I wrote username:sessionid for this post, but they are really separate fields in the index. I will try what you posted. I had looked at eventstats, but had not gone to this level.


alphadog00
Splunk Employee

To clarify, I could have thousands of jsessionids. Since tom shares jsessionids with several different people, and they in turn share jsessionids with others, all of them would form one big group.


alphadog00
Splunk Employee

Thanks, but I have no idea of the usernames or group names in advance. I need to create the groups, there could be "n" of them, and the case statement would have to be built from unknown usernames. That is part of my sticking point.

I also have "X" usernames and "Y" jsessionids.


isoutamo
SplunkTrust
How could you group those without knowing which names belong to which group?

alphadog00
Splunk Employee

If you look at this data:

tom:1234

frank:1234

bob:1234

bob:5467

sally:5467

sally:9012

amy:9012

Here tom, frank, and bob are connected via one jsessionid, and bob and sally by another, so sally has an indirect (transitive) relationship to tom and frank. amy is connected to sally, so in the end all of the above are part of one big group through a mix of direct and indirect relationships.
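
In graph terms, the grouping described here is a set of connected components, with usernames and jsessionids as nodes and each event as an edge. One common way to compute that is a union-find (disjoint-set) structure. The following is a minimal Python sketch rather than SPL, assuming the events can be exported as (username, jsessionid) pairs; the function name assign_groups and the numeric group IDs are illustrative choices, not something from this thread.

```python
def assign_groups(pairs):
    """Return {username: group_id} for users linked directly or transitively."""
    parent = {}

    def find(node):
        # Register unseen nodes as their own root, then walk up to the root.
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]  # path halving keeps chains short
            node = parent[node]
        return node

    for user, session in pairs:
        # Tag the keys so a username can never collide with a jsessionid.
        root_user = find(("user", user))
        root_session = find(("session", session))
        if root_user != root_session:
            parent[root_user] = root_session  # merge the two groups

    # Number the groups in order of first appearance; no names known up front.
    group_ids = {}
    result = {}
    for user, _ in pairs:
        root = find(("user", user))
        group_ids.setdefault(root, len(group_ids) + 1)
        result[user] = group_ids[root]
    return result


events = [
    ("tom", "1234"), ("frank", "1234"), ("bob", "1234"),
    ("bob", "5467"), ("sally", "5467"), ("sally", "9012"),
    ("amy", "9012"), ("harry", "4709"), ("tony", "4709"),
]
print(assign_groups(events))
# {'tom': 1, 'frank': 1, 'bob': 1, 'sally': 1, 'amy': 1, 'harry': 2, 'tony': 2}
```

The group ID is generated from the data alone, which matches the requirement of not knowing any usernames or jsessionids beforehand.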


isoutamo
SplunkTrust

Hi

You could try the following:

....
| eval group = case (username == "harry" OR username == "tom", "grp1", true(), "grp2")
| stats values(*) as * by group, jsessionid
| ....

Please check the syntax, as I don't have Splunk at hand to test it.

r. Ismo 
