Splunk Search

Grouping data by multiple attribute values

alphadog00
Splunk Employee
Splunk Employee

I have basic web logs with username and jsessionid. I want to group (assume a single index, with one set of data). So thousands of events. 

I want to group by jsessionid and username - creating supergroups. Example:

username:jsessionid

tom:1234

frank:1234

bob:1234

bob:5467

sally:5467

sally:9012

amy:9012

harry:4709

tony:4709

I would wind up with 2 groups - a small group with just harry and tony, and a larger group with tom, frank, bob, sally, and amy due to shared jsessionid.

I would like my output to contain some kind of group ID or Group Name. I would have no knowledge of username or jsessionid - I just want to be able to loop through the data and assign users/jsessionids to groups where they exist. 

My first thought is to sort by jsessionid, but I can't figure out how to loop through the data and create dynamic group names. 

Thanks for any ideas, not an SPL expert. 

Labels (2)
0 Karma
1 Solution

ITWhisperer
SplunkTrust
SplunkTrust

This sort of works - the issue is knowing how many iterations to do to resolve all the indirections

| makeresults | eval _raw="tom:1234

frank:1234

bob:1234

bob:5467

sally:5467

sally:9012

amy:9012

harry:4709

tony:4709"
| multikv noheader=t 
| rex "(?<username>[^\:]+)\:(?<jsessionid>.+)"
| fields - _raw _time
| fields username jsessionid
| eventstats values(jsessionid) as jsessionids by username
| eventstats values(username) as usernames by jsessionid
| eventstats values(jsessionids) as jsessionidss by usernames
| eventstats values(usernames) as usernamess by jsessionids
| eventstats values(jsessionidss) as jsessionidsss by usernamess
| eventstats values(usernamess) as usernamesss by jsessionidss
| eventstats values(jsessionidsss) as jsessionidssss by usernamesss
| eventstats values(usernamesss) as usernamessss by jsessionidsss
| eventstats values(jsessionidssss) as jsessionidsssss by usernamessss
| eventstats values(usernamessss) as usernamesssss by jsessionidssss
| eval supergroup=mvjoin(usernamesssss,",")
| dedup supergroup
| table supergroup

View solution in original post

0 Karma

ITWhisperer
SplunkTrust
SplunkTrust

This sort of works - the issue is knowing how many iterations to do to resolve all the indirections

| makeresults | eval _raw="tom:1234

frank:1234

bob:1234

bob:5467

sally:5467

sally:9012

amy:9012

harry:4709

tony:4709"
| multikv noheader=t 
| rex "(?<username>[^\:]+)\:(?<jsessionid>.+)"
| fields - _raw _time
| fields username jsessionid
| eventstats values(jsessionid) as jsessionids by username
| eventstats values(username) as usernames by jsessionid
| eventstats values(jsessionids) as jsessionidss by usernames
| eventstats values(usernames) as usernamess by jsessionids
| eventstats values(jsessionidss) as jsessionidsss by usernamess
| eventstats values(usernamess) as usernamesss by jsessionidss
| eventstats values(jsessionidsss) as jsessionidssss by usernamesss
| eventstats values(usernamesss) as usernamessss by jsessionidsss
| eventstats values(jsessionidssss) as jsessionidsssss by usernamessss
| eventstats values(usernamessss) as usernamesssss by jsessionidssss
| eval supergroup=mvjoin(usernamesssss,",")
| dedup supergroup
| table supergroup
0 Karma

alphadog00
Splunk Employee
Splunk Employee

thanks, the data is actually a little cleaner - i did username:sessionid for this post, really they are already separate fields of data in the index.  But i will try what you posted. I had looked at eventstats, but not gone to this level

0 Karma

alphadog00
Splunk Employee
Splunk Employee

To clarify, I could have thousands of sessionIDs and since Tom shares different jsessionids with multiple different people, and they share jsessionids with others, this would be one big group.

0 Karma

alphadog00
Splunk Employee
Splunk Employee

thanks, but if I have no idea of username or group names - i need to create groups there could "n" groups, and the case statement would have to be built with unknown usernames - that is part of my sticking point.

I also have a "X" usernames and "Y" jsessionids

0 Karma

isoutamo
SplunkTrust
SplunkTrust
How you could group those without knowing which names belongs to which group?
0 Karma

alphadog00
Splunk Employee
Splunk Employee

If you look at this data:

tom:1234

frank:1234

bob:1234

bob:5467

sally:5467

sally:9012

amy:9012

- tom, frank, bob are connected via one JSESSIONID, bob and sally by another, so sally has an indirect or transitive relationship to tom and frank. Amy is connected to Sally - so in the end, all of these above are part of one big group due some direct and indirect relationships

0 Karma

isoutamo
SplunkTrust
SplunkTrust

Hi

you could try with next 

....
| eval group = case (username == "harry" OR username == "tom", "grp1", true(), "grp2")
| stats values(*) as * by group, jsessiond
| ....

please check the syntax as I haven’t splunk in my hands to check it.

r. Ismo 

0 Karma
Get Updates on the Splunk Community!

Webinar Recap | Revolutionizing IT Operations: The Transformative Power of AI and ML ...

The Transformative Power of AI and ML in Enhancing Observability   In the realm of IT operations, the ...

.conf24 | Registration Open!

Hello, hello! I come bearing good news: Registration for .conf24 is now open!   conf is Splunk’s rad annual ...

ICYMI - Check out the latest releases of Splunk Edge Processor

Splunk is pleased to announce the latest enhancements to Splunk Edge Processor.  HEC Receiver authorization ...