I have basic web logs containing username and jsessionid fields, thousands of events in all (assume a single index with one set of data).
I want to group by jsessionid and username, creating supergroups. Example:
username:jsessionid
tom:1234
frank:1234
bob:1234
bob:5467
sally:5467
sally:9012
amy:9012
harry:4709
tony:4709
I would wind up with 2 groups: a small group with just harry and tony, and a larger group with tom, frank, bob, sally, and amy, linked through shared jsessionids.
I would like my output to contain some kind of group ID or group name. I have no advance knowledge of the usernames or jsessionids. I just want to be able to loop through the data and assign users/jsessionids to groups where they exist.
My first thought is to sort by jsessionid, but I can't figure out how to loop through the data and create dynamic group names.
Thanks for any ideas. I am not an SPL expert.
This sort of works. The catch is knowing how many iterations are needed to resolve all the indirections:
| makeresults | eval _raw="tom:1234
frank:1234
bob:1234
bob:5467
sally:5467
sally:9012
amy:9012
harry:4709
tony:4709"
| multikv noheader=t
| rex "(?<username>[^\:]+)\:(?<jsessionid>.+)"
| fields - _raw _time
| fields username jsessionid
| eventstats values(jsessionid) as jsessionids by username
| eventstats values(username) as usernames by jsessionid
| eventstats values(jsessionids) as jsessionidss by usernames
| eventstats values(usernames) as usernamess by jsessionids
| eventstats values(jsessionidss) as jsessionidsss by usernamess
| eventstats values(usernamess) as usernamesss by jsessionidss
| eventstats values(jsessionidsss) as jsessionidssss by usernamesss
| eventstats values(usernamesss) as usernamessss by jsessionidsss
| eventstats values(jsessionidssss) as jsessionidsssss by usernamessss
| eventstats values(usernamessss) as usernamesssss by jsessionidssss
| eval supergroup=mvjoin(usernamesssss,",")
| dedup supergroup
| table supergroup
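The number of passes needed depends on how long the longest chain of shared values is, so it cannot be fixed in advance. As a rough illustration outside SPL, the Python sketch below repeats a similar alternating propagation (by username, then by jsessionid) until a pass changes nothing, and reports how many passes that took. The function name `propagate_labels`, the `PAIRS` sample list, and the choice of the smallest jsessionid as the group label are assumptions made for this example; they are not part of the search above.

```python
# Sample data from the question, as (username, jsessionid) pairs.
PAIRS = [
    ("tom", "1234"), ("frank", "1234"), ("bob", "1234"),
    ("bob", "5467"), ("sally", "5467"), ("sally", "9012"),
    ("amy", "9012"), ("harry", "4709"), ("tony", "4709"),
]


def propagate_labels(pairs):
    """Label each (username, jsessionid) pair with a group ID.

    Every pair starts labelled with its own jsessionid. Each pass lowers a
    pair's label to the smallest label among pairs sharing its username,
    then among pairs sharing its jsessionid. The loop stops after the first
    pass in which no label changes.
    """
    labels = [jsessionid for _, jsessionid in pairs]
    passes = 0
    while True:
        passes += 1
        changed = False
        for key_index in (0, 1):  # 0 = by username, 1 = by jsessionid
            # Smallest current label seen for each key value.
            best = {}
            for pair, label in zip(pairs, labels):
                key = pair[key_index]
                if key not in best or label < best[key]:
                    best[key] = label
            # Push that smallest label back onto every pair with the key.
            for i, pair in enumerate(pairs):
                new_label = best[pair[key_index]]
                if new_label != labels[i]:
                    labels[i] = new_label
                    changed = True
        if not changed:
            return labels, passes


labels, passes = propagate_labels(PAIRS)
for (username, jsessionid), label in zip(PAIRS, labels):
    print(username, jsessionid, "group", label)
print("passes needed:", passes)
```

Labels are compared as strings here, which is fine for equal-length IDs like the sample; real jsessionids only need to be comparable, since the label is just an arbitrary group ID.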
Thanks. The data is actually a little cleaner: I wrote username:sessionid for this post, but they are really already separate fields in the index. I will try what you posted. I had looked at eventstats, but had not taken it to this level.
To clarify, I could have thousands of jsessionids. Since tom shares different jsessionids with several different people, and they in turn share jsessionids with others, all of them would form one big group.
Thanks, but I have no advance knowledge of the usernames or group names. I need to create the groups dynamically, there could be "n" of them, and the case statement would have to be built from unknown usernames. That is part of my sticking point.
I also have "X" usernames and "Y" jsessionids.
If you look at this data:
tom:1234
frank:1234
bob:1234
bob:5467
sally:5467
sally:9012
amy:9012
- tom, frank, and bob are connected via one JSESSIONID, and bob and sally by another, so sally has an indirect (transitive) relationship to tom and frank. amy is connected to sally, so in the end all of the users above are part of one big group through a mix of direct and indirect relationships.
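In graph terms, this transitive grouping amounts to finding connected components, with usernames and jsessionids as nodes and each log event as an edge. Below is a minimal Python sketch using a union-find structure, assuming the username/jsessionid pairs have been exported from Splunk. The names `supergroups`, `find`, and `union`, and the `group1`, `group2`, ... naming scheme, are hypothetical and chosen only for this illustration.

```python
from collections import defaultdict


def supergroups(pairs):
    """Group usernames that are linked, directly or indirectly, by jsessionids."""
    parent = {}

    def find(node):
        # Return the representative of node's set, registering new nodes.
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]  # path halving
            node = parent[node]
        return node

    def union(a, b):
        # Merge the sets containing a and b.
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    # Tag each node with its kind so a username can never collide
    # with a jsessionid that happens to have the same text.
    for username, jsessionid in pairs:
        union(("user", username), ("session", jsessionid))

    # Collect usernames under the representative of their set.
    members = defaultdict(set)
    for username, _ in pairs:
        members[find(("user", username))].add(username)

    # Name the groups in a stable order (alphabetical by member list).
    ordered = sorted(sorted(names) for names in members.values())
    return {f"group{i}": names for i, names in enumerate(ordered, start=1)}


pairs = [
    ("tom", "1234"), ("frank", "1234"), ("bob", "1234"),
    ("bob", "5467"), ("sally", "5467"), ("sally", "9012"),
    ("amy", "9012"),
]
print(supergroups(pairs))
```

For the seven pairs above, this yields a single group containing amy, bob, frank, sally, and tom. No usernames or jsessionids need to be known in advance, and no iteration count has to be guessed.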
Hi,
you could try something like this:
....
| eval group = case(username == "harry" OR username == "tony", "grp1", true(), "grp2")
| stats values(*) as * by group, jsessionid
| ....
Please check the syntax, as I don't have Splunk at hand to test it.
r. Ismo