Splunk Search

Regroup similar URLs without using tons of regex

BenSI
New Member

Hi, is there a way to regroup similar values without defining tons of regexes?

Let's say I run a search that returns URLs. Those URLs contain parameters in the path.

 

/api/12345/info

/api/1234/info

/api/info/124/service

/api/info/123/service

I know we can all see a pattern there that could fit a regex ;) But remember, I don't want to use one.

I live in the hope that there is some magic that can regroup URLs that are similar.

Something like:

 

/api//info

/api/info//service

yuanliu
SplunkTrust

You did a great job telling us what you do not want, but forgot to tell us what you do want.  What grouping rule do you expect?  Even though you listed /api//info and /api/info//service, "regroup url that are similar" does not make this clear.  An illustrative results table would be very helpful.

Here, I interpret your intention as putting /api/12345/info and /api/1234/info in one group, and /api/info/124/service and /api/info/123/service in another.  Is this what you wanted?  If so, how would you, without Splunk, decide that those belong together in the most general way, without looking at specific string patterns (aka regex)?  Without a rule, there is no meaningful answer to your question.  And what is your intended result format?

It is never a good idea to make volunteers read your mind, because even if they enjoy reading other people's minds (I don't), mind readers will be wrong more often than not.

That said, I am willing to give mind reading one try.  I interpret your rule as: blank out the second-to-last URI path segment, then group by what remains.  Is this correct?  I'll also pick an arbitrary result format that feels most natural in Splunk.  You can do

 

| eval uril = split(uri, "/")
| eval group = mvjoin(mvappend(mvindex(uril, 0, -3), "", mvindex(uril, -1)), "/")
| stats values(uri) as uri by group

 

Here, the field uri is your input, which you call a URL (it is not one; these are URI paths).  Below is an emulation using your illustrated data.  Run it ahead of the eval commands above, play with it, and compare with real data.

 

| makeresults format=csv data="uri
/api/12345/info
/api/1234/info
/api/info/124/service
/api/info/123/service"
``` data emulation above ```

 

Output using this emulation is

group               uri
/api//info          /api/1234/info
                    /api/12345/info
/api/info//service  /api/info/123/service
                    /api/info/124/service
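
If you want to check the rule outside Splunk before committing to it, the same logic can be sketched in a few lines of Python. This is only an illustration of my interpretation of your rule, not anything Splunk runs: the names group_key and group_uris are made up for this example, and the guard for very short paths is an assumption, since your sample data does not cover that case.

```python
from collections import defaultdict


def group_key(uri: str) -> str:
    """Blank out the second-to-last path segment (hypothetical helper)."""
    segments = uri.split("/")  # "/api/1234/info" -> ["", "api", "1234", "info"]
    if len(segments) < 2:
        # Assumption: a value with no "/" at all is left unchanged.
        return uri
    segments[-2] = ""  # drop the variable part, keep its position
    return "/".join(segments)


def group_uris(uris):
    """Collect the distinct URIs under each group key, sorted for display."""
    groups = defaultdict(set)
    for uri in uris:
        groups[group_key(uri)].add(uri)
    return {key: sorted(members) for key, members in groups.items()}


sample = [
    "/api/12345/info",
    "/api/1234/info",
    "/api/info/124/service",
    "/api/info/123/service",
]
result = group_uris(sample)
for key, members in result.items():
    print(key, members)
```

With the four sample paths this yields the same two groups as the search output above. Note that it relies purely on segment position, so it will also merge paths whose second-to-last segment is not a parameter.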

Hope this helps
