Splunk Search

How to write a regex to group based on a particular field?

sp1711
Path Finder

I want to see how many times a particular URI was hit each day, grouped by a field.
Say the URI is POST {base_url}/user/{user_id}/def/{def_id}/xyz

I have done the first part, counting how many times this URI is hit daily:

index="something" sourcetype=blah OR meh "def" | stats count by uri | bucket _time span=1d | timechart count

Now I want to group this by the different user_ids.

          user_id1    user_id2    user_id3
day1      10          20          2
day2      21          22          50
day3      20          30          10

I'm looking for this kind of an output. Any ideas?

1 Solution

rsennett_splunk
Splunk Employee

Basically what you want to do is create a field that contains the userid so you can group by it...

POST\s+\/user\/(?<userid>[^\/]+)

will create the userid field for you.

If you want to grab the whole thing (and maybe create a field for def_id), just use the slashes to jump from segment to segment, so that anything can sit between them (or, if what's between them is static, use literals instead):

POST\s+\/user\/(?<userid>[^\/]+)\/[^\/]+\/(?<defid>[^\/]+)\/\S+ 

makes two fields should you need them.
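
For anyone who wants to sanity-check those two captures outside Splunk, here is a minimal Python sketch. It rests on assumptions: the sample request line is made up, and Python's re module spells named groups (?P<name>...) where Splunk's rex uses (?<name>...), so the pattern is translated accordingly. The helper name extract_ids is hypothetical.

```python
import re

# Same pattern as above, translated to Python's named-group syntax:
# Splunk writes (?<userid>...), Python's re module writes (?P<userid>...).
PATTERN = re.compile(
    r"POST\s+\/user\/(?P<userid>[^\/]+)\/[^\/]+\/(?P<defid>[^\/]+)\/\S+"
)


def extract_ids(line):
    """Return a dict with 'userid' and 'defid', or None if the line does not match."""
    match = PATTERN.search(line)
    return match.groupdict() if match else None


# Hypothetical request line, assuming the path follows POST directly.
sample = "POST /user/42/def/7/xyz"
print(extract_ids(sample))
```

If this prints None for one of your real events, the pattern is not seeing the text you expect, and the same will be true inside Splunk.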

With Splunk... the answer is always "YES!". It just might require more regex than you're prepared for!


jacobwilkins
Communicator

One thing to keep in mind is that extracting the field via a regex is a totally separate step from grouping an aggregated result.

index="something" sourcetype=blah OR meh "def"
| rex field=uri "POST\s+\/user\/(?<user_id>[^\/]+)"
| timechart span=1d count by user_id

That should do it.
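
To make the two separate steps concrete (extract first, then group), here is a rough Python sketch of what that search does. It is an illustration only, not how Splunk implements it: the events are made up, the pattern uses Python's (?P<name>...) spelling instead of Splunk's (?<name>...), and the function name daily_counts_by_user is hypothetical.

```python
import re
from collections import Counter

# Step 1 pattern: the rex extraction, in Python's named-group syntax.
USER_RE = re.compile(r"POST\s+\/user\/(?P<user_id>[^\/]+)")


def daily_counts_by_user(events):
    """Count events per (day, user_id); events are (day, request_line) pairs."""
    counts = Counter()
    for day, line in events:
        match = USER_RE.search(line)
        # Events where nothing was extracted are counted under "NULL" here,
        # an assumption meant to mirror the NULL column timechart shows
        # when the split-by field is missing from an event.
        user_id = match.group("user_id") if match else "NULL"
        # Step 2: the grouping, keyed by day and extracted user_id.
        counts[(day, user_id)] += 1
    return counts


# Hypothetical events.
events = [
    ("day1", "POST /user/a/def/1/xyz"),
    ("day1", "POST /user/a/def/2/xyz"),
    ("day1", "POST /user/b/def/1/xyz"),
    ("day2", "POST /user/b/def/1/xyz"),
    ("day2", "GET /health"),
]
for (day, user_id), count in sorted(daily_counts_by_user(events).items()):
    print(day, user_id, count)
```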

sp1711
Path Finder

When I tried rex "\s+\/user\/(\?[^\/]+)", it gave me the following error:

Error in 'rex' command: The regex '\s+\/user\/(\?[^\/]+)' does not extract anything. It should specify at least one named group. Format: (?<name>...).

jacobwilkins
Communicator

Whoops. I copy-pasted the wrong rex into my post. I just edited it, so try that.

sp1711
Path Finder

I tried that and I still get Error in 'SearchOperator:rex': Usage: regex [field=]

sp1711
Path Finder

Okay, I have the query working now, but the output I get is weird:

          NULL
day1      23
day2      10
day3      25

Why is it taking user_id as NULL?

rsennett_splunk
Splunk Employee

Try it in regex101.com to make sure you are capturing what you think you are capturing.
Also, double-check it by adding the filter uri=/user/* to the start of your search.
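
One thing worth checking, offered as a guess rather than a diagnosis: the pattern expects /user/ to come immediately after POST and whitespace. If the field being searched holds only the path, or a base URL sits between POST and /user/, nothing is captured and every event ends up under NULL. A minimal Python sketch with made-up sample values follows (Python writes named groups as (?P<name>...) where Splunk uses (?<name>...); the LOOSE variant and the capture helper are hypothetical).

```python
import re

# The pattern from the thread, in Python's named-group syntax.
STRICT = re.compile(r"POST\s+\/user\/(?P<user_id>[^\/]+)")
# Hypothetical looser variant: anchor on /user/ only, wherever it appears.
LOOSE = re.compile(r"\/user\/(?P<user_id>[^\/]+)")


def capture(pattern, text):
    """Return the captured user_id, or None when the pattern does not match."""
    match = pattern.search(text)
    return match.group("user_id") if match else None


# Made-up values a field might plausibly contain.
samples = [
    "POST /user/42/def/7/xyz",                         # method, then path
    "/user/42/def/7/xyz",                              # path only
    "POST https://example.com/api/user/42/def/7/xyz",  # base URL in between
]
for text in samples:
    print(text, "->", capture(STRICT, text), capture(LOOSE, text))
```

Only the first sample matches the strict pattern; the looser one captures 42 in all three.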

sp1711
Path Finder

How do I use this? Do I just pass the query as a regex and group by the same? Something like:

"the search" | rex field=new_raw "POST\s+\/user\/(?[^\/]+)"

Will this work?

rsennett_splunk
Splunk Employee

No... field= names the field that rex searches in, which by default is _raw.
As jacobwilkins showed you, you could, if you like, tell Splunk to look in the uri field instead.
The new field is established in the capturing group (?[^\/]+), where the field name, wrapped in angle brackets, goes right after the question mark. (The markup is on the fritz and is removing that part of the capturing group.)

After the field is named, you identify WHAT to capture, which in this case translates to "everything that is not a slash".

check it out here: https://regex101.com/r/zB0aV1/1

You can see on the right hand side, everything that the regex is doing, step by step.

Best thing for you to do, given that it seems you are quite new to Splunk, is to use the "Field Extractor" and use the regex pattern to extract the field as a search time field extraction.

You could also let Splunk do the extraction for you.

When looking at your events (enter everything up to the first pipe and run it), you might also add uri=/user/* to make things easier, just to be sure you get enough examples of what you want to pull.
Note that the first column in the events grid is a > (greater-than symbol); the column is topped with an "i". Click that.
Click "Event Actions" and then "Extract Fields".

Then you can use the field extraction wizard to let Splunk do the work. The regex Splunk comes up with may be a bit more cryptic than the one I'm using because it doesn't really have any context to work with.

rsennett_splunk
Splunk Employee

Markup is removing some characters, so click the link below to see the actual regex. The new field name comes after the question mark.
So the code should read:
open left paren
question mark
less than sign
name of field
greater than sign
left square bracket
caret
escape
forward slash
right square bracket
plus sign
right paren

https://regex101.com/r/zB0aV1/1
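
In case the spelled-out version is hard to follow, here is a small Python sketch that simply glues those pieces together in order, so you can see the resulting text. The field name user_id is only a stand-in for "name of field", and the function name build_group is hypothetical.

```python
def build_group(field_name):
    """Join the pieces listed above, in order, into one capturing group."""
    pieces = [
        "(",         # open left paren
        "?",         # question mark
        "<",         # less than sign
        field_name,  # name of field
        ">",         # greater than sign
        "[",         # left square bracket
        "^",         # caret
        "\\",        # escape (a single backslash)
        "/",         # forward slash
        "]",         # right square bracket
        "+",         # plus sign
        ")",         # right paren
    ]
    return "".join(pieces)


print(build_group("user_id"))  # (?<user_id>[^\/]+)
```

That printed text is what goes inside the quotes of the rex command, after the literal part of the pattern.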
