Splunk Search

How can I filter partial duplicates?

Yukie
Observer

Hello,

I'm new to Splunk (internship) and couldn't find an answer.

I need a way to filter my search.

I'm currently using a ".... | ... | stats count by RequestPath" search.

The problem is that "RequestPath" can contain variable/random data at the end.

Example:
x/y/first

x/y/second/randomText

x/y/second/randomText

x/y/third

There are millions of results, and I would like to filter them so I only keep:

x/y/first

x/y/second

x/y/third

Thanks 🙂


gcusello
SplunkTrust

Hi @Yukie,

you need to extract part of RequestPath with the rex command and then use the extracted field in the stats command, something like this:

<your_search>
| rex field=RequestPath "\w+\/\w+\/(?<SubPath>[^\/\n]+)"
| stats count BY SubPath

Ciao.

Giuseppe


Yukie
Observer

Hi @gcusello,
Thanks for the fast answer.

It definitely helped, but I realised it's a bit more complicated than what I described.

Your suggestion gives this output:

first
second
third

whereas I'd like to have the full path up to that point:
x/y/first

x/y/second

x/y/third


because there might be, for example:
x/y/first

x/z/second/random.pdf

x/z/second/random.pdf

x/y/third

I'm not an expert in regex, and even less in Splunk regex syntax. Sorry if this sounds like something so simple that I should have found it myself.


gcusello
SplunkTrust

Hi @Yukie,

no problem, please try this regex:

^(?<SubPath>[^\/]+\/[^\/]+\/[^\/\n]+)

which you can test at https://regex101.com/r/0hzRax/1
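
For reference, here is a minimal sketch of how this regex might be plugged into the earlier search, run against the sample paths from the question. The makeresults, eval and mvexpand lines are only an assumption used to generate test data so the search can be tried on its own; in a real search they would be replaced by <your_search>.

```
| makeresults
| eval RequestPath=split("x/y/first,x/z/second/random.pdf,x/z/second/random.pdf,x/y/third", ",")
| mvexpand RequestPath
| rex field=RequestPath "^(?<SubPath>[^\/]+\/[^\/]+\/[^\/\n]+)"
| stats count BY SubPath
```

With these four sample paths, the two x/z/second/random.pdf events should collapse into a single x/z/second row with a count of 2, next to one row each for x/y/first and x/y/third. The regex is anchored at the start of the field and keeps exactly the first three path segments, so anything after the third segment is ignored.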

Ciao.

Giuseppe
