Splunk Search

Basic Regex Field Extraction

JPrictoe
Loves-to-Learn

I want to extract from "Mozilla" to the closing quote, pulling everything up to and including 27.0". Why does my regex (\s.+") go all the way to the final quote on the other side of the word Analytics? I know the regex is poor; I'm just trying to get the concept.

 "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:27.0) Gecko/20100101 Firefox/27.0" OBSERVED "Web Ads/Analytics"

paulbannister
Communicator
"Mozilla\/(?P<FIELD_NAME>[^"]+)

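As the other answers explain, your regex is greedy. The regex in this reply also names the new field for you: just replace "FIELD_NAME" with the name you want. Note that the capture group starts after "Mozilla/", so the extracted value begins at 5.0 and ends just before the closing quote.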

As a side note, https://regex101.com/ is a fantastic place to experiment with and hone your regex skills.
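
The named-group extraction in this reply can be tried outside Splunk. The sketch below uses Python's `re` module, which accepts the same `(?P<name>...)` syntax; the field name `user_agent` is a hypothetical stand-in for FIELD_NAME, and the event string is the sample line from the question with the leading space removed. Splunk's regex engine may differ from Python's in other details, so treat this as an illustration only.

```python
import re

# Sample event from the question (leading whitespace removed).
event = ('"Mozilla/5.0 (Windows NT 6.1; WOW64; rv:27.0) '
         'Gecko/20100101 Firefox/27.0" OBSERVED "Web Ads/Analytics"')

# Pattern from the reply; "user_agent" is a hypothetical field name.
after_prefix = re.compile(r'"Mozilla\/(?P<user_agent>[^"]+)')

# Variant (an assumption, not from the reply): move "Mozilla/" inside
# the group so the extracted value starts at the word Mozilla.
with_prefix = re.compile(r'"(?P<user_agent>Mozilla\/[^"]+)')

value_after = after_prefix.search(event).group("user_agent")
value_with = with_prefix.search(event).group("user_agent")

print(value_after)  # starts at 5.0
print(value_with)   # starts at Mozilla/5.0
```

Both variants stop just before the closing quote, because `[^"]+` cannot match a double quote.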


elliotproebstel
Champion

Your regex captures more than you intend because regex quantifiers are greedy by default: .+ grabs as much as it can, so (\s.+") matches up to the last double quote on the line. Here's a revised regex that should work for you:

^\"[^"]+\"

This looks for a double quote at the start of the line, collects every character that is not a double quote, and stops at the next double quote. Because [^"] cannot match a quote, the greedy + has nowhere further to go.
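
To see the difference, here is a minimal sketch in Python's `re` module that runs the original pattern and the revised one against the sample line from the question (leading space removed so that `^` can match). The lazy-quantifier variant at the end is an extra alternative that does not come from this thread.

```python
import re

# Sample event from the question (leading whitespace removed).
event = ('"Mozilla/5.0 (Windows NT 6.1; WOW64; rv:27.0) '
         'Gecko/20100101 Firefox/27.0" OBSERVED "Web Ads/Analytics"')

# Original pattern: .+ is greedy, so it runs to the end of the line
# and backtracks only as far as the LAST double quote.
greedy = re.search(r'\s.+"', event).group(0)

# Revised pattern: [^"]+ cannot cross a double quote, so the match
# ends at the first closing quote.
bounded = re.search(r'^\"[^"]+\"', event).group(0)

# Alternative (assumption, not from the thread): a lazy quantifier
# stops at the first quote it can reach.
lazy = re.search(r'"Mozilla.+?"', event).group(0)

print(greedy)
print(bounded)
print(lazy)
```

The greedy match ends at `Analytics"`, while the other two end at `Firefox/27.0"`.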

cpetterborg
SplunkTrust

The .+ in your regex is going to run all the way to the end of the line, then back up only as far as the last double quote. This should work for the regex:

Mozilla[^)]*\)

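It stops at the first closing paren, which in your sample comes right after rv:27.0, and it includes that paren in the match, so you can decide if you want to keep it. If you need everything through Firefox/27.0, match up to the closing double quote instead.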
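
As a quick check of where this pattern stops, the sketch below runs it in Python's `re` module against the sample line from the question. The second pattern, which runs up to the closing quote, is an assumed variant for comparison and is not part of this reply.

```python
import re

# Sample event from the question (leading whitespace removed).
event = ('"Mozilla/5.0 (Windows NT 6.1; WOW64; rv:27.0) '
         'Gecko/20100101 Firefox/27.0" OBSERVED "Web Ads/Analytics"')

# Pattern from the reply: everything from Mozilla through the first ")".
to_paren = re.search(r'Mozilla[^)]*\)', event).group(0)

# Assumed variant: everything from Mozilla up to (not including)
# the next double quote.
to_quote = re.search(r'Mozilla[^"]*', event).group(0)

print(to_paren)  # Mozilla/5.0 (Windows NT 6.1; WOW64; rv:27.0)
print(to_quote)  # Mozilla/5.0 (Windows NT 6.1; WOW64; rv:27.0) Gecko/20100101 Firefox/27.0
```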
