Splunk Search

Extracting a field at search time - rex question

jstockamp
Communicator

I've looked at the Splunk documentation but can't make sense of it; maybe it's too early in the morning. I'm having a problem extracting a field at search time.

I'm going through some web logs and I've got a field called referer. It's got values in it like

http://www.mysite12.com
http://www.1234.org
http://wkjew23.ajkda.com/abc?1234
http://1254.splunk.com/Test

What I'd like to do is create a field that is just the domain name (i.e. just mysite12.com, 1234.org, etc.). I believe the correct regex to use is "\w*(.com|.net|.org)"

How do I extract this field in my search? I've used

rex field=referer "(?<refer_domain>)\w*(.com|.net|.org)"

But that doesn't seem to work. I'm unclear where/how I specify the field name for the extraction.

1 Solution

hazekamp
Builder

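jstockamp, in general the ?<field> name goes inside the capture group, directly after the opening parenthesis, so that the group wraps the pattern you want to capture. In your search the named group is empty, and the pattern sits outside it. The regular expression below might be a bit better for you.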

Updated with proper escaping of backslashes:

rex field=referer "(https?|ftp|gopher|telnet|file|notes|ms-help):((//)|(\\\\))(?<referer_domain>.*?)([/\\\\]|$)"
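
To make the capture-group point concrete, here is a minimal sketch that repairs the original attempt by moving the pattern inside the named group. The escaped dots, the non-capturing (?:...) alternation, and the trailing \b are additions that were not in the question; the \b is there so that a host such as www.community.org is not cut short at www.com.

```spl
... | rex field=referer "(?<refer_domain>\w+\.(?:com|net|org)\b)"
```

Unlike the longer expression above, which captures the whole host name, this version keeps only the last label before .com, .net, or .org, so it should give mysite12.com, 1234.org, ajkda.com, and splunk.com for the sample values in the question.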

For testing:

index=_internal | stats count | eval count="http://www.mysite12.com/" | rename count as referer | rex field=referer "(https?|ftp|gopher|telnet|file|notes|ms-help):((//)|(\\\\))(?<referer_domain>.*?)([/\\\\]|$)"
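
On Splunk versions that provide the makeresults command, the same test can be sketched without reading from an index, and run against all four sample values from the question at once. The split and mvexpand scaffolding is an assumption added for illustration; the rex is the one above, unchanged.

```spl
| makeresults
| eval referer=split("http://www.mysite12.com http://www.1234.org http://wkjew23.ajkda.com/abc?1234 http://1254.splunk.com/Test", " ")
| mvexpand referer
| rex field=referer "(https?|ftp|gopher|telnet|file|notes|ms-help):((//)|(\\\\))(?<referer_domain>.*?)([/\\\\]|$)"
| table referer referer_domain
```

Note that this pattern captures the full host name, including any subdomain (www.mysite12.com rather than mysite12.com).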

jstockamp
Communicator

Hmmm, that errors out. Here's my complete search command:

eventtype = "evt_all" | eval refer_domain = (coalesce(sc_Referer_, referer_domain)) | rex field=refer_domain "(?:https?|ftp|gopher|telnet|file|notes|ms-help):(?:(?://)|(?:\\\\))(?<refer>.*?)(?:[/\\]|$)" | table refer_domain, refer

and the error is

Error in 'rex' command: Encountered the following error while compiling the regex '(?:https?|ftp|gopher|telnet|file|notes|ms-help):(?:(?://)|(?:\\))(?<refer>.*?)(?:[/\]|$)': Regex: missing terminating ] for character class

hazekamp
Builder

A simple issue with escaping backslashes. See the updated rex above, which now also comes with a search for testing.
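
The error message shows what happened: the quoted string loses one level of backslashes before the regex compiler sees it, so [/\\] in the search arrived as [/\]. There the backslash escapes the closing bracket, and the character class never terminates. A sketch of the earlier search with only that character class changed to use four backslashes (the eventtype and field names are taken from the post above):

```spl
eventtype="evt_all"
| eval refer_domain=coalesce(sc_Referer_, referer_domain)
| rex field=refer_domain "(?:https?|ftp|gopher|telnet|file|notes|ms-help):(?:(?://)|(?:\\\\))(?<refer>.*?)(?:[/\\\\]|$)"
| table refer_domain, refer
```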

jstockamp
Communicator

Thanks, after the edit this works great.
