Splunk Search

Extracting a field at search time - rex question

Communicator

I've looked at the Splunk documentation but can't make sense of it; maybe it's too early in the morning. I'm having a problem extracting a field at search time.

I'm going through some web logs and I've got a field called referer. It has values like

http://www.mysite12.com
http://www.1234.org
http://wkjew23.ajkda.com/abc?1234
http://1254.splunk.com/Test

What I'd like to do is create a field that is just the domain name (i.e. just mysite12.com, 1234.org, etc.). I believe the correct regex to use is "\w*(.com|.net|.org)"

How do I extract this field in my search? I've used

rex field=referer "(?<refer_domain>)\w*(.com|.net|.org)"

But that doesn't seem to work. I'm unclear where/how I specify the field name for the extraction.

1 Solution

Builder

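jstockamp, in general the ?<field> name goes inside the capture group, right after the opening parenthesis, with the pattern to capture following it: (?<field>pattern). In your rex the named group is empty, (?<refer_domain>), and the pattern sits outside it, so nothing is captured into the field. The regular expression below might be a bit better for you.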

Updated: proper escaping of backslashes:

rex field=referer "(https?|ftp|gopher|telnet|file|notes|ms-help):((//)|(\\\\))(?<referer_domain>.*?)([/\\\\]|$)"

For testing:

index=_internal | stats count | eval count="http://www.mysite12.com/" | rename count as referer | rex field=referer "(https?|ftp|gopher|telnet|file|notes|ms-help):((//)|(\\\\))(?<referer_domain>.*?)([/\\\\]|$)"
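To see why the placement of the name matters, here is a minimal sketch that checks the idea outside Splunk, in Python, against the sample values from the question. It makes some assumptions: Python's re module spells named groups (?P<name>...) rather than the (?<name>...) form that rex takes, and NAMED_GROUP is a hypothetical alternative pattern for getting just the bare domain, not the regex posted above. The helper name extract_domain is also made up for illustration.

```python
import re

# Sample referer values copied from the question.
REFERERS = [
    "http://www.mysite12.com",
    "http://www.1234.org",
    "http://wkjew23.ajkda.com/abc?1234",
    "http://1254.splunk.com/Test",
]

# The original attempt: the name sits on an empty group and the pattern is
# outside it (named group respelled for Python).
EMPTY_GROUP = re.compile(r"(?P<refer_domain>)\w*(.com|.net|.org)")

# Hypothetical alternative: the pattern is inside the named group, the dot is
# escaped, and the match must be followed by "/", ":", "?" or end of string.
NAMED_GROUP = re.compile(r"(?P<refer_domain>[^./]+\.(?:com|net|org))(?=[/:?]|$)")


def extract_domain(pattern, referer):
    """Return the refer_domain capture, or None when the pattern does not match."""
    match = pattern.search(referer)
    return match.group("refer_domain") if match else None


if __name__ == "__main__":
    for referer in REFERERS:
        print(referer,
              repr(extract_domain(EMPTY_GROUP, referer)),
              repr(extract_domain(NAMED_GROUP, referer)))
```

With the empty group the regex still matches, but the named group only ever holds an empty string. With the pattern inside the group, the capture is the bare domain (mysite12.com, 1234.org, ajkda.com, splunk.com). The same pattern should carry over to rex by dropping the P from the group name, though that is untested here.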


Communicator

Hmmm, that errors out. Here's my complete search command:

eventtype = "evt_all" | eval refer_domain = (coalesce(sc_Referer_, referer_domain)) | rex field=refer_domain "(?:https?|ftp|gopher|telnet|file|notes|ms-help):(?:(?://)|(?:\\\\))(?<refer>.*?)(?:[/\\]|$)" | table refer_domain, refer

and the error is

Error in 'rex' command: Encountered the following error while compiling the regex '(?:https?|ftp|gopher|telnet|file|notes|ms-help):(?:(?://)|(?:\\))(?<refer>.*?)(?:[/\]|$)': Regex: missing terminating ] for character class

Builder

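It's a simple issue with escaping backslashes. In the quoted rex string a literal backslash has to be written as \\\\; your last character class has only \\, so the regex compiler sees [/\] (as the error message shows) and reads \] as an escaped bracket, which leaves the class unterminated. See the updated rex above, which also comes with a search to test it.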
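The escaping problem can be reproduced outside Splunk. The sketch below is in Python and rests on a few assumptions: the named group is respelled (?P<refer>...) because Python's re does not accept (?<refer>...), and halve_backslashes is a made-up helper that only models what the error message suggests, namely that the quoted search string loses one level of backslash escaping before the regex is compiled. It is not Splunk's actual parser.

```python
import re

# The regex as typed in the search string (named group respelled for Python).
SEARCH_STRING = r"(?:https?|ftp|gopher|telnet|file|notes|ms-help):(?:(?://)|(?:\\\\))(?P<refer>.*?)(?:[/\\]|$)"

# The regex text as quoted in the error message (same respelling).
AS_COMPILED = r"(?:https?|ftp|gopher|telnet|file|notes|ms-help):(?:(?://)|(?:\\))(?P<refer>.*?)(?:[/\]|$)"

# Same as AS_COMPILED, but with the backslash in the last character class escaped.
FIXED = r"(?:https?|ftp|gopher|telnet|file|notes|ms-help):(?:(?://)|(?:\\))(?P<refer>.*?)(?:[/\\]|$)"


def halve_backslashes(text):
    """Illustrative model only: collapse each pair of backslashes into one."""
    return text.replace("\\\\", "\\")


def compiles(pattern):
    """Return True if the regex engine accepts the pattern, False otherwise."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False


if __name__ == "__main__":
    print(halve_backslashes(SEARCH_STRING) == AS_COMPILED)
    print(compiles(AS_COMPILED), compiles(FIXED))
    print(re.search(FIXED, "http://www.mysite12.com/").group("refer"))
```

After one level of unescaping, [/\\] becomes [/\], where \] is an escaped bracket and the class never closes. Writing four backslashes in the search string leaves two for the regex compiler, which is a properly escaped literal backslash.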


Communicator

Thanks, after the edit this works great.
