Search extract grabbing more than tested/verified.

Communicator

I'm extracting part of one line from a multi-line event. When I test the extraction, everything returns as it should. However, when I run a search and view the extracted fields, the rest of the multi-line event shows up in the field. Any insights on this?

Here is the event:

Date: 2011-01-05 13:48:49
Request made by: wwwrun   /opt/apache2/bin/httpd -k start
Actual request: db_auth  /usr/bin/perl /home/db_auth/db_auth dashboard host
==============================================================

This is the regex used for extraction:

(?i)Request\s+made\s+by:\s+\w+\s+(?P<dbauth_request>.+) 

In testing it extracts correctly:

dbauth_request="/opt/apache2/bin/httpd -k start"

In searching, however, this is being extracted:

    dbauth_request="/opt/apache2/bin/httpd -k start
    Actual request: db_auth  /usr/bin/perl /home/db_auth/db_auth dashboard host
    =============================================================="

Re: Search extract grabbing more than tested/verified.

Super Champion

I assume you are comparing interactive field extraction using "|rex" with a permanent field extraction set up in transforms.conf or props.conf. Is that correct? Your exact situation wasn't clearly spelled out.

Try one of these:

(?i)^Request\s+made\s+by:\s+\w+\s+(?P<dbauth_request>.+)$
(?im)[\r\n]Request\s+made\s+by:\s+\w+\s+(?P<dbauth_request>.+?)[\r\n]

If I wasn't feeling lazy I'd stick it in my regex tool, but that takes all the fun out of it. Best of luck.


Re: Search extract grabbing more than tested/verified.

Communicator

These are search-time extractions. I'm just using the 'extract field' tool from my search results. Within it there is a 'test' button that applies the regex.

Of your two suggestions I tried the former yesterday with the same results. I then tried a modified version of the latter today and it is extracting correctly. Thanks!

(?im)Request\s+made\s+by:\s+\w+\s+(?P<dbauth_request>.+?)[\r\n]
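
For anyone who wants to see why the lazy version behaves, here is a rough sketch in Python's re module (not Splunk's PCRE, so treat it as an illustration only). It assumes the over-capture happens because the dot is allowed to match line breaks at search time, which is simulated here with re.DOTALL; the thread does not confirm that this is the actual cause in Splunk. The helper name extract is made up for the example.

```python
import re

# The sample event from the original post, assuming "\n" line endings.
EVENT = (
    "Date: 2011-01-05 13:48:49\n"
    "Request made by: wwwrun   /opt/apache2/bin/httpd -k start\n"
    "Actual request: db_auth  /usr/bin/perl /home/db_auth/db_auth dashboard host\n"
    "==============================================================\n"
)

# Original greedy pattern and the lazy pattern that ended up working.
GREEDY = r"(?i)Request\s+made\s+by:\s+\w+\s+(?P<dbauth_request>.+)"
LAZY = r"(?im)Request\s+made\s+by:\s+\w+\s+(?P<dbauth_request>.+?)[\r\n]"


def extract(pattern, text, flags=0):
    """Hypothetical helper: return the dbauth_request group, or None."""
    match = re.search(pattern, text, flags)
    return match.group("dbauth_request") if match else None


# Dot stops at "\n" by default, so the greedy pattern looks fine here.
greedy_default = extract(GREEDY, EVENT)

# If the dot may match line breaks (simulated with DOTALL), the greedy
# pattern swallows the rest of the event.
greedy_dotall = extract(GREEDY, EVENT, re.DOTALL)

# The lazy pattern stops at the first line break either way.
lazy_dotall = extract(LAZY, EVENT, re.DOTALL)

print(repr(greedy_default))
print(repr(greedy_dotall))
print(repr(lazy_dotall))
```

The point is that `.+?[\r\n]` does not depend on how the engine treats the dot: it ends the capture at the first carriage return or line feed explicitly.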


Re: Search extract grabbing more than tested/verified.

Super Champion

Yeah. I do see that I made a typo on the second one. You don't need the [\r\n] before Request, but it may be helpful to tell the regex engine to always expect the word Request to be the start of a line. Glad you have a working solution.


Re: Search extract grabbing more than tested/verified.

Super Champion

BTW, I think the interactive field extraction (IFX) tool uses a Python search command (and therefore the Python regex engine), whereas Splunk uses PCRE for built-in field extractions (as well as the rex search command), so it is conceivable to get a subtle regex flavor difference like this when using IFX (although they normally don't seem subtle when you're looking at them). It's also possible something else was going on.
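
One concrete way two engines or configurations can disagree is in what counts as a line break for the dot. As a hedged illustration in Python only: Python's dot excludes just "\n", so if an event used bare "\r" line endings (an assumption made for this sketch, not something the thread establishes), the greedy pattern would run past the end of the line while the `[\r\n]` version would not. PCRE's treatment of the dot depends on its configured newline convention, so the result there may differ.

```python
import re

# Same event as in the original post, but with bare "\r" line endings.
# This line-ending choice is an assumption made purely for illustration.
EVENT_CR = (
    "Date: 2011-01-05 13:48:49\r"
    "Request made by: wwwrun   /opt/apache2/bin/httpd -k start\r"
    "Actual request: db_auth  /usr/bin/perl /home/db_auth/db_auth dashboard host\r"
    "==============================================================\r"
)

GREEDY = r"(?i)Request\s+made\s+by:\s+\w+\s+(?P<dbauth_request>.+)"
LAZY = r"(?im)Request\s+made\s+by:\s+\w+\s+(?P<dbauth_request>.+?)[\r\n]"

# In Python, "." matches "\r", so the greedy capture keeps going.
greedy_cr = re.search(GREEDY, EVENT_CR).group("dbauth_request")

# The explicit [\r\n] terminator ends the capture at the first "\r".
lazy_cr = re.search(LAZY, EVENT_CR).group("dbauth_request")

print(repr(greedy_cr))
print(repr(lazy_cr))
```

Ending the capture on an explicit character class avoids relying on either engine's default notion of a newline.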