How to use regex to extract from _raw and return in table format?

DLT76
Path Finder

I have logs with data in two fields: _raw and _time. I want to search the _raw field for an IP in a specific pattern and return the URL that follows the IP. I'd like to see it in a table, in one column named "url", and also show the date/time in a second column using the contents of the _time field.

Here's an example of the data in _raw:

 

  [1.2.3.4 lookup] : http://www.dummy-url.com/ -- 

 

I'd like to use a query like the following which will look for a specified IP and return the URL that follows after the colon:

 

rex field=_raw "1.2.3.4 lookup\] \: (?<url>[\w\:\/\.\-]+)"

 

The datasource looks like this:

 

sourcetype="datasource.out"

 

Can you help me with a query that searches for the IP and returns the URL (from _raw) and date/time (from _time) in table format?

Thanks!


richgalloway
SplunkTrust

You appear to have everything you need except for the table command.  What do you get with this query?

index=foo sourcetype="datasource.out"
| rex field=_raw "1.2.3.4 lookup\] \: (?<url>[\w\:\/\.\-]+)"
| table _time url

 

---
If this reply helps you, Karma would be appreciated.

DLT76
Path Finder

It does return a table with the date/time in one column, but the url column is blank.  It appears to be returning a row for every row during the date range.  I know I have rows with the IP in the _raw field because I get back rows when I search my source for just the IP in quotes.  And the regex looks good.  From regex101:

Capture.PNG

Ideas?


DLT76
Path Finder

Update:  It does appear to return every row from the raw field (or at least many more than have the specific IP), but when I sorted on the empty url column, I found that there are some rows with data, but they're not all URLs.


DLT76
Path Finder

Update #2:

So when I add a field for the ip address and display it in the table and sort on that column, I find matching results (yay!), but I'm also getting tons of records that don't match.  Here's the new query:

 

sourcetype="datasource.out" | rex field=_raw "(?<ipaddress>1.2.3.4) lookup\] \: (?<url>[\w\:\/\.\-]+)" | table _time url ipaddress

 

Is there a way to update the query to exclude non-matches from the table?

Capture2.PNG


DLT76
Path Finder

Update #3 (and solution):

I think I figured it out.  I added this to the end of the query:

 | where ipaddress != ""

And now my table shows only those rows where the IP address matches.

Thank you for the help!
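
Some background on why the extra filter is needed: rex only adds the named capture groups as fields on events where the pattern matches. It does not drop events where the pattern fails, so every event returned by the base search still reaches the table command, with url and ipaddress left empty. A minimal sketch of another way to keep only the matching rows, assuming the same sourcetype and regex as in Update #2. The quoted IP in the base search and the isnotnull() check are additions for illustration, not part of the original query:

```
sourcetype="datasource.out" "1.2.3.4"
| rex field=_raw "(?<ipaddress>1.2.3.4) lookup\] \: (?<url>[\w\:\/\.\-]+)"
| where isnotnull(url)
| table _time url ipaddress
```

Putting the quoted IP in the base search limits the events retrieved before rex runs, which is usually cheaper than filtering afterwards. The where isnotnull(url) line then removes any remaining event where the extraction did not produce a URL.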


ITWhisperer
SplunkTrust

Will this work?

sourcetype="datasource.out"
| rex field=_raw "1.2.3.4 lookup\] \: (?<url>[\w\:\/\.\-]+)"
| table url _time

DLT76
Path Finder

Partly works--see above reply.  Thanks for your help.


ITWhisperer
SplunkTrust

You could make the match more specific:

sourcetype="datasource.out"
| rex field=_raw "1.2.3.4 lookup\] \: (?<url>http[\w\:\/\.\-]+)"
| table url _time
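
One further tightening, offered as a sketch rather than a tested fix: in a regular expression an unescaped . matches any character, so the pattern 1.2.3.4 also matches text such as 192.3.4. Escaping the dots and anchoring on the opening bracket shown in the sample event restricts the match to that exact address. The https?://\S+ part is an assumption that URLs in this data start with http or https and contain no whitespace:

```
sourcetype="datasource.out"
| rex field=_raw "\[1\.2\.3\.4 lookup\] : (?<url>https?://\S+)"
| table url _time
```

Like the queries above, this only changes which events get a url value. Events that do not match are still returned with an empty url unless they are filtered out separately.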

DLT76
Path Finder

Good idea--I thought of that too, but the table still returns gazillions of records that don't match, and the url and ipaddress fields are blank.  I'd like to see in the table only records that have a matching IP (see reply above).  Thanks again!


DLT76
Path Finder

I think I figured it out.  See Update #3 above.  I appreciate the assist!
