topic Re: regex help in Splunk Search

regex help

pc1234 — Tue, 02 Mar 2021 19:38:47 GMT

Can someone assist with extracting fields from the string below?

The first line is header info: date, protocol, response_status, response_type.

Each following line (one to many) is a website and an error code.

I can't figure out a regex to capture the header line AND the successive lines of websites and error codes.

02-Mar-2021 UDP Response Found Response Type: ABC
www.site1.com 404
www.site10.com 100
www.site4.com 400
.....

Thanks in Advance.

Re: regex help

gcusello — Tue, 02 Mar 2021 20:20:49 GMT

Hi @pc1234,

let me make sure I understand:

do you have a log or a CSV file?
Also, I don't understand the structure of the file:
- do you have a header containing the info, with each row as a separate event?
- or is the event the full file (header + rows)?
Finally, do you want only one regex, or is it acceptable to use two regexes?

Ciao.

Giuseppe

Re: regex help

pc1234 — Tue, 02 Mar 2021 20:36:30 GMT

I'm reading a log file. This is a single event:

02-Mar-2021 UDP Response Status:Found Response Type:ABC www.site1.com 404 www.site10.com 100 www.site4.com 400

I'd like to create a regex/field extraction that captures all the fields below. The website field would be multivalue, since there are multiple occurrences (one to many).

fields and values
date: 02-Mar-2021
protocol: UDP
Response status: Found
Response Type: ABC
website: www.site1.com
status: 404
website: www.site10.com
status: 100
website: www.site4.com
status: 400

Re: regex help

gcusello — Wed, 03 Mar 2021 07:30:34 GMT

Hi @pc1234,

you have to use two regexes:

the first to extract the header:

| rex "^(?<date>[^ ]+)\s+(?<protocol>\w+)\sResponse\s+(?:Status:\s*)?(?<response_status>\w+)\s+Response\s+Type:\s*(?<response_type>\w+)"

that you can test at https://regex101.com/r/wP3LyX/1

the second to extract the sites (max_match=0 keeps every match, so site and response_code become multivalue fields):

| rex max_match=0 "(?<site>www\.[^ ]+)\s+(?<response_code>\d+)"

that you can test at https://regex101.com/r/UCwx2h/1
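
If you want to check the two patterns outside Splunk, here is a minimal Python sketch run against the single event posted above. It is only an offline sanity check, not a Splunk configuration. Some assumptions: Python writes named groups as (?P<name>...) where rex uses (?<name>...); the header pattern here is a loosened variant that treats the "Status:" label and the space after each colon as optional; and parse_event is a hypothetical helper name. re.findall plays the role of rex max_match=0 by returning every site/code pair instead of only the first.

```python
import re

# Sample event copied from the thread (one log event on a single line).
EVENT = (
    "02-Mar-2021 UDP Response Status:Found Response Type:ABC "
    "www.site1.com 404 www.site10.com 100 www.site4.com 400"
)

# Header pattern. Assumption: "Status:" may or may not be present, and the
# whitespace after "Status:" and "Type:" is optional.
HEADER_RE = re.compile(
    r"^(?P<date>\S+)\s+(?P<protocol>\w+)\s+Response\s+(?:Status:\s*)?"
    r"(?P<response_status>\w+)\s+Response\s+Type:\s*(?P<response_type>\w+)"
)

# Repeating pattern: one website followed by its numeric code.
SITE_RE = re.compile(r"(?P<site>www\.\S+)\s+(?P<response_code>\d+)")


def parse_event(event):
    """Return the header fields plus parallel lists of sites and codes."""
    header = HEADER_RE.search(event)
    if header is None:
        raise ValueError("header pattern did not match")
    fields = header.groupdict()
    # findall returns every (site, code) pair, like rex with max_match=0.
    pairs = SITE_RE.findall(event)
    fields["site"] = [site for site, _ in pairs]
    fields["response_code"] = [code for _, code in pairs]
    return fields


if __name__ == "__main__":
    for name, value in parse_event(EVENT).items():
        print(name, value)
```

The site and response_code lists stay in the same order, so the n-th code belongs to the n-th site, which is the same pairing you get from the two multivalue fields in Splunk.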

Ciao.

Giuseppe