regex help

pc1234
Engager

Can someone help me extract fields from the sample below?

The first line is header info: date, protocol, response_status, response_type.

Each following line (one to many) is a website and an error code.

I can't figure out a regex that captures the header line AND the successive lines of websites and error codes.

 

02-Mar-2021 UDP Response Found Response Type: ABC
www.site1.com 404
www.site10.com 100
www.site4.com 400
.....

 

Thanks in advance.

 


gcusello
SplunkTrust

Hi @pc1234,

let me make sure I understand:

  • Do you have a log file or a CSV file?
  • I also don't understand the structure of the file:
    • Is there a header line containing the info, with each following row a separate event?
    • Or is the event the full file (header + rows)?
  • Do you want a single regex, or is it acceptable to use two regexes?

Ciao.

Giuseppe


pc1234
Engager

I'm reading a log file. This is a single event:

02-Mar-2021 UDP Response Status:Found Response Type:ABC www.site1.com 404 www.site10.com 100 www.site4.com 400

I'd like to create a regex/field extraction that captures all the fields below. website (and its status) would be a multivalue field, since it occurs multiple times (one to many).

Fields and values:
date: 02-Mar-2021
protocol: UDP
Response status: Found
Response Type: ABC
website: www.site1.com
status: 404
website: www.site10.com
status: 100
website: www.site4.com
status: 400


gcusello
SplunkTrust

Hi @pc1234,

you need two regexes.

The first extracts the header:

| rex "^(?<date>[^ ]+)\s+(?<protocol>\w+)\sResponse\s+(?<response_status>\w+).+Response\s+Type:\s+(?<response_type>\w+)"

You can test it at https://regex101.com/r/wP3LyX/1
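Note that the two samples in this thread format the header slightly differently: the first post has "Response Found Response Type: ABC", while the single-event sample has "Response Status:Found Response Type:ABC". The pattern above requires whitespace after "Type:" and takes the word right after "Response" as the status, so it fits the first form only. A hedged variant that should accept both forms is sketched below. The makeresults, streamstats, and eval lines exist only to fabricate two test events, and n is a throwaway field name, not something from your data.

```spl
| makeresults count=2
| streamstats count AS n
| eval _raw=if(n==1, "02-Mar-2021 UDP Response Found Response Type: ABC", "02-Mar-2021 UDP Response Status:Found Response Type:ABC www.site1.com 404")
| rex "^(?<date>\S+)\s+(?<protocol>\w+)\s+Response\s+(?:Status:\s*)?(?<response_status>\w+)\s+Response\s+Type:\s*(?<response_type>\w+)"
| table _raw date protocol response_status response_type
```

The optional non-capturing group (?:Status:\s*)? skips the "Status:" label when it is present, and \s* after "Type:" tolerates a missing space. If your real events always use one form, the simpler pattern is fine.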

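The second extracts the sites. Add max_match=0 so that rex returns every match as a multivalue field; by default it keeps only the first match:

| rex max_match=0 "(?<site>www\.[^ ]+)\s+(?<response_code>\d+)"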

You can test the pattern at https://regex101.com/r/UCwx2h/1
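If you later need one result row per website instead of two parallel multivalue fields, a common approach is to zip the two fields together and expand them. The following is only a sketch run against the sample event from this thread. It assumes max_match=0 on rex so that all sites are captured, and pair is a hypothetical helper field; the "|" delimiter is an arbitrary choice that does not appear in the data.

```spl
| makeresults
| eval _raw="02-Mar-2021 UDP Response Status:Found Response Type:ABC www.site1.com 404 www.site10.com 100 www.site4.com 400"
| rex max_match=0 "(?<site>www\.\S+)\s+(?<response_code>\d+)"
| eval pair=mvzip(site, response_code, "|")
| mvexpand pair
| eval site=mvindex(split(pair, "|"), 0), response_code=mvindex(split(pair, "|"), 1)
| table site response_code
```

mvzip keeps each site next to its own code before mvexpand splits the event, which avoids mismatched pairs that could result from expanding the two fields separately.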

Ciao.

Giuseppe
