Why does Splunk suggest using Regex101.com?

panderla
New Member

The regex I created extracts fields on the Regex101 site, but does nothing in Splunk. What gives?

Regex in use on Regex101:

(?m)(?:Port Sing.+)|((?(\w{2}\d\/\d\/\d+|\w{2}\d\/\d+|\w{2}\d))\s(?\d+)\s(?\d+)\s(?\d+)\s(?\d+)\s(?\d+)\s(?\d+)\s)

Data to extract fields:

08:05:24.378 CST Thu Feb 8 2018 show interface counter errors !
Port Align-Err FCS-Err Xmit-Err Rcv-Err UnderSize OutDiscards
Te1/0/1 0 0 1744790 0 0 1744790
Te1/0/2 0 0 6469254 0 0 6469254
Te1/0/3 0 0 0 0 0 0
Te1/0/4 0 0 0 0 0 0
Te1/0/5 0 0 0 0 0 0
Te1/0/6 0 0 1267548 1 0 1267548
Te1/0/7 0 0 7684 0 0 7684
Te1/0/8 0 0 73834268 0 0 73834268
Te1/0/9 0 0 15942062 0 0 15942062
Te1/0/10 0 0 0 0 0 0
Te1/0/11 0 0 0 0 0 0
Te1/0/12 0 0 0 0 0 0
Te2/0/1 0 0 848712196 0 0 848712196
Te2/0/2 0 0 865058003 0 0 865058003
Te2/0/3 0 0 2889546544 0 0 2889546544
Te2/0/4 0 0 3572229813 0 0 3572229813
Te2/0/5 0 0 3909332507 0 0 3909332507
Te2/0/6 0 0 5020658442 0 0 5020658442
Te2/0/7 0 0 4707980415 0 0 4707980415
Te2/0/8 0 0 430216868 0 0 430216868
Te2/0/9 0 0 105839820 0 0 105839820
Te2/0/10 0 0 382786 0 0 382786
Te2/0/11 0 0 379470 0 0 379470
Te2/0/12 0 0 242880 0 0 242880
Port Single-Col Multi-Col Late-Col Excess-Col Carri-Sen Runts
Te1/0/1 0 0 0 0 0 0
Te1/0/2 0 0 0 0 0 0
Te1/0/3 0 0 0 0 0 0
Te1/0/4 0 0 0 0 0 0
Te1/0/5 0 0 0 0 0 0
Te1/0/6 0 0 0 0 0 0
Te1/0/7 0 0 0 0 0 0
Te1/0/8 0 0 0 0 0 0
Te1/0/9 0 0 0 0 0 0
Te1/0/10 0 0 0 0 0 0
Te1/0/11 0 0 0 0 0 0
Te1/0/12 0 0 0 0 0 0
Te2/0/1 0 0 0 0 0 0
Te2/0/2 0 0 0 0 0 0
Te2/0/3 0 0 0 0 0 0
Te2/0/4 0 0 0 0 0 0
Te2/0/5 0 0 0 0 0 0
Te2/0/6 0 0 0 0 0 0
Te2/0/7 0 0 0 0 0 0
Te2/0/8 0 0 0 0 0 0
Te2/0/9 0 0 0 0 0 0
Te2/0/10 0 0 0 0 0 0
Te2/0/11 0 0 0 0 0 0
Te2/0/12 0 0 0 0 0 0
!

richgalloway
SplunkTrust

The forum stripped out your field names. Please post a new comment in backticks (`) to show the full regex string you're using.

Regex101.com is a great site. Be sure to select the "PCRE" flavor so the site's behavior best matches Splunk's.

Why do you have a non-capturing group as one option in your regex? It won't do anything.


panderla
New Member

My non-capturing group is meant to exclude that bit of data, so the other data is searchable and meaningful.

I have the PCRE box checked as recommended.

Trying the regex paste again:

(?m)(?:Port Sing.+)|((?<interface>(\w{2}\d\/\d\/\d+|\w{2}\d\/\d+|\w{2}\d))\s(?<align_err>\d+)\s(?<fcs_err>\d+)\s(?<xmit_err>\d+)\s(?<rcv_err>\d+)\s(?<undersize>\d+)\s(?<outdiscards>\d+)\s)
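
For anyone who wants to check what the `(?:Port Sing.+)` branch actually skips, here is a minimal Python sketch. It makes several assumptions: Python's `re` module needs `(?P<name>...)` instead of `(?<name>...)`, so the regex is converted on the fly; the sample is trimmed to two ports per table; and `matched_interfaces` is a hypothetical helper, not anything from Splunk. The thread does not say whether the indexed event keeps its line breaks, so both cases are shown.

```python
import re

# Regex as posted in the thread (PCRE syntax).
THREAD_REGEX = r"(?m)(?:Port Sing.+)|((?<interface>(\w{2}\d\/\d\/\d+|\w{2}\d\/\d+|\w{2}\d))\s(?<align_err>\d+)\s(?<fcs_err>\d+)\s(?<xmit_err>\d+)\s(?<rcv_err>\d+)\s(?<undersize>\d+)\s(?<outdiscards>\d+)\s)"

# Assumption: Python's re spells named groups (?P<name>...), so convert them.
PATTERN = re.compile(THREAD_REGEX.replace("(?<", "(?P<"))

# Trimmed sample: two ports from each table of the posted data.
ROWS = [
    "Port Align-Err FCS-Err Xmit-Err Rcv-Err UnderSize OutDiscards",
    "Te1/0/1 0 0 1744790 0 0 1744790",
    "Te1/0/2 0 0 6469254 0 0 6469254",
    "Port Single-Col Multi-Col Late-Col Excess-Col Carri-Sen Runts",
    "Te1/0/1 0 0 0 0 0 0",
    "Te1/0/2 0 0 0 0 0 0",
    "!",
]


def matched_interfaces(text):
    """Return the interface name of every match that took the field-capturing branch."""
    return [m.group("interface") for m in PATTERN.finditer(text) if m.group("interface")]


single_line = " ".join(ROWS)   # event flattened to one line
multi_line = "\n".join(ROWS)   # event with its line breaks preserved

print(matched_interfaces(single_line))  # ['Te1/0/1', 'Te1/0/2']
print(matched_interfaces(multi_line))   # ['Te1/0/1', 'Te1/0/2', 'Te1/0/1', 'Te1/0/2']
```

Because `.` does not cross a newline unless `(?s)` is set, `Port Sing.+` swallows the whole second table only when the event is a single line. With line breaks preserved it consumes just the header line, and the collision-counter rows are then captured under the error-counter field names.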


elliotproebstel
Champion

Hmm...when you say this does nothing inside Splunk, can you elaborate? I tried making a result with your sample data and applying the regular expression via rex, and it extracted all fields correctly, I believe. Just to verify, this is what you're using in Splunk:

| rex field=_raw "(?m)(?:Port Sing.+)|((?<interface>(\w{2}\d\/\d\/\d+|\w{2}\d\/\d+|\w{2}\d))\s(?<align_err>\d+)\s(?<fcs_err>\d+)\s(?<xmit_err>\d+)\s(?<rcv_err>\d+)\s(?<undersize>\d+)\s(?<outdiscards>\d+)\s)"

Is that correct? Because on my system, that works perfectly with the sample event.


493669
Super Champion

Also include max_match=0 so that rex returns every match instead of only the first one. That gives the same output as Regex101:

|rex max_match=0 field=_raw "(?m)(?:Port Sing.+)|((?<interface>(\w{2}\d\/\d\/\d+|\w{2}\d\/\d+|\w{2}\d))\s(?<align_err>\d+)\s(?<fcs_err>\d+)\s(?<xmit_err>\d+)\s(?<rcv_err>\d+)\s(?<undersize>\d+)\s(?<outdiscards>\d+)\s)"
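
To see the difference between a single match and all matches outside Splunk, here is a rough Python sketch. It is only an analogy, not how rex is implemented: `extract` is a hypothetical helper whose `max_match` argument loosely mirrors the rex option (0 meaning no limit), the named groups are converted to Python's `(?P<name>...)` spelling, and the sample is trimmed to three ports.

```python
import re

# Regex as posted in the thread (PCRE syntax).
THREAD_REGEX = r"(?m)(?:Port Sing.+)|((?<interface>(\w{2}\d\/\d\/\d+|\w{2}\d\/\d+|\w{2}\d))\s(?<align_err>\d+)\s(?<fcs_err>\d+)\s(?<xmit_err>\d+)\s(?<rcv_err>\d+)\s(?<undersize>\d+)\s(?<outdiscards>\d+)\s)"

# Assumption: Python's re spells named groups (?P<name>...), so convert them.
PATTERN = re.compile(THREAD_REGEX.replace("(?<", "(?P<"))

# Trimmed sample: the first three ports of the posted data.
SAMPLE = (
    "Port Align-Err FCS-Err Xmit-Err Rcv-Err UnderSize OutDiscards "
    "Te1/0/1 0 0 1744790 0 0 1744790 "
    "Te1/0/2 0 0 6469254 0 0 6469254 "
    "Te1/0/3 0 0 0 0 0 0 !"
)


def extract(text, max_match=1):
    """Collect named groups into lists; max_match=0 means keep every match."""
    fields = {}
    for index, match in enumerate(PATTERN.finditer(text)):
        if max_match and index >= max_match:
            break  # stop once the requested number of matches is reached
        for name, value in match.groupdict().items():
            if value is not None:  # the header-only branch captures nothing
                fields.setdefault(name, []).append(value)
    return fields


print(extract(SAMPLE)["interface"])               # ['Te1/0/1']
print(extract(SAMPLE, max_match=0)["interface"])  # ['Te1/0/1', 'Te1/0/2', 'Te1/0/3']
print(extract(SAMPLE, max_match=0)["xmit_err"])   # ['1744790', '6469254', '0']
```

With the limit at one, each field holds a single value from the first row; with no limit, each field becomes a list with one entry per port, which is what Regex101 shows when it lists every match.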

panderla
New Member

Where do I define max_match=0 so that the fields are extracted automatically? In props.conf?


493669
Super Champion

I'm not sure whether max_match can be set in props.conf.
If it can't, you need to move the extraction to transforms.conf and specify MV_ADD=true.


panderla
New Member

I have a sourcetype defined as follows:

[interface_counter_errors_1]
SHOULD_LINEMERGE = true
LINE_BREAKER = ((Port Single-Col Multi-Col Late-Col Excess-Col Carri-Sen Runts.+)|(\w{2}\d\/\d\/\d+\s\d+\s\d+\s\d+\s\d+\s\d+\s\d+\s)|(\w{2}\d\/\d+\s\d+\s\d+\s\d+\s\d+\s\d+\s\d+\s)|(\w{2}\d\s\d+\s\d+\s\d+\s\d+\s\d+\s\d+\s))
NO_BINARY_CHECK = true
CHARSET = UTF-8
MAX_TIMESTAMP_LOOKAHEAD = 32
disabled = false
TIME_FORMAT = %H:%M:%S.%3N %Z %a %b %e %Y
TIME_PREFIX = ^
TRUNCATE = 10000
DATETIME_CONFIG =
category = Custom
EXTRACT-error_data = (?m)(?:Port Sing.+)|((?<interface>(\w{2}\d\/\d\/\d+|\w{2}\d\/\d+|\w{2}\d))\s(?<align_err>\d+)\s(?<fcs_err>\d+)\s(?<xmit_err>\d+)\s(?<rcv_err>\d+)\s(?<undersize>\d+)\s(?<outdiscards>\d+)\s)
LOOKUP-interface_counter_errors = SnowMirror_cmdb_ci_netgear name AS host OUTPUTNEW dv_u_mosaic_machine_line AS machine_line
pulldown_type = 1


panderla
New Member

When the data is indexed with EXTRACT-error_data in place, I expect the extracted fields to show up in the Interesting Fields area of search, but no such luck.


panderla
New Member

(?m)(?:Port Sing.+)|((?<interface>(\w{2}\d\/\d\/\d+|\w{2}\d\/\d+|\w{2}\d))\s(?<align_err>\d+)\s(?<fcs_err>\d+)\s(?<xmit_err>\d+)\s(?<rcv_err>\d+)\s(?<undersize>\d+)\s(?<outdiscards>\d+)\s)

This is the final regex I want to get working, so that names are attached to the data points.
