How to edit my REGEX in transforms.conf to allow certain data to get indexed in Splunk?

a212830
Champion

Hi,

I have a regex that allows certain data into Splunk via a transform, and now I need to update it. I made some changes, but the data still isn't coming in, so I'm assuming that my regex is wrong.

Here's my transforms.conf stanza:

[save_fil_wc_ips_ive_tr0_asr]
REGEX = (?i)^[^|]+\|[^|]+\|[^|]+\|[^|]+\|[^|]+\|[^|]+\|(fil|fidc|wc|tr0|asr|[0-9][0-9[0-9]rtr-1.fmr.com|rtr-2.fmr.com)
DEST_KEY = queue
FORMAT = indexQueue

Here's some sample data:

1478196000000|3176866|NormalizedPortInfo|UnknownProtocolPkts|0|Interface|150rtr-1.fmr.com|Gi0/0/0
1478196000000|3176866|NormalizedPortInfo|Bits|1333972272|Interface|150rtr-1.fmr.com|Gi0/0/0
1478196000000|3176866|NormalizedPortInfo|UnicastOut|280872|Interface|150rtr-1.fmr.com|Gi0/0/0
1478196000000|3176866|NormalizedPortInfo|ErrorsIn|0|Interface|150rtr-1.fmr.com|Gi0/0/0
1478196000000|3176866|NormalizedPortInfo|AdminStatusPollable|1|Interface|150rtr-1.fmr.com|Gi0/0/0
1478196000000|3176866|NormalizedPortInfo|FrameSize|292.6625456115502|Interface|150rtr-1.fmr.com|Gi0/0/0
1478196000000|3176866|NormalizedPortInfo|SpeedIn|30000000|Interface|150rtr-1.fmr.com|Gi0/0/0
1478196000000|3176866|NormalizedPortInfo|BitsOut|327007456|Interface|150rtr-1.fmr.com|Gi0/0/0

gokadroid
Motivator

You can use the regex below at search time, or in a field extraction, to pull out all the fields; once that is done, you can extract any sub-information you need:

yourBaseQuery to search rtr-1.fmr.com
| rex "^(?<id1>[^\|]+)\|(?<id2>[^\|]+)\|(?<portInfo>[^\|]+)\|(?<protocolInfo>[^\|]+)\|(?<bytes>[^\|]+)\|(?<interface>[^\|]+)\|(?<dotCom>[^\|]+)\|(?<gi0>[^\s]+)"
| complete your query

Rename id1, id2, and the other fields to whatever suits you.

Update, per the comments:
Escape any special characters, such as the dots in the hostname (\.), which may be missing, and close off the regex with \|.* at the end if required.
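
To try the same extraction outside Splunk, here is a minimal sketch in Python. It assumes Python's re module is an acceptable stand-in for rex; Python spells named groups (?P<name>...) rather than (?<name>...), and the field names are the placeholders from the answer, not anything Splunk defines.

```python
import re

# Python analog of the rex pattern above. Field names are placeholders.
FIELDS = re.compile(
    r"^(?P<id1>[^|]+)\|(?P<id2>[^|]+)\|(?P<portInfo>[^|]+)\|(?P<protocolInfo>[^|]+)"
    r"\|(?P<bytes>[^|]+)\|(?P<interface>[^|]+)\|(?P<dotCom>[^|]+)\|(?P<gi0>\S+)"
)

# One of the sample lines from the question.
line = "1478196000000|3176866|NormalizedPortInfo|Bits|1333972272|Interface|150rtr-1.fmr.com|Gi0/0/0"

fields = FIELDS.match(line).groupdict()
print(fields["dotCom"], fields["gi0"])
```

Each pipe-delimited column ends up under its own key, so the hostname column can be inspected on its own before writing an index-time filter.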

horsefez
SplunkTrust

Hi there,

For building new regular expressions I personally love to use the following site:
https://regex101.com/
It even lets you debug the regular expression step by step.

Try the following:

(?i)(?:[^|]+\|[^|]+\|[^|]+\|[^|]+\|[^|]+\|[^|]+\|)(\d{3,}\w{3,}\-\d\.\w{3,}\.\w+)(?:\|.*)

a212830
Champion

This is for filtering data on its way into Splunk (at index time), so I can't do it at search time.

gokadroid
Motivator

Use a similar regex at index time. Your regex accounts for only 6 elements, whereas your data has 8. Can you try adding .* at the end to allow for the remaining fields and see if that works, or match all 8 fields so the regex lines up with the incoming data?

a212830
Champion

The issue isn't the fields, it's the last set of alternatives. If I take out the .fmr.com options (the last two), it works fine for the other elements. I think the problem is with the ".fmr.com" patterns.

gokadroid
Motivator

Use \. to escape your dots. Maybe that's what's throwing it off, since the dot is a special character in regex. Also try closing off the regex with a pipe after your parenthesized group: (fil|fidc|wc|tr0|asr|[0-9][0-9[0-9]rtr-1.fmr.com|rtr-2.fmr.com)\|
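
It may be worth checking which part of the pattern actually blocks the match. The sketch below uses Python's re module as a stand-in for Splunk's PCRE engine (an assumption, though the two should agree on the constructs used here). It compares the regex as posted with a variant in which only the bracket typo is repaired and the dots are left unescaped.

```python
import re

# One of the sample lines from the question.
line = "1478196000000|3176866|NormalizedPortInfo|Bits|1333972272|Interface|150rtr-1.fmr.com|Gi0/0/0"
prefix = r"(?i)^[^|]+\|[^|]+\|[^|]+\|[^|]+\|[^|]+\|[^|]+\|"

# Regex as originally posted: "[0-9][0-9[0-9]" parses as two classes,
# a digit followed by "digit or literal [", so it consumes only two characters.
original = prefix + r"(fil|fidc|wc|tr0|asr|[0-9][0-9[0-9]rtr-1.fmr.com|rtr-2.fmr.com)"

# Same regex with only the bracket typo repaired; the dots are still unescaped.
brackets_fixed = prefix + r"(fil|fidc|wc|tr0|asr|[0-9][0-9][0-9]rtr-1.fmr.com|rtr-2.fmr.com)"

print(bool(re.search(original, line)))        # False
print(bool(re.search(brackets_fixed, line)))  # True
```

If Splunk behaves the same way, the unescaped dots make the pattern looser rather than stricter (a bare dot also matches a literal dot), so escaping them is good hygiene but the bracket typo looks like the more likely culprit. Note also that the rtr-2.fmr.com alternative has no digit prefix in either version.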

sloshburch
Splunk Employee

Spot on, @gokadroid! I want to highlight for any padawans that the original used [^|]+\|. Notice how the first part is meant to say "anything that is not a pipe", but that pipe is not escaped, so it could mean "anything that is not nothing OR nothing" (I guess? I'm not really sure what happens there). The point is that the pipe is not escaped inside the brackets but is escaped after them. So I agree about escaping it both times. Huzzah.
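
On the open question of what an unescaped pipe does inside brackets: a quick check suggests it is simply a literal character there, so [^|] and [^\|] behave the same. A minimal sketch, again assuming Python's re module mirrors Splunk's PCRE on this point:

```python
import re

line = "1478196000000|3176866|NormalizedPortInfo|Bits"

# Inside a character class the pipe is an ordinary character, escaped or not.
unescaped = re.findall(r"[^|]+", line)
escaped = re.findall(r"[^\|]+", line)

print(unescaped)             # ['1478196000000', '3176866', 'NormalizedPortInfo', 'Bits']
print(unescaped == escaped)  # True
```

Escaping the pipe inside the brackets is harmless and arguably more readable, but it does not appear to change what the pattern matches. Outside the brackets the escape is required, because a bare pipe there means alternation.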

a212830
Champion

Got it:

(?i)^[^|]+\|[^|]+\|[^|]+\|[^|]+\|[^|]+\|[^|]+\|(fil|fidc|wc|tr0|asr|[0-9]*rtr-1\.fmr\.com|[0-9]*rtr-2\.fmr\.com)
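
A regex like this can be sanity-checked before it goes into transforms.conf. The sketch below runs the final pattern through Python's re module (assumed here to be close enough to Splunk's PCRE) against one sample line from the question and two made-up lines. The helper name kept and both made-up hostnames are hypothetical.

```python
import re

FINAL = re.compile(
    r"(?i)^[^|]+\|[^|]+\|[^|]+\|[^|]+\|[^|]+\|[^|]+\|"
    r"(fil|fidc|wc|tr0|asr|[0-9]*rtr-1\.fmr\.com|[0-9]*rtr-2\.fmr\.com)"
)

def kept(line: str) -> bool:
    """Return True if the line would match the filter regex."""
    return FINAL.search(line) is not None

# Sample line from the question.
sample = "1478196000000|3176866|NormalizedPortInfo|BitsOut|327007456|Interface|150rtr-1.fmr.com|Gi0/0/0"
# Hypothetical lines, invented here to probe the pattern; they are not from the thread.
second_router = sample.replace("150rtr-1", "27rtr-2")
other_host = sample.replace("150rtr-1.fmr.com", "core-sw1.example.com")

print(kept(sample), kept(second_router), kept(other_host))  # True True False
```

Because [0-9]* also accepts zero digits, a hostname that starts directly with rtr-1.fmr.com would be kept as well. Use [0-9]+ or [0-9]{3} instead if that is not wanted.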

sloshburch
Splunk Employee

I forget the exact syntax, but you should be able to repeat [^|]+\| several times with some regex syntax rather than listing each one explicitly (which is more typo-prone). If anyone recalls what that is, or what terms to search for to learn it, please do shout.
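
The terms to search for are probably "quantifier" and "non-capturing group": wrapping the repeated piece in (?:...) and following it with {6} repeats it exactly six times. A small sketch in Python (assumed to match PCRE for this syntax) comparing the two spellings:

```python
import re

# Six explicit copies of "field then pipe", as in the thread.
explicit = r"^[^|]+\|[^|]+\|[^|]+\|[^|]+\|[^|]+\|[^|]+\|"
# {6} repeats the non-capturing group (?:...) exactly six times.
counted = r"^(?:[^|]+\|){6}"

# One of the sample lines from the question.
line = "1478196000000|3176866|NormalizedPortInfo|Bits|1333972272|Interface|150rtr-1.fmr.com|Gi0/0/0"

m_explicit = re.match(explicit, line)
m_counted = re.match(counted, line)

print(m_explicit.group(0) == m_counted.group(0))  # True
print(line[m_counted.end():])                     # 150rtr-1.fmr.com|Gi0/0/0
```

Under that assumption the transform could be shortened to (?i)^(?:[^|]+\|){6}(fil|fidc|wc|tr0|asr|[0-9]*rtr-1\.fmr\.com|[0-9]*rtr-2\.fmr\.com), which leaves only one copy of the field pattern to mistype.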

gokadroid
Motivator

Thanks. If it helped, please accept or upvote the answer. I'm adding it to the original answer.
