Hi,
I have a regex that allows certain data into Splunk via a transform, and now I need to update it. I made some changes, but the data still isn't coming in, so I assume my regex is wrong.
Here's my transform:
[save_fil_wc_ips_ive_tr0_asr]
REGEX = (?i)^[^|]+\|[^|]+\|[^|]+\|[^|]+\|[^|]+\|[^|]+\|(fil|fidc|wc|tr0|asr|[0-9][0-9[0-9]rtr-1.fmr.com|rtr-2.fmr.com)
DEST_KEY = queue
FORMAT = indexQueue
Here's some sample data:
1478196000000|3176866|NormalizedPortInfo|UnknownProtocolPkts|0|Interface|150rtr-1.fmr.com|Gi0/0/0
1478196000000|3176866|NormalizedPortInfo|Bits|1333972272|Interface|150rtr-1.fmr.com|Gi0/0/0
1478196000000|3176866|NormalizedPortInfo|UnicastOut|280872|Interface|150rtr-1.fmr.com|Gi0/0/0
1478196000000|3176866|NormalizedPortInfo|ErrorsIn|0|Interface|150rtr-1.fmr.com|Gi0/0/0
1478196000000|3176866|NormalizedPortInfo|AdminStatusPollable|1|Interface|150rtr-1.fmr.com|Gi0/0/0
1478196000000|3176866|NormalizedPortInfo|FrameSize|292.6625456115502|Interface|150rtr-1.fmr.com|Gi0/0/0
1478196000000|3176866|NormalizedPortInfo|SpeedIn|30000000|Interface|150rtr-1.fmr.com|Gi0/0/0
1478196000000|3176866|NormalizedPortInfo|BitsOut|327007456|Interface|150rtr-1.fmr.com|Gi0/0/0
Hi there,
For building new regular expressions, I personally love the following site:
https://regex101.com/
It can even debug the regular expression step by step.
Try the following:
(?i)(?:[^|]+\|[^|]+\|[^|]+\|[^|]+\|[^|]+\|[^|]+\|)(\d{3,}\w{3,}\-\d\.\w{3,}\.\w+)(?:\|.*)
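As a quick sanity check, here is a minimal Python sketch that runs the expression above against one of the sample lines from the question. Using Python's re module is an assumption made for illustration (Splunk uses PCRE), but this pattern only relies on syntax the two engines share.

```python
import re

# Pattern copied from the suggestion above, split over two lines for readability.
PATTERN = re.compile(
    r"(?i)(?:[^|]+\|[^|]+\|[^|]+\|[^|]+\|[^|]+\|[^|]+\|)"
    r"(\d{3,}\w{3,}\-\d\.\w{3,}\.\w+)(?:\|.*)"
)

# One of the sample events from the question.
SAMPLE = (
    "1478196000000|3176866|NormalizedPortInfo|Bits|1333972272|"
    "Interface|150rtr-1.fmr.com|Gi0/0/0"
)

match = PATTERN.search(SAMPLE)
# The only capturing group is the hostname in the seventh field.
hostname = match.group(1) if match else None
print(hostname)
```

Note that this pattern only accepts hostnames with three or more leading digits, so it would not cover the fil, fidc, wc, tr0 and asr prefixes from the original transform.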
You can use the regex below at search time, or as a field extraction, to get all the fields. Once that is done, you can extract any sub-information you need:
yourBaseQuery to search rtr-1.fmr.com
| rex "^(?<id1>[^\|]+)\|(?<id2>[^\|]+)\|(?<portInfo>[^\|]+)\|(?<protocolInfo>[^\|]+)\|(?<bytes>[^\|]+)\|(?<interface>[^\|]+)\|(?<dotCom>[^\|]+)\|(?<gi0>[^\s]+)"
| complete your query
Rename id1, id2 and all the other fields as you see fit.
Updating as per comments:
Escape any special characters, such as the dots in the hostname, with \. (these may be missing), and close off the regex with \|.* at the end if required.
This is for filtering which data gets into Splunk, so I can't do it at search time.
Use a similar regex. Your regex only covers 6 elements, whereas your data has 8. Can you try putting a .* at the end to allow for the remaining fields and see if that works, or else extract all 8 fields in the regex so that it matches the incoming data?
The issue isn't the fields, it's the last set of regexes. If I take out the .fmr.com options, (last 2), it works fine for those elements. I think the problem is with the ".fmr.com" regexes.
Use \. to escape your dots. Maybe that's what is throwing it off, as the dot is a special character in regex. Also try closing off the regex with a pipe after your parenthesized group: (fil|fidc|wc|tr0|asr|[0-9][0-9[0-9]rtr-1.fmr.com|rtr-2.fmr.com)\|
Spot on @gokadroid! I want to highlight for any padawans that the original used [^|]+\|. Notice how the first part says "anything that is not a pipe", but that pipe is not escaped, so it could mean "anything that is not nothing OR nothing" (I guess? I'm not really sure what happens there). The point is, the pipe is not escaped inside the brackets, but it is escaped after them. So I agree about escaping it both times. Huzzah.
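On the open question of what an unescaped pipe inside brackets does: in the common regex flavours, | is a literal character inside a character class, so [^|] and [^\|] should behave identically, and escaping it there is harmless but not required. A small sketch to check this, assuming Python's re module as a stand-in for Splunk's PCRE engine:

```python
import re

line = "1478196000000|3176866|NormalizedPortInfo"

# Inside [...] the pipe is just a character, escaped or not.
unescaped = re.findall(r"[^|]+", line)
escaped = re.findall(r"[^\|]+", line)
print(unescaped)
print(unescaped == escaped)

# Outside a character class the difference does matter:
# a bare | means alternation, while \| matches a literal pipe.
alternation = re.fullmatch(r"a|b", "a") is not None
literal_pipe = re.fullmatch(r"a\|b", "a|b") is not None
print(alternation, literal_pipe)
```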
got it - (?i)^[^|]+\|[^|]+\|[^|]+\|[^|]+\|[^|]+\|[^|]+\|(fil|fidc|wc|tr0|asr|[0-9]*rtr-1\.fmr\.com|[0-9]*rtr-2\.fmr\.com)
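For anyone who wants to try this expression outside Splunk first, here is a rough Python sketch that applies it to a few events and keeps only the matching ones. It only mimics the keep/drop decision and is not how Splunk processes transforms internally; the third event is a made-up non-matching line, not data from this thread.

```python
import re

# The working expression from the post above.
KEEP = re.compile(
    r"(?i)^[^|]+\|[^|]+\|[^|]+\|[^|]+\|[^|]+\|[^|]+\|"
    r"(fil|fidc|wc|tr0|asr|[0-9]*rtr-1\.fmr\.com|[0-9]*rtr-2\.fmr\.com)"
)

events = [
    # Two sample events from the question.
    "1478196000000|3176866|NormalizedPortInfo|Bits|1333972272|Interface|150rtr-1.fmr.com|Gi0/0/0",
    "1478196000000|3176866|NormalizedPortInfo|ErrorsIn|0|Interface|150rtr-1.fmr.com|Gi0/0/0",
    # Hypothetical event with a hostname that should not be kept.
    "1478196000000|3176866|NormalizedPortInfo|Bits|42|Interface|other-host.example.com|Gi0/0/0",
]

# Keep only the events whose seventh field starts with an allowed value.
kept = [event for event in events if KEEP.search(event)]
for event in kept:
    print(event)
```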
I forget the exact syntax, but you should be able to repeat [^|]+\| several times with some regex syntax, rather than having to list each one explicitly (which is more typo-prone). If anyone recalls what that is, or what terms to search for to learn it, please do shout.
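The terms to search for are probably "quantifier" and "non-capturing group": wrapping the repeated piece in (?:...) and adding {6} repeats it six times. The sketch below compares the explicit and the quantified forms on a sample line. It assumes Python's re module for the check; the same syntax is available in PCRE, though I have not tested it in a Splunk transform.

```python
import re

# Alternation taken from the working regex in this thread.
TAIL = r"(fil|fidc|wc|tr0|asr|[0-9]*rtr-1\.fmr\.com|[0-9]*rtr-2\.fmr\.com)"

# Explicit form: the field-plus-pipe piece written out six times
# (built here by string repetition to avoid typos).
explicit = re.compile(r"(?i)^" + r"[^|]+\|" * 6 + TAIL)

# Quantified form: a non-capturing group repeated exactly six times.
quantified = re.compile(r"(?i)^(?:[^|]+\|){6}" + TAIL)

sample = (
    "1478196000000|3176866|NormalizedPortInfo|UnicastOut|280872|"
    "Interface|150rtr-1.fmr.com|Gi0/0/0"
)

m_explicit = explicit.search(sample)
m_quantified = quantified.search(sample)
print(m_explicit.group(1), m_quantified.group(1))
```

Because the repeated group is non-capturing, the hostname alternation stays in capture group 1 in both forms.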
Thanks. If it helped, please accept or upvote the answer. I'm updating the original answer with it.