Splunk Search

Need a regex improvement

tinylund
Explorer

| rex field=_raw "(?<dscvIP>[^\.]\d+\.\d+\.\d+\.\d+[\s|\:])"

I'm using the above rex command to capture IP addresses, and it works most of the time, but I still get a few false positives for ESX log entries that contain the following:

Rcv-tx-10.20.30.45.78.80

The rex command captures 30.45.78.80 as the dscvIP field. I thought that by adding [^\.] to the beginning of the regex, it would not capture any string matching the IP syntax that had a period (.) immediately preceding it. I also thought this string would be skipped altogether: for the match not to start with a period (.), it would have to begin at the dash (-) following tx, but then it would fail because there is no space or colon after the fourth \d+ match.

What am 
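For anyone who wants to reproduce the behaviour outside Splunk, here is a minimal sketch using Python's re module. It assumes the sample string from the log entry, and it swaps in Python's (?P<name>...) group syntax for (?<name>...); the variable names are illustrative only. It suggests that [^\.] is satisfied by any non-dot character, digits included, so the match can begin on the "3" of "30":

```python
import re

# The pattern from the post, rewritten with Python's (?P<name>...) syntax.
ORIGINAL = re.compile(r"(?P<dscvIP>[^\.]\d+\.\d+\.\d+\.\d+[\s|\:])")

# Assumed sample: the fragment of the ESX log line that misbehaves.
sample = "RPC-tx-10.20.30.45.78.80 0 changed"

m = ORIGINAL.search(sample)
captured = m.group("dscvIP")

# The leading [^\.] consumes the digit "3", \d+ then takes "0",
# and the trailing class consumes the space after "80".
print(repr(captured))        # the capture includes the boundary characters
print(repr(captured[0]))     # the character matched by [^\.]
```

Note that the capture group wraps the boundary characters as well, so the extracted field carries the trailing space or colon along with the digits.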

0 Karma

PickleRick
SplunkTrust

Well. The regex typically used to capture IP addresses is \d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}

Unfortunately, it doesn't check the validity of the captured address and accepts octet values over 255. It can be improved, but it quickly gets ugly:

((1?\d{1,2})|2([0-4]\d|5[0-5]))\.((1?\d{1,2})|2([0-4]\d|5[0-5]))\.((1?\d{1,2})|2([0-4]\d|5[0-5]))\.((1?\d{1,2})|2([0-4]\d|5[0-5]))

Something like that. I'm writing it on the fly, so I can't guarantee correctness 😉 Regex is not very well suited to matching IPs.
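One common alternative to a validating regex is to capture loosely and check the octet range afterwards. The sketch below illustrates that idea in Python rather than SPL; the function name and sample text are made up for the example, and it is not meant as the definitive way to do this in Splunk:

```python
import re

# Loose, non-validating pattern: four dot-separated groups of 1-3 digits.
LOOSE_IP = re.compile(r"\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}")


def valid_ipv4_candidates(text):
    """Return loosely matched candidates whose octets are all in 0-255."""
    found = []
    for candidate in LOOSE_IP.findall(text):
        octets = candidate.split(".")
        # Numeric range check replaces the ugly alternation in the regex.
        if all(int(octet) <= 255 for octet in octets):
            found.append(candidate)
    return found


print(valid_ipv4_candidates("src=192.168.1.10 dst=300.1.2.3"))
```

The regex stays readable, and the range check lives in ordinary code where it is easier to get right.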

Anyway, if you want to capture just the four-octet sequence with a guarantee that it's not preceded or followed by another dot-delimited sequence, you might want to match (^|[^.\d]) at the beginning and ($|[^.\d]) at the end. You don't need to escape the dot inside a character class, and the \d keeps a digit of the address itself from being consumed as the boundary character.

So effectively (in the simple, non-validating form) you end up with

(^|[^.\d])(?<IP>\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})($|[^.\d])
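Boundary classes like these are easy to sanity-check outside Splunk. The sketch below uses Python's re module (with (?P<name>...) instead of (?<name>...)) to compare a boundary that excludes only dots with one that also excludes digits. The helper name and sample lines are assumptions for illustration, loosely based on the log entry in this thread:

```python
import re

OCTETS = r"\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}"

# Boundary that rejects only a neighbouring dot.
DOT_ONLY = re.compile(r"(^|[^.])(?P<IP>" + OCTETS + r")($|[^.])")
# Boundary that rejects a neighbouring dot or digit.
DOT_AND_DIGIT = re.compile(r"(^|[^.\d])(?P<IP>" + OCTETS + r")($|[^.\d])")


def first_ip(pattern, text):
    """Return the IP group of the first match, or None if nothing matches."""
    m = pattern.search(text)
    return m.group("IP") if m else None


six_part = "latency of 418901 RPC-tx-10.20.30.45.78.80 0 changed by 66435"
real_ip = "connect from 10.20.30.45:443 ok"

# With a dot-only boundary, backtracking shortens "45" to "4" and lets
# the "5" act as the trailing non-dot character.
print(first_ip(DOT_ONLY, six_part))
print(first_ip(DOT_AND_DIGIT, six_part))
print(first_ip(DOT_AND_DIGIT, real_ip))
```

In this check the dot-only form still extracts a truncated address from the six-part string, while the dot-and-digit form skips it and keeps matching genuine addresses followed by a colon or space.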
0 Karma

tinylund
Explorer

date FQN date:time FQN vmkernel: cpu16:66435)CpuSched: 694: user latency of 418901 RPC-tx-10.20.30.45.78.80 0 changed by 66435 NFSv3-RemountHandler -6

0 Karma

ashvinpandey
Contributor

@tinylund Please try using the below rex:

| rex field=_raw "RPC-tx-(?P<dscvIP>.*?)\s"

Also, if this reply helps you, an upvote would be appreciated.

0 Karma

tinylund
Explorer

This is just the raw log that fails. The above rex finds the correct combination in other logs, so I don't want to single out the RPC-tx- prefix (this solution doesn't resolve the issue).

0 Karma

ashvinpandey
Contributor

@tinylund Can you share the raw event?

0 Karma