Splunk Search

IP version agnostic regular expression

mikaelbje
Motivator

Just wondering if anybody's succeeded in creating an IP version agnostic regular expression?

I'd like one regex to match both IPv4 and IPv6 addresses, matching against any of these tests:

  • TEST: 1:2:3:4:5:6:7:8
  • TEST: 1:: 1:2:3:4:5:6:7::
  • TEST: 1::8 1:2:3:4:5:6::8 1:2:3:4:5:6::8
  • TEST: 1::7:8 1:2:3:4:5::7:8 1:2:3:4:5::8
  • TEST: 1::6:7:8 1:2:3:4::6:7:8 1:2:3:4::8
  • TEST: 1::5:6:7:8 1:2:3::5:6:7:8 1:2:3::8
  • TEST: 1::4:5:6:7:8 1:2::4:5:6:7:8 1:2::8
  • TEST: 1::3:4:5:6:7:8 1::3:4:5:6:7:8 1::8
  • TEST: ::2:3:4:5:6:7:8 ::2:3:4:5:6:7:8 ::8 ::
  • TEST: fe08::7:8%eth0 fe08::7:8%1 (link-local IPv6 addresses with zone index)
  • TEST: ::255.255.255.255 ::ffff:255.255.255.255 ::ffff:0:255.255.255.255 (IPv4-mapped IPv6 addresses and IPv4-translated addresses)
  • TEST: 2001:db8:3:4::192.0.2.33 64:ff9b::192.0.2.33 (IPv4-Embedded IPv6 Address)
  • TEST: 192.168.1.1

The script at https://gist.github.com/syzdek/6086792 does this, but it involves some extra magic to work, not just plain regex.

The closest I've come is the following:

[ipv46]
# matches a valid IPv4 or IPv6 address (change to [[octet]] and [[ipv6]]. 
# Has a problem with 1::3 (http://stackoverflow.com/questions/53497/regular-expression-that-matches-valid-ipv6-addresses)
Stolen from: https://gist.github.com/syzdek/6086792
# Extracts: ip
REGEX = (?<ip>(?:2(?:5[0-5]|[0-4][0-9])|[0-1][0-9][0-9]|[0-9][0-9]?)(?:\.(?:2(?:5[0-5]|[0-4][0-9])|[0-1][0-9][0-9]|[0-9][0-9]?)){3}|([0-9a-fA-F]{1,4}:){7,7}[0-9a-fA-F]{1,4}|([0-9a-fA-F]{1,4}:){1,7}:|([0-9a-fA-F]{1,4}:){1,6}:[0-9a-fA-F]{1,4}|([0-9a-fA-F]{1,4}:){1,5}(:[0-9a-fA-F]{1,4}){1,2}|([0-9a-fA-F]{1,4}:){1,4}(:[0-9a-fA-F]{1,4}){1,3}|([0-9a-fA-F]{1,4}:){1,3}(:[0-9a-fA-F]{1,4}){1,4}|([0-9a-fA-F]{1,4}:){1,2}(:[0-9a-fA-F]{1,4}){1,5}|[0-9a-fA-F]{1,4}:((:[0-9a-fA-F]{1,4}){1,6})|:((:[0-9a-fA-F]{1,4}){1,7}|:)|fe08:(:[0-9a-fA-F]{1,4}){2,2}%[0-9a-zA-Z]{1,}|::(ffff(:0{1,4}){0,1}:){0,1}((25[0-5]|(2[0-4]|1{0,1}[0-9]){0,1}[0-9])\.){3,3}(25[0-5]|(2[0-4]|1{0,1}[0-9]){0,1}[0-9])|([0-9a-fA-F]{1,4}:){1,4}:((25[0-5]|(2[0-4]|1{0,1}[0-9]){0,1}[0-9])\.){3,3}(25[0-5]|(2[0-4]|1{0,1}[0-9]){0,1}[0-9]))

However it breaks at tests like 2001:db8:3:4::192.0.2.33 and 1::8

Splunk has a built-in transform called octet, but no such transform for ipv6 addresses.

Anyone?

Tags (3)
1 Solution

alacercogitatus
SplunkTrust
SplunkTrust

So this will match a lot of your examples. BUT it will also match single characters from [a-f].... that needs fixed.

NOTE: These regexes will NOT VALIDATE the IP, merely match the structure.

((::)?[\da-f]{1,4}[:\.]{0,2}){1,8}

It may be easier to match IPv4, and then IPv6 and combine it with an |.

This matches every single item in your list, without single characters and places it into a single capture group for use.

((?:(?:\d{1,3}\.){3}(?:\d{1,3}))|(?:(?:::)?(?:[\dA-Fa-f]{1,4}:{1,2}){1,7}(?:[\d\%A-Fa-z\.]+)?(?:::)?)|(?:::[\dA-Fa-f\.]{1,15})|(?:::))

So you could do:

| rex field=_raw "(?<src_ip>(?:(?:\d{1,3}\.){3}(?:\d{1,3}))|(?:(?:::)?(?:[\dA-Fa-f]{1,4}:{1,2}){1,7}(?:[\d\%A-Fa-z\.]+)?(?:::)?)|(?:::[\dA-Fa-f\.]{1,15})|(?:::))"

View solution in original post

alacercogitatus
SplunkTrust
SplunkTrust

So this will match a lot of your examples. BUT it will also match single characters from [a-f].... that needs fixed.

NOTE: These regexes will NOT VALIDATE the IP, merely match the structure.

((::)?[\da-f]{1,4}[:\.]{0,2}){1,8}

It may be easier to match IPv4, and then IPv6 and combine it with an |.

This matches every single item in your list, without single characters and places it into a single capture group for use.

((?:(?:\d{1,3}\.){3}(?:\d{1,3}))|(?:(?:::)?(?:[\dA-Fa-f]{1,4}:{1,2}){1,7}(?:[\d\%A-Fa-z\.]+)?(?:::)?)|(?:::[\dA-Fa-f\.]{1,15})|(?:::))

So you could do:

| rex field=_raw "(?<src_ip>(?:(?:\d{1,3}\.){3}(?:\d{1,3}))|(?:(?:::)?(?:[\dA-Fa-f]{1,4}:{1,2}){1,7}(?:[\d\%A-Fa-z\.]+)?(?:::)?)|(?:::[\dA-Fa-f\.]{1,15})|(?:::))"

mikaelbje
Motivator

This is pure gold! Thanks a lot. I will add this to my Cisco Networks app to make it IP version agnostic. I'll attribute you!

0 Karma
Get Updates on the Splunk Community!

Webinar Recap | Revolutionizing IT Operations: The Transformative Power of AI and ML ...

The Transformative Power of AI and ML in Enhancing Observability   In the realm of IT operations, the ...

.conf24 | Registration Open!

Hello, hello! I come bearing good news: Registration for .conf24 is now open!   conf is Splunk’s rad annual ...

ICYMI - Check out the latest releases of Splunk Edge Processor

Splunk is pleased to announce the latest enhancements to Splunk Edge Processor.  HEC Receiver authorization ...