Splunk Search
Highlighted

IP version agnostic regular expression

Motivator

Just wondering if anybody's succeeded in creating an IP version agnostic regular expression?

I'd like one regex to match both IPv4 and IPv6 addresses, matching against any of these tests:

  • TEST: 1:2:3:4:5:6:7:8
  • TEST: 1:: 1:2:3:4:5:6:7::
  • TEST: 1::8 1:2:3:4:5:6::8 1:2:3:4:5:6::8
  • TEST: 1::7:8 1:2:3:4:5::7:8 1:2:3:4:5::8
  • TEST: 1::6:7:8 1:2:3:4::6:7:8 1:2:3:4::8
  • TEST: 1::5:6:7:8 1:2:3::5:6:7:8 1:2:3::8
  • TEST: 1::4:5:6:7:8 1:2::4:5:6:7:8 1:2::8
  • TEST: 1::3:4:5:6:7:8 1::3:4:5:6:7:8 1::8
  • TEST: ::2:3:4:5:6:7:8 ::2:3:4:5:6:7:8 ::8 ::
  • TEST: fe08::7:8%eth0 fe08::7:8%1 (link-local IPv6 addresses with zone index)
  • TEST: ::255.255.255.255 ::ffff:255.255.255.255 ::ffff:0:255.255.255.255 (IPv4-mapped IPv6 addresses and IPv4-translated addresses)
  • TEST: 2001:db8:3:4::192.0.2.33 64:ff9b::192.0.2.33 (IPv4-Embedded IPv6 Address)
  • TEST: 192.168.1.1

The script at https://gist.github.com/syzdek/6086792 does this, but it involves some extra magic to work, not just plain regex.

The closest I've come is the following:

[ipv46]
# matches a valid IPv4 or IPv6 address (change to [[octet]] and [[ipv6]]. 
# Has a problem with 1::3 (http://stackoverflow.com/questions/53497/regular-expression-that-matches-valid-ipv6-addresses)
Stolen from: https://gist.github.com/syzdek/6086792
# Extracts: ip
REGEX = (?<ip>(?:2(?:5[0-5]|[0-4][0-9])|[0-1][0-9][0-9]|[0-9][0-9]?)(?:\.(?:2(?:5[0-5]|[0-4][0-9])|[0-1][0-9][0-9]|[0-9][0-9]?)){3}|([0-9a-fA-F]{1,4}:){7,7}[0-9a-fA-F]{1,4}|([0-9a-fA-F]{1,4}:){1,7}:|([0-9a-fA-F]{1,4}:){1,6}:[0-9a-fA-F]{1,4}|([0-9a-fA-F]{1,4}:){1,5}(:[0-9a-fA-F]{1,4}){1,2}|([0-9a-fA-F]{1,4}:){1,4}(:[0-9a-fA-F]{1,4}){1,3}|([0-9a-fA-F]{1,4}:){1,3}(:[0-9a-fA-F]{1,4}){1,4}|([0-9a-fA-F]{1,4}:){1,2}(:[0-9a-fA-F]{1,4}){1,5}|[0-9a-fA-F]{1,4}:((:[0-9a-fA-F]{1,4}){1,6})|:((:[0-9a-fA-F]{1,4}){1,7}|:)|fe08:(:[0-9a-fA-F]{1,4}){2,2}%[0-9a-zA-Z]{1,}|::(ffff(:0{1,4}){0,1}:){0,1}((25[0-5]|(2[0-4]|1{0,1}[0-9]){0,1}[0-9])\.){3,3}(25[0-5]|(2[0-4]|1{0,1}[0-9]){0,1}[0-9])|([0-9a-fA-F]{1,4}:){1,4}:((25[0-5]|(2[0-4]|1{0,1}[0-9]){0,1}[0-9])\.){3,3}(25[0-5]|(2[0-4]|1{0,1}[0-9]){0,1}[0-9]))

However it breaks at tests like 2001:db8:3:4::192.0.2.33 and 1::8

Splunk has a built-in transform called octet, but no such transform for ipv6 addresses.

Anyone?

Tags (3)
Highlighted

Re: IP version agnostic regular expression

SplunkTrust
SplunkTrust

So this will match a lot of your examples. BUT it will also match single characters from [a-f].... that needs fixed.

NOTE: These regexes will NOT VALIDATE the IP, merely match the structure.

((::)?[\da-f]{1,4}[:\.]{0,2}){1,8}

It may be easier to match IPv4, and then IPv6 and combine it with an |.

This matches every single item in your list, without single characters and places it into a single capture group for use.

((?:(?:\d{1,3}\.){3}(?:\d{1,3}))|(?:(?:::)?(?:[\dA-Fa-f]{1,4}:{1,2}){1,7}(?:[\d\%A-Fa-z\.]+)?(?:::)?)|(?:::[\dA-Fa-f\.]{1,15})|(?:::))

So you could do:

| rex field=_raw "(?<src_ip>(?:(?:\d{1,3}\.){3}(?:\d{1,3}))|(?:(?:::)?(?:[\dA-Fa-f]{1,4}:{1,2}){1,7}(?:[\d\%A-Fa-z\.]+)?(?:::)?)|(?:::[\dA-Fa-f\.]{1,15})|(?:::))"

View solution in original post

Highlighted

Re: IP version agnostic regular expression

Motivator

This is pure gold! Thanks a lot. I will add this to my Cisco Networks app to make it IP version agnostic. I'll attribute you!

0 Karma