Getting Data In

extract a specific IP with positive lookahead

avoelk
Communicator

I'm trying to extract multiple fields out of my log. my problem is that I do have multiplie ip adresses - one for the source, one from the webserver etc. so to counter having extracted four ip adresses in every event into the same field I want to use a positive lookahead to tell the regex "yes this ip but only if afterwards there comes x and y"

this is an example log:

 

2014-03-27 23:54:58 1 10.5.6.121 304 TCP_HIT 422 501 GET http assets.razerzone.com 80 /eeimages/products/13785/razer-naga-2014-right-03.png - - - - 54.230.18.168 image/png http://imgur.com/gallery/u3o7l "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Iron/31.0.1700.0 Chrome/31.0.1700.0 Safari/537.36" OBSERVED "Technology/Internet;Shopping" - 163.252.254.203 - 54351
2014-03-28 23:54:59 90670 10.62.0.120 200 TCP_NC_MISS 601 693 GET http realtime.services.disqus.com 80 /api/2/thread/2221828111 ?bust=1780 - - - realtime.services.disqus.com application/json http://disqus.com/embed/comments/?base=default&disqus_version=6c05c0ca&f=bootsnipp&t_i=9WgD&t_u=http%3A%2F%2Fbootsnipp.com%2Fsnippets%2Ffeatured%2Fminimal-preview-thumbnails&t_d=Viewing%20snippet%20Minimal%20Preview%20Thumbnails%20%7C%20Bootsnipp.com&t_t=Viewing%20snippet%20Minimal%20Preview%20Thumbnails%20%7C%20Bootsnipp.com&s_o=default "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/33.0.1750.154 Safari/537.36" OBSERVED "Newsgroups/Forums" - 163.252.254.203 184.173.90.195 52401

 

 

here I want  for example extract the source_ip (the first ip in the event) like this:

 

(\d+\.\d+\.\d+\.\d+)[[:blank:]](?=\d*)

 

 still this gives me many other ips... so I think I have to use multiple lookaheads? but I really don't know how to use it in that case. Especially when I want to extract the ip and give it a field name. In my head this means that I have to encapsulate EVERYTHING including the lookaheads ? 

 

thanks a lot for your help!

Labels (2)
0 Karma
1 Solution

avoelk
Communicator

I got my solution tha works: 

 

so first, when I want to capture a specific IP out of many I need my positive Lookahead with (?=\...)

so the src_ip regex is therefore: 

(?P<src_ip>\d*\.\d*\.\d*\.\d*)(?=\ \d* TCP_)

this makes sure that I capture the ip which has some numbers after a blank and then a TCP_ after another blank space.

to capture many different groups I just had to account for the blank spaces between the different values with \s and then start with my capture group again. : 

(?P<src_ip>\d*\.\d*\.\d*\.\d*)(?=\ \d* TCP_)\s(?P<bits>\d*)\s(?P<tcp_state>\w*_\w*)

 

View solution in original post

0 Karma

avoelk
Communicator

I got my solution tha works: 

 

so first, when I want to capture a specific IP out of many I need my positive Lookahead with (?=\...)

so the src_ip regex is therefore: 

(?P<src_ip>\d*\.\d*\.\d*\.\d*)(?=\ \d* TCP_)

this makes sure that I capture the ip which has some numbers after a blank and then a TCP_ after another blank space.

to capture many different groups I just had to account for the blank spaces between the different values with \s and then start with my capture group again. : 

(?P<src_ip>\d*\.\d*\.\d*\.\d*)(?=\ \d* TCP_)\s(?P<bits>\d*)\s(?P<tcp_state>\w*_\w*)

 

0 Karma
Got questions? Get answers!

Join the Splunk Community Slack to learn, troubleshoot, and make connections with fellow Splunk practitioners in real time!

Meet up IRL or virtually!

Join Splunk User Groups to connect and learn in-person by region or remotely by topic or industry.

Get Updates on the Splunk Community!

Casting Call: Compete in Cyber Games

Lights, Camera, SecOps: Apply to Compete in Cyber Games     Think you have what it takes to beat the clock? ...

Data Management Digest – June 2026

Welcome to the June 2026 edition of Data Management Digest! This month’s update is short and sweet, with a ...

Think Like an Architect: Introducing the Splunk Certified Cybersecurity Defense ...

In cybersecurity, defenders respond to threats. Architects design the systems that stop them.    As ...