Multivalue Field Extraction

nateloepker
Explorer

Hello,

I'm writing some field extractions for a Tomcat access log. The logging format is

"%{E M/d/y @ hh:mm:ss.S a z}t %h (%{X-Forwarded-For}i) > %A:%p "%r" %{requestBodyLength}r %D %s %B %I "%{Referer}i" "%{User-Agent}i" %u %S %{username}s %{sessionTracker}s"

A request can carry more than one X-Forwarded-For header, so a small but important percentage of these events log multiple X-Forwarded-For IPs.

An example log is

Thu 1/18/2024 @ 06:52:30.918 PM UTC 00.000.00.000 (00.000.000.000, 00.000.00.00, 00.000.00.00) > 00.000.00.0:0000 "PUT /uri/query/here HTTP/1.1" -  1270 200 3466 https-openssl-nio-00.000.00.0-000-exec-15 "hxxps://url.splunk.com/" "user_agent" - - - -

How can I perform a multivalue field extraction to grab 0, 1, 2 or 3 X-Forwarded-For IPs?


nateloepker
Explorer

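I solved it by using the max_match option of the rex command. The first rex captures everything between the parentheses as x_forwarded_for, and the last rex, with max_match=3, pulls each IP out of that string into the multivalue field x_forwarded_single.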

| rex field=_raw "^(?P<timestamp>\w+\s\d+\/\d+\/\d+\s.\s\d+:\d+:\d+\.\d+\s\w+\s\w+)\s(?P<remote_hostname>\S+)\s\((?P<x_forwarded_for>[^\)]*)\)\s\>\s(?P<local_ip_address>\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}):(?P<local_port>[\d\-]+)\s\"(?<request>[^\"]+)\"\s(?<request_body_length>\S+)\s(?<time_milli>\S+)\s(?<http_status>\S+)\s(?<bytes_sent>\S+)\s(?<request_thread_name>\S+)\s\"(?<referer>[^\"\s]*)\"\s\"(?<user_agent>[^\"]*)\"\s(?<remote_user>\S+)\s(?<user_session_id>\S+)\s(?<username>\S+)\s(?<session_tracker>\S+)"
| rex field=request "(?<http_method>\w*)\s+(?<url>[^ ]*)\s+(?<http_version>[^\"]+)[^ \n]*"
| rex field=url "(?<uri_path>[^?]+)(?:(?<uri_query>\?.*))?"
| rex field=x_forwarded_for max_match=3 "(?<x_forwarded_single>\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})"
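
For anyone who wants to try the last rex in isolation, here is a minimal sketch using makeresults. It assumes the parenthesised list has already been extracted into x_forwarded_for. The three sample values, the "-" placeholder for an event with no header, and the xff_count field are illustrative assumptions, not taken from the real logs.

```spl
| makeresults count=3
| streamstats count AS n
``` n=1: assumed placeholder for no header, n=2: one IP, n=3: three IPs ```
| eval x_forwarded_for=case(n=1, "-", n=2, "00.000.000.001", n=3, "00.000.000.001, 00.000.00.01, 00.000.00.03")
``` max_match=3 allows up to three matches per event; each match becomes one value of the multivalue field ```
| rex field=x_forwarded_for max_match=3 "(?<x_forwarded_single>\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})"
``` mvcount returns null when the field is missing, so coalesce turns that into 0 ```
| eval xff_count=coalesce(mvcount(x_forwarded_single), 0)
| table n x_forwarded_for x_forwarded_single xff_count
```

If more than three hops can ever appear, max_match=0 removes the limit.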


dural_yyz
Motivator
| makeresults 
| eval tmp="Thu 1/18/2024 @ 06:52:30.918 PM UTC 00.000.00.000 (00.000.000.001, 00.000.00.01, 00.000.00.03) > 00.000.00.0:0000 \"PUT /uri/query/here HTTP/1.1\" - 1270 200 3466 https-openssl-nio-00.000.00.0-000-exec-15 \"hxxps://url.splunk.com/\" \"user_agent\" - - - -"
| rex field=tmp "^(?<timestamp>\w+\s\d+\/\d+\/\d+\s\@\s\d+:\d+:\d+\.\d+\s\w+\s\w+)\s(?<remote_hostname>\S+)\s\((?<x_forwarded_for>[^\)]+).*$"
| table tmp timestamp remote_hostname x_forwarded_for
| eval x_forwarded_for=split(replace(x_forwarded_for,"\s",""),",")

Hello,

This extracts a variable number of X-Forwarded-For addresses and places them in a multivalue field.
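
One thing to check with the split approach: it keeps whatever sat between the parentheses, so a placeholder would end up as a value in the field. Below is a minimal sketch of filtering the result down to IP-shaped values. It assumes an absent header is logged as "-" (an assumption, so verify against your own events), and xff_count is a made-up field name.

```spl
| makeresults
``` assumed placeholder value for an event with no X-Forwarded-For header ```
| eval x_forwarded_for="-"
| eval x_forwarded_for=split(replace(x_forwarded_for,"\s",""),",")
``` keep only values that look like dotted-quad addresses ```
| eval x_forwarded_for=mvfilter(match(x_forwarded_for, "^\d{1,3}(\.\d{1,3}){3}$"))
| eval xff_count=coalesce(mvcount(x_forwarded_for), 0)
| table x_forwarded_for xff_count
```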

