Getting Data In

Mutlivalue Field Extraction

nateloepker
Explorer

Hello,

I'm writing some field extractions for a Tomcat access log. The logging format is

"%{E M/d/y @ hh:mm:ss.S a z}t %h (%{X-Forwarded-For}i) > %A:%p "%r" %{requestBodyLength}r %D %s %B %I "%{Referer}i" "%{User-Agent}i" %u %S %{username}s %{sessionTracker}s"

The X-Forwarded Field has multiple headers, so multiple X-Forwarded-For IP's are being logged for a small, but important, percentage of these events.

An example log is

Thu 1/18/2024 @ 06:52:30.918 PM UTC 00.000.00.000 (00.000.000.000, 00.000.00.00, 00.000.00.00) > 00.000.00.0:0000 "PUT /uri/query/here HTTP/1.1" -  1270 200 3466 https-openssl-nio-00.000.00.0-000-exec-15 "hxxps://url.splunk.com/" "user_agent" - - - -

How can I perform a multivalue field extraction to grab 0, 1, 2 or 3 x-forwarded-for IP's?

Labels (3)
0 Karma
1 Solution

nateloepker
Explorer

I solved it by using the max_match option in the rex command. The x-forwarded-fors were extracted into a multivalue field x_forwarded_single

| rex field=_raw "^(?P<timestamp>\w+\s\d+\/\d+\/\d+\s.\s\d+:\d+:\d+\.\d+\s\w+\s\w+)\s(?P<remote_hostname>\S+)\s\((?P<x_forwarded_for>[^\)]*)\)\s\>\s(?P<local_ip_address>\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}):(?P<local_port>[\d\-]+)\s\"(?<request>[^\"]+)\"\s(?<request_body_length>\S+)\s(?<time_milli>\S+)\s(?<http_status>\S+)\s(?<bytes_sent>\S+)\s(?<request_thread_name>\S+)\s\"(?<referer>[^\"\s]*)\"\s\"(?<user_agent>[^\"]*)\"\s(?<remote_user>\S+)\s(?<user_session_id>\S+)\s(?<username>\S+)\s(?<session_tracker>\S+)"
| rex field=request "(?<http_method>\w*)\s+(?<url>[^ ]*)\s+(?<http_version>[^\"]+)[^ \n]*"
| rex field=url "(?<uri_path>[^?]+)(?:(?<uri_query>\?.*))?"
| rex field=x_forwarded_for max_match=3 "(?<x_forwarded_single>\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})"

View solution in original post

dural_yyz
Motivator
| makeresults 
| eval tmp="Thu 1/18/2024 @ 06:52:30.918 PM UTC 00.000.00.000 (00.000.000.001, 00.000.00.01, 00.000.00.03) > 00.000.00.0:0000 \"PUT /uri/query/here HTTP/1.1\" - 1270 200 3466 https-openssl-nio-00.000.00.0-000-exec-15 \"hxxps://url.splunk.com/\" \"user_agent\" - - - -"
| rex field=tmp "^(?<timestamp>\w+\s\d+\/\d+\/\d+\s\@\s\d+:\d+:\d+\.\d+\s\w+\s\w+)\s(?<remote_hostname>\S+)\s\((?<x_forwarded_for>[^\)]+).*$"
| table tmp timestamp remote_hostname x_forwarded_for
| eval x_forwarded_for=split(replace(x_forwarded_for,"\s",""),",")

Hello,

This will auto extract a variable number of x-forwarded-for addresses and place into a multi value field. 

0 Karma

nateloepker
Explorer

I solved it by using the max_match option in the rex command. The x-forwarded-fors were extracted into a multivalue field x_forwarded_single

| rex field=_raw "^(?P<timestamp>\w+\s\d+\/\d+\/\d+\s.\s\d+:\d+:\d+\.\d+\s\w+\s\w+)\s(?P<remote_hostname>\S+)\s\((?P<x_forwarded_for>[^\)]*)\)\s\>\s(?P<local_ip_address>\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}):(?P<local_port>[\d\-]+)\s\"(?<request>[^\"]+)\"\s(?<request_body_length>\S+)\s(?<time_milli>\S+)\s(?<http_status>\S+)\s(?<bytes_sent>\S+)\s(?<request_thread_name>\S+)\s\"(?<referer>[^\"\s]*)\"\s\"(?<user_agent>[^\"]*)\"\s(?<remote_user>\S+)\s(?<user_session_id>\S+)\s(?<username>\S+)\s(?<session_tracker>\S+)"
| rex field=request "(?<http_method>\w*)\s+(?<url>[^ ]*)\s+(?<http_version>[^\"]+)[^ \n]*"
| rex field=url "(?<uri_path>[^?]+)(?:(?<uri_query>\?.*))?"
| rex field=x_forwarded_for max_match=3 "(?<x_forwarded_single>\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})"
Got questions? Get answers!

Join the Splunk Community Slack to learn, troubleshoot, and make connections with fellow Splunk practitioners in real time!

Meet up IRL or virtually!

Join Splunk User Groups to connect and learn in-person by region or remotely by topic or industry.

Get Updates on the Splunk Community!

Why Splunk Customers Should Attend Cisco Live 2026 Las Vegas

Why Splunk Customers Should Attend Cisco Live 2026 Las Vegas     Cisco Live 2026 is almost here, and this ...

What Is the Name of the USB Key Inserted by Bob Smith? (BOTS Hint, Not the Answer)

Hello Splunkers,   So you searched, “what is the name of the usb key inserted by bob smith?”  Not gonna lie… ...

Automating Threat Operations and Threat Hunting with Recorded Future

    Automating Threat Operations and Threat Hunting with Recorded Future June 29, 2026 | Register   Is your ...