Splunk Search

Extract access_combined from JSON msg field

johnansett
Communicator

Hello! I have JSON events coming from Pivotal Cloud Foundry. The JSON includes a 'msg' field, which contains what looks like an access_combined event:

{   [-] 
     cf_app_id:  caf9c86b-8672-48b8-90eb-d04b96e36cf3   
     cf_app_name:    app-name-1
     cf_ignored_app:     false  
     cf_org_id:  82e3a4a8-a40c-48bc-82e8-488acd0976ce   
     cf_org_name:    orgname    
     cf_origin:  firehose   
     cf_space_id:    df0e696d-93ca-4c69-ba91-e69ae8d2ab15   
     cf_space_name:  qa-web 
     deployment:     p-isolation-segment-f2a8ba4dfa4dca195b26   
     event_type:     LogMessage 
     ip:     10.1.1.1   
     job:    isolated_router    
     job_index:  fb34ddbe-0f9d-4645-847b-17d833fee1b1   
     message_type:   OUT    
     msg:    app1.company.com - [2019-06-03T18:42:33.399+0000] "PUT /api/updateQueue HTTP/1.1" 200 3394 3394 "https://qaapps2.company.com/app/app2/queue/venmie" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) HeadlessChrome/74.0.3729.169 Safari/537.36" "10.10.10.254:43698" "10.10.10.24:61044" x_forwarded_for:"10.10.10.254, 10.10.10.254" x_forwarded_proto:"https" vcap_request_id:"4cedcf4b-8651-4a55-692d-77a25b790381" response_time:1.801544121 app_id:"caf9c86b-8672-48b8-90eb-d04b96e36cf3" app_index:"0" x_b3_traceid:"a7aadd84ddbc3b8c" x_b3_spanid:"a7aadd84ddbc3b8c" x_b3_parentspanid:"-"

     origin:     gorouter   
     source_instance:    1  
     source_type:    RTR    
     timestamp:  1559587355201956600    
}

I would like to expand the 'msg' field and then extract the fields inside it, e.g. status (200) and URI (https://qaapps2.company.com/app/app2/queue/venmie). Ideally I'd like Splunk to do this automatically, as it already has these fields defined for sourcetype=access_combined.

What options do I have to do this?

Thanks!


martynoconnor
Communicator

Hi there,

While you are right that there is an out-of-the-box sourcetype definition for access_combined, this data will likely not be coming into Splunk under that sourcetype (and even if it were, the field extractions wouldn't work, because the event is not actually in access_combined format). You can still extract the fields, either with custom EXTRACT-name definitions in props.conf (the better way to do things if you have, or will have, more than one indexer) or in the search itself, using the rex command to extract the fields at search time.

An example of this might be something like:

index=<your_index> sourcetype=<your_sourcetype>
| rex field=<the_field_to_extract_from> "<the_regular_expression_with_named_capture_group>"
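To make the template above concrete, here is a rough sketch written against the sample 'msg' value in the question. It assumes the JSON is already parsed at search time so that msg and source_type exist as fields, and that every router line follows the same layout as the sample. The capture-group names (request_host, method, uri, status, bytes_in, bytes_out, referer, useragent, response_time) are my own choices, loosely modelled on the access_combined field names. In particular, which of the two byte counters is "in" and which is "out" is an assumption, so please verify it against your own data.

```
index=<your_index> sourcetype=<your_sourcetype> source_type=RTR
| rex field=msg "^(?<request_host>\S+) - \[(?<request_time>[^\]]+)\] \"(?<method>\S+) (?<uri>\S+) (?<protocol>[^\"]+)\" (?<status>\d+) (?<bytes_in>\d+) (?<bytes_out>\d+) \"(?<referer>[^\"]*)\" \"(?<useragent>[^\"]*)\""
| rex field=msg "response_time:(?<response_time>[\d.]+)"
| table request_host method uri status bytes_in bytes_out referer useragent response_time
```

On the sample event this should give method=PUT, uri=/api/updateQueue and status=200, with the https://qaapps2.company.com/... URL landing in referer. Once the regex behaves as expected, the same expression (without the escaped quotes) could be moved into an EXTRACT- setting in props.conf using the "<regex> in msg" form, so that the fields are extracted automatically at search time.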