Splunk Search

Extract access_combined from JSON msg field

johnansett
Communicator

Hello! I have JSON events coming from Pivotal Cloud Foundry. The JSON includes a 'msg' field that contains what looks like an access_combined event:

{
     cf_app_id:  caf9c86b-8672-48b8-90eb-d04b96e36cf3   
     cf_app_name:    app-name-1
     cf_ignored_app:     false  
     cf_org_id:  82e3a4a8-a40c-48bc-82e8-488acd0976ce   
     cf_org_name:    orgname    
     cf_origin:  firehose   
     cf_space_id:    df0e696d-93ca-4c69-ba91-e69ae8d2ab15   
     cf_space_name:  qa-web 
     deployment:     p-isolation-segment-f2a8ba4dfa4dca195b26   
     event_type:     LogMessage 
     ip:     10.1.1.1   
     job:    isolated_router    
     job_index:  fb34ddbe-0f9d-4645-847b-17d833fee1b1   
     message_type:   OUT    
     msg:    app1.company.com - [2019-06-03T18:42:33.399+0000] "PUT /api/updateQueue HTTP/1.1" 200 3394 3394 "https://qaapps2.company.com/app/app2/queue/venmie" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) HeadlessChrome/74.0.3729.169 Safari/537.36" "10.10.10.254:43698" "10.10.10.24:61044" x_forwarded_for:"10.10.10.254, 10.10.10.254" x_forwarded_proto:"https" vcap_request_id:"4cedcf4b-8651-4a55-692d-77a25b790381" response_time:1.801544121 app_id:"caf9c86b-8672-48b8-90eb-d04b96e36cf3" app_index:"0" x_b3_traceid:"a7aadd84ddbc3b8c" x_b3_spanid:"a7aadd84ddbc3b8c" x_b3_parentspanid:"-"

     origin:     gorouter   
     source_instance:    1  
     source_type:    RTR    
     timestamp:  1559587355201956600    
}

I would like to expand out the 'msg' field and then extract the fields from it - e.g. status (200), URI (https://qaapps2.company.com/app/app2/queue/venmie). Ideally I'd like Splunk to do this automatically, as it already has these fields defined for sourcetype=access_combined.

What options do I have to do this?

Thanks!

martynoconnor
Communicator

Hi there,

While you are right that there is an out-of-the-box sourcetype definition for access_combined, this data is probably not coming into Splunk under that sourcetype (and even if it were, the field extractions wouldn't work, because the event is not actually in access_combined format). You can still extract the fields, however, using either custom EXTRACT-<name> definitions in props.conf (the better option if you want the fields extracted automatically in every search on that sourcetype) or the rex command in the search itself. Both approaches extract the fields at search time.

An example of this might be something like

index=<your_index> sourcetype=<your_sourcetype>
| rex field=<the_field_to_extract_from> "<the_regular_expression_with_named_capture_group>"
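
For the sample event in the question, a more concrete version might look like the sketch below. It assumes the JSON has already been parsed so that msg is available as a search-time field, and the capture group names (req_host, req_time, method, uri, version, status, referer, useragent) are illustrative choices that loosely follow the access_combined field names rather than anything Splunk defines for this data. The regular expression is written against the one sample msg shown above, so it may need adjusting for other events.

```
index=<your_index> sourcetype=<your_sourcetype> event_type=LogMessage source_type=RTR
| rex field=msg "^(?<req_host>\S+)\s+-\s+\[(?<req_time>[^\]]+)\]\s+\"(?<method>\S+)\s+(?<uri>\S+)\s+(?<version>[^\"]+)\"\s+(?<status>\d{3})\s+\d+\s+\d+\s+\"(?<referer>[^\"]*)\"\s+\"(?<useragent>[^\"]*)\""
| table _time req_host method uri status referer useragent
```

With the sample event, this should give method=PUT, uri=/api/updateQueue and status=200. Note that the https://qaapps2.company.com/app/app2/queue/venmie value from the question lands in referer, not uri, because it is the second-to-last quoted string before the address fields.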