Help with Field Extraction using Regex?

alexspunkshell
Contributor

I am trying to extract 2 fields from my logs. 

Logs:

 

10.218.136.20 - - [30/Jun/2023:02:36:32 +0000] "GET /api/v2/runs/run-g1mhsXooK6aKV9bS?include=plan%2Ccost_estimate%2Capply%2Ccreated_by HTTP/1.1" 200 5460 "https://terraform.srv.companyname.com.au/app/customer/workspaces/a00ccc-tfe-test02-customer_infra_ping/runs/run-g1mhsXooK6aKV9bS" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/114.0.0.0 Safari/537.36"

 

Here I want to extract two new fields:

1. workspace_name="a00ccc-tfe-test02-customer_infra_ping"

2. workspace_id="g1mhsXooK6aKV9bS"

Please help me with the regex, and thanks in advance!

 

 


isoutamo
SplunkTrust

Hi

you could try 

rex "workspaces\/(?<workspace_name>[^\/]+)\/runs\/run-(?<workspace_id>[^\"]+)"

see https://regex101.com/r/aywynb/1
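If you want to sanity-check the pattern outside Splunk, here is a minimal sketch using Python's `re` module against the sample event from the question. Two things are my own assumptions and not part of the thread: Python needs the `(?P<name>...)` spelling for named groups (Splunk's rex accepts `(?<name>...)`), and the user agent is shortened to keep the example small.

```python
import re

# Python spells named groups (?P<name>...); Splunk's rex accepts (?<name>...).
PATTERN = re.compile(
    r'workspaces/(?P<workspace_name>[^/]+)/runs/run-(?P<workspace_id>[^"]+)'
)

# Sample event from the question (user agent shortened for brevity).
EVENT = (
    '10.218.136.20 - - [30/Jun/2023:02:36:32 +0000] '
    '"GET /api/v2/runs/run-g1mhsXooK6aKV9bS'
    '?include=plan%2Ccost_estimate%2Capply%2Ccreated_by HTTP/1.1" '
    '200 5460 '
    '"https://terraform.srv.companyname.com.au/app/customer/workspaces/'
    'a00ccc-tfe-test02-customer_infra_ping/runs/run-g1mhsXooK6aKV9bS" '
    '"Mozilla/5.0"'
)


def extract(event):
    """Return the two captured fields as a dict, or None if nothing matches."""
    match = PATTERN.search(event)
    return match.groupdict() if match else None


fields = extract(EVENT)
print(fields)
```

This only checks the regular expression itself. It says nothing about how Splunk parses the quoted string passed to rex, so the result still needs to be confirmed in a real search.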

r. Ismo 

alexspunkshell
Contributor

@isoutamo Thanks for your reply.

The results for workspace_name are fine, but workspace_id ends with a backslash. Can I get the result without the \?

[Screenshot attached: alexspunkshell_0-1688490556809.png]

 


isoutamo
SplunkTrust

Your examples didn't contain a / in workspace_id! You could try this:

 

rex "workspaces\/(?<workspace_name>[^\/]+)\/runs\/run-(?<workspace_id>[^\"\\]+)"
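To illustrate what adding the backslash to the excluded characters changes, here is a small Python sketch. The thread never establishes where the trailing \ comes from. The event below is hypothetical: it assumes raw text in which the quotes are backslash-escaped, and its host and workspace name are made up. Python's `(?P<name>...)` named-group spelling is used in place of rex's `(?<name>...)`.

```python
import re

# Python spells named groups (?P<name>...); Splunk's rex accepts (?<name>...).
LOOSE = re.compile(
    r'workspaces/(?P<workspace_name>[^/]+)/runs/run-(?P<workspace_id>[^"]+)'
)
STRICT = re.compile(
    r'workspaces/(?P<workspace_name>[^/]+)/runs/run-(?P<workspace_id>[^"\\]+)'
)

# Hypothetical raw text (an assumption, not taken from the thread) in which
# the quotes around the referrer are backslash-escaped.
ESCAPED_EVENT = (
    r'\"https://tfe.example.invalid/app/customer/workspaces/'
    r'demo-workspace/runs/run-ACevPzmMTYE6UP5e\" \"Mozilla/5.0\"'
)

loose_id = LOOSE.search(ESCAPED_EVENT).group('workspace_id')
strict_id = STRICT.search(ESCAPED_EVENT).group('workspace_id')

print(repr(loose_id))   # the backslash before the quote is captured
print(repr(strict_id))  # the capture stops before the backslash
```

If the real events have no escaped quotes, both patterns capture the same value, and the stray \ has to come from somewhere else.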

 

alexspunkshell
Contributor

@isoutamo I am still getting "\" at the end of the workspace_id results.


isoutamo
SplunkTrust

Then your logs don't match the examples you posted in this thread. Can you share some more events, especially ones that have the \ at the end of workspace_id?

alexspunkshell
Contributor

@isoutamo 

Other logs

 

10.218.136.20 - - [30/Jun/2023:02:36:37 +0000] "GET /api/v2/workspaces/ws-ukz9TnHNE9kN4eCa?include=agent_pool%2Ccurrent_configuration_version%2Ccurrent_run%2Ccurrent_state_version%2Clocked_by%2Creadme%2Coutputs HTTP/1.1" 304 0 "https://terraform.srv.companyname.com.au/app/customer/workspaces/a00ccc-tfe-dev02-customer_infra/runs/run-ACevPzmMTYE6UP5e" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/114.0.0.0 Safari/537.36"


10.218.136.20 - - [30/Jun/2023:02:36:40 +0000] "GET /api/v2/runs/run-Y63d5qeBk3pDHpJZ/run-events?include=comment%2Cactor HTTP/1.1" 304 0 "https://terraform.srv.companyname.com.au/app/customer/workspaces/a00964-tfe-test-customer_infra_main/runs/run-Y63d5qeBk3pDHpJZ" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/114.0.0.0 Safari/537.36"

 


isoutamo
SplunkTrust

Very interesting. As you can see here https://regex101.com/r/0gc9lk/1 those are extracted correctly. Are you sure that you aren't adding that \ to the string yourself? In your examples there is no \ character at the end of workspace_id!

Can you show your SPL?

alexspunkshell
Contributor

@isoutamo 

[Screenshots attached: alexspunkshell_0-1688493970752.png, alexspunkshell_1-1688493832345.png]

 


alexspunkshell
Contributor

@isoutamo 

In the first line of the log below, run-Y63... ends with a /, whereas in the third line it ends normally. Does this make a difference?

10.218.136.20 - - [30/Jun/2023:02:36:40 +0000] "GET /api/v2/runs/run-Y63d5qeBk3pDHpJZ/run-events?include=comment%2Cactor HTTP/1.1" 304 0 "https://terraform.srv.companyname.com.au/app/customer/workspaces/a00964-tfe-test-customer_infra_main/runs/run-Y63d5qeBk3pDHpJZ" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/114.0.0.0 Safari/537.36"

 


isoutamo
SplunkTrust

This rex shouldn't match the first occurrence of runs/run-xxx, because it expects workspaces/xxxx to come directly before it. So it matches the second occurrence,

runs/run-Y63d5qeBk3pDHpJZ"

which doesn't contain a / character. BTW, there was a mistake in the rex: \/ instead of \\. I have already fixed it, so you should fix it in your extractions too.
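A quick way to see which occurrence is matched is the Python sketch below, run on the event from your last post. It is only an illustration under two assumptions of mine: Python's `(?P<name>...)` named-group spelling stands in for rex's `(?<name>...)`, and the user agent is shortened.

```python
import re

# Python spells named groups (?P<name>...); Splunk's rex accepts (?<name>...).
PATTERN = re.compile(
    r'workspaces/(?P<workspace_name>[^/]+)/runs/run-(?P<workspace_id>[^"\\]+)'
)

# Event from the thread (user agent shortened for brevity).
EVENT = (
    '10.218.136.20 - - [30/Jun/2023:02:36:40 +0000] '
    '"GET /api/v2/runs/run-Y63d5qeBk3pDHpJZ/run-events'
    '?include=comment%2Cactor HTTP/1.1" '
    '304 0 '
    '"https://terraform.srv.companyname.com.au/app/customer/workspaces/'
    'a00964-tfe-test-customer_infra_main/runs/run-Y63d5qeBk3pDHpJZ" '
    '"Mozilla/5.0"'
)

match = PATTERN.search(EVENT)

# Offset of the first "runs/run-" (in the request path, followed by "/").
first_run = EVENT.index('runs/run-')
# Offset of the "runs/run-" that the pattern actually matched.
matched_run = match.start('workspace_id') - len('runs/run-')

print(match.groupdict())
print(matched_run > first_run)  # the match is the later occurrence
```

The first runs/run- in the request path is skipped because no workspaces/<name>/ precedes it, so the / after it never reaches the capture.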

Can you test this with plain SPL, without the configured extractions?

Something like:

index=test sourcetype="testdata"
| rex "workspaces\/(?<workspace_name>[^\/]+)\/runs\/run-(?<workspace_id>[^\"\\]+)" 
| table _time workspace_name workspace_id

Just check that this also works on your real data. If needed, you could rename workspace_id to AAAAAA in the rex to be sure that the field isn't defined somewhere else.

Are you sure you don't have an earlier definition of workspace_id somewhere? If you can log in to the command line, you could try:

splunk btool props list --debug testdata

This should show all props definitions for that sourcetype and where they are defined.
