Splunk Search

How to extract a portion of different strings using regex?

aditsss
Motivator

Hi Everyone,

Can anyone help me out with this?

I have a field named Request_URL whose value is different each time.

Below are some examples of my Request_URL values:

https://xyz/api/connections/c1d30603ddf0

https://yte/api/flow/groups/314e8fead333/controller-services

https://tyu/api/services/968d06b5666b

https://hju/api/processors/b5f990b529f4/run-status

I want to extract the ID portion ("c1d30603ddf0", "b5f990b529f4", "314e8fead333") from every Request_URL, given that Request_URL is different for each event.

Can someone guide me with a regular expression for this in Splunk?

Thanks in advance.

gcusello
SplunkTrust

Hi @aditsss,

the only way is to identify the possible words before the value to extract (in your examples: connections, groups, services, processors) and insert them in the regex, something like this:

| rex "\/(connections|groups|services|processors)\/(?<Request_URL>\w*)"

You can test it at https://regex101.com/r/Tt0jLf/1

Ciao.

Giuseppe


aditsss
Motivator

Hi gcusello,

| rex "\/(connections|groups|services|processors)\/(?<Request_URL>\w*)"

I can follow this, but I need to extract the ID from Request_URL and put it in a separate column (named id1 or something similar) that contains only the ID part of Request_URL.

Currently, with this change:

| rex "\/(connections|groups|services|processors)\/(?<Request_URL>\w*)"

the extracted value is written directly into Request_URL. I need both Request_URL, which should contain the complete URL, and a separate field that contains only the ID from Request_URL.

Can you provide the regex for that?
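
One possible way to keep both values is to give the named capture group a different name from the source field. The sketch below is in Python so it can be run outside Splunk; the group name id and the helper add_id are illustrative assumptions, not something from the replies above. In SPL the same idea would be to run rex with field=Request_URL and a group such as (?<id>\w+), so the capture lands in a new id field and Request_URL keeps the full URL.

```python
import re

# Same alternation as in the reply above, but the named group is "id" rather
# than "Request_URL". Python's re needs (?P<id>...); Splunk's rex also
# accepts the (?<id>...) spelling.
PATTERN = re.compile(r"/(?:connections|groups|services|processors)/(?P<id>\w+)")

def add_id(event):
    """Return a copy of the event with an extra 'id' key when the URL matches."""
    out = dict(event)
    match = PATTERN.search(event["Request_URL"])
    if match:
        out["id"] = match.group("id")
    return out

# Sample URLs taken from the question.
events = [
    {"Request_URL": "https://xyz/api/connections/c1d30603ddf0"},
    {"Request_URL": "https://yte/api/flow/groups/314e8fead333/controller-services"},
    {"Request_URL": "https://tyu/api/services/968d06b5666b"},
    {"Request_URL": "https://hju/api/processors/b5f990b529f4/run-status"},
]

results = [add_id(e) for e in events]
for row in results:
    print(row["Request_URL"], row.get("id"))
```

Each row still carries the complete Request_URL, and the ID sits next to it under its own key.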


aditsss
Motivator

Can you please advise on this as soon as possible? It is urgently required.


aditsss
Motivator

Hi gcusello,

I can't hardcode the words like this because there are multiple URLs. These are just some examples I have given.

| rex "\/(connections|groups|services|processors)\/(?<Request_URL>\w*)"

Previously I only had this URL:

https://xyz/api/groups/230df08c/registry

So I tried the following, and it worked. It created a new column, process, and extracted the "230df08c" part (the ID) into it.

| rex field=Request_URL "groups\/(?<process>[^\/]+)"

Can you please guide me on how to do this now?
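
If the IDs always look like the samples in this thread (lowercase hexadecimal, at least 8 characters, filling a whole path segment), one option is to match the shape of the ID itself rather than the word in front of it. That shape is an assumption drawn only from the examples here, and the names ID_SEGMENT and extract_id are hypothetical. A minimal Python sketch to check the pattern outside Splunk:

```python
import re

# Assumption: an ID is a path segment made only of lowercase hex characters,
# at least 8 long. Python needs (?P<id>...); Splunk's rex also accepts (?<id>...).
ID_SEGMENT = re.compile(r"/(?P<id>[0-9a-f]{8,})(?=/|$)")

def extract_id(url):
    """Return the first hex-looking path segment, or None if there is none."""
    match = ID_SEGMENT.search(url)
    return match.group("id") if match else None

# Sample URLs taken from the posts above.
urls = [
    "https://xyz/api/connections/c1d30603ddf0",
    "https://yte/api/flow/groups/314e8fead333/controller-services",
    "https://tyu/api/services/968d06b5666b",
    "https://hju/api/processors/b5f990b529f4/run-status",
    "https://xyz/api/groups/230df08c/registry",
]

ids = [extract_id(u) for u in urls]
for url, found in zip(urls, ids):
    print(found, url)
```

In a search this would correspond to something like | rex field=Request_URL "\/(?<id>[0-9a-f]{8,})(?=\/|$)", to be tightened if the real IDs have a different length or alphabet.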


SplunkRaz
Loves-to-Learn

It looks like you have a dynamic string to match with a regex.

See this post:

https://community.splunk.com/t5/Splunk-Search/Regex-for-dynamic-string/m-p/183009/highlight/true#M52...

aditsss
Motivator

Hi,

My URLs are not dynamic, but I don't want to list all the words that can come before the ID.

https://xyz/api/connections/c1d30603ddf0

https://yte/api/flow/groups/314e8fead333/controller-services

https://tyu/api/services/968d06b5666b

https://hju/api/processors/b5f990b529f4/run-status

I tried this:

| rex field=Request_URL "\/(controller|process-groups|connections|processors)\/(?<process>[^\/]+)"

With that, only the URLs containing one of these four words are returned. Other URLs, such as the one below, are not:

https://yui/api/flow/config

I want all the data to be displayed, and for the URLs that contain an ID such as "b5f990b529f4", I want to extract that ID. Do I need multiple regexes, or can a single regex handle this?

Please guide me on that.
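
A single pattern can cover both cases if a failed match is treated as "no ID" rather than as a reason to drop the row. As far as I know, rex on its own only adds the field to events that match and leaves the others in the results, so rows going missing may point to a filter elsewhere in the search. The Python sketch below mimics that behavior; the hex-segment pattern, the "-" placeholder, and the function name tag_events are assumptions for illustration.

```python
import re

# Assumption: IDs are lowercase hex path segments of 8 or more characters.
ID_SEGMENT = re.compile(r"/(?P<id>[0-9a-f]{8,})(?=/|$)")

def tag_events(urls, placeholder="-"):
    """Keep every URL; fill 'id' with the match or with a placeholder."""
    rows = []
    for url in urls:
        match = ID_SEGMENT.search(url)
        rows.append({
            "Request_URL": url,
            "id": match.group("id") if match else placeholder,
        })
    return rows

# Sample URLs taken from the post above, including one without an ID.
urls = [
    "https://xyz/api/connections/c1d30603ddf0",
    "https://yte/api/flow/groups/314e8fead333/controller-services",
    "https://tyu/api/services/968d06b5666b",
    "https://hju/api/processors/b5f990b529f4/run-status",
    "https://yui/api/flow/config",
]

rows = tag_events(urls)                          # all five rows are kept
with_id = [r for r in rows if r["id"] != "-"]    # optional filter, applied separately
for r in rows:
    print(r["id"].ljust(14), r["Request_URL"])
```

In SPL, a comparable placeholder could be set after the rex with fillnull value="-" id.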


SplunkRaz
Loves-to-Learn

If you check the link I provided, it reads:

The easiest solution is probably to rewrite the events with SEDCMD in props.conf on your indexer (or Heavy Forwarder):

[your sourcetype]
SEDCMD-blah = s/(\w+\.exe=\d{4,})/m_\1/g

As you can see, there are some assumptions here:
1) that all the stuff you want to rename ends in .exe
2) that they have at least a 4-digit value (i.e. 1000 or more)
3) that the binaries (i.e. field names) can contain only certain characters.

Adjust these things to suit your actual environment. Please note that this will actually change the events before they are written to disk, so if you're not allowed to tamper with the data, this might not be the way to go.
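
To make the quoted rule concrete, here is a small Python sketch of the same substitution applied to a made-up event. The sample text and the function name rewrite_event are hypothetical; only the regular expression and the m_ prefix come from the quoted SEDCMD line. Note that it targets name.exe=number pairs, so it would need a different pattern to act on the URLs in this thread.

```python
import re

# Same regex as the quoted SEDCMD: a word, ".exe=", then at least 4 digits.
EXE_PAIR = re.compile(r"(\w+\.exe=\d{4,})")

def rewrite_event(raw):
    """Prefix every matching name.exe=number pair with 'm_'."""
    return EXE_PAIR.sub(r"m_\1", raw)

sample = "chrome.exe=12345 notepad.exe=12"  # hypothetical event text
rewritten = rewrite_event(sample)
print(rewritten)
```

Only the first pair is rewritten here, because the second value has fewer than four digits.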

 

I agree with that approach; you could try rewriting your events in the header.

aditsss
Motivator

Can someone please guide me on this? It's really needed.

aditsss
Motivator

Hi, can anyone help me out with this? It's really urgent.
