Splunk Search

Question for extracting a string out of longer string

strehb18
Path Finder

Hello, I am looking to create a new field based on a section from a longer string/web address. I didn't see what i was looking in a search on this site. I assume its possible with a regex but I am not good at those. 

Here are examples of fields I currently have... I am looking to extract the Location part of the string. The Company\Segment\Name does not change before Location. Location will change and is what I need. 

\Company\Segment\Name\Location\Area

\Company\Segment\Name\Location\Area

I do not need the Area. 

Thanks,

Labels (3)
Tags (3)
1 Solution

chaday00
Path Finder

I'm not complete sure how your location is configured so I made some assumptions.

First I created a CSV file to test with and then created a lookup table:

location_number, location_path
1,"\Oshkosh Corporation\Defense\Oshkosh Defense\New York, NY\Staten Island"
2,"\Oshkosh Corporation\Defense\Oshkosh Defense\Atlanta, GA\Bankhead"
3,"\Oshkosh Corporation\Defense\Oshkosh Defense\Atlanta, GA\Peachtree Center"
4,"\Oshkosh Corporation\Defense\Oshkosh Defense\Seattle, WA\Pier"

Then using the query below I was able to get the rex working--turns out I needed 3 backslashes:

 

| inputlookup test_urls.csv | rex field=location_path "\\\Oshkosh Corporation\\\Defense\\\Oshkosh Defense\\\(?<location>\w+.*\,[\sA-Z]{3})" | table location

 

Output: 

location
New York, NY
Atlanta, GA
Atlanta, GA
Seattle, WA

 

View solution in original post

chaday00
Path Finder

Might need a little more information than you provided, but the below may work:

 

| rex field=url \\Company\\Segment\\Name\\(?<location>\w+)\\Area

 

This assumes the following are static: Company, Segment, Name, Area and creates a new field called 'location'

\w+ means match any word character (i.e., Matches any letter, digit or underscore. Equivalent to [a-zA-Z0-9_]), one or more times

You could certainly use regex on the entire string too and only return the location if you wanted. 

0 Karma

strehb18
Path Finder

I think this is probably very close to what I need. Area is not static however, it will change based on what location is but I won't need that value. 

Also, if the field that the original string is in is called location_path. Does that replace "field" or "url" in the above code snippet? 

And just so I am clear, that line of code should return what is the "location" part of the string and put it into its own field labeled as location? 

chaday00
Path Finder

Correct. If I did it correctly, this will create a field called 'location'. You could then view a list of 'locations' like so:

 

| table location

 

The 'url' would need to be replaced with whatever field name you have this data stored in. 

To handle the 'area' being different, we'd want to vary the ending a little:

 

| rex field=url \\Company\\Segment\\Name\\(?<location>\w+)\\\w+.*

 

 This will return only what was in the Location portion of your URL. Everything after the `\` would be ignored. 

0 Karma

strehb18
Path Finder

I get the following error with this but it seems on the verge of working. 

"Error in 'rex' command: Encountered the following error while compiling the regex '\Company': Regex: unrecognized character follows \.

Does this potentially have to do with what actually fills spots between \ has spaces in it? 

ex: \Company Name\Segment\Name Name\Location\Area

chaday00
Path Finder

You have to escape the backslash with another backslash `\\`, this will process the `\` as a literal otherwise it will try to parse \C as a regex expression which it is not. 

Also, I forgot the quotes:

| rex field=url "\\Company\\Segment\\Name\\(?<location>\w+)\\\w+.*"
0 Karma

strehb18
Path Finder

I think this is close but I think it may be my error. Even with the " " I still get the following error using my actual data. I get myself confused trying to transfer to the data I am actually using. To be clear, I am trying to extract everything between the \ \ where location is. 

Actual Input (location_path is the field the data is in.) 

| rex field=location_path "\\Oshkosh Corporation\\Defense\\Oshkosh Defense\\(?<location>\w+)\\\w +.*"

Error Message

Error in 'rex' command: Encountered the following error while compiling the regex '\Oshkosh Corporation\Defense\Oshkosh Defense\(?<location>\w+)\\w +.*': Regex: unrecognized character follows \.

chaday00
Path Finder

In your 'area' section, it looks like there is  a space after the 'w', it should be `\w+.*`

I couldn't find it in the documentation, but you may have to escape the white spaces too. Either way, the last query below, may help. 🤞

| rex field=location_path "\\Oshkosh Corporation\\Defense\\Oshkosh Defense\\(?<location>\w+)\\\w+.*"

| rex field=location_path "\\Oshkosh\ Corporation\\Defense\\Oshkosh\ Defense\\(?<location>\w+)\\\w+.*"

\\\w+.*\s\w+.*\\\w+.*\\\w+.*\s\w+.*\\(?<location>\w+)\\\w+.*

 

0 Karma

strehb18
Path Finder

I get the same error without the space. 

chaday00
Path Finder

I'm not complete sure how your location is configured so I made some assumptions.

First I created a CSV file to test with and then created a lookup table:

location_number, location_path
1,"\Oshkosh Corporation\Defense\Oshkosh Defense\New York, NY\Staten Island"
2,"\Oshkosh Corporation\Defense\Oshkosh Defense\Atlanta, GA\Bankhead"
3,"\Oshkosh Corporation\Defense\Oshkosh Defense\Atlanta, GA\Peachtree Center"
4,"\Oshkosh Corporation\Defense\Oshkosh Defense\Seattle, WA\Pier"

Then using the query below I was able to get the rex working--turns out I needed 3 backslashes:

 

| inputlookup test_urls.csv | rex field=location_path "\\\Oshkosh Corporation\\\Defense\\\Oshkosh Defense\\\(?<location>\w+.*\,[\sA-Z]{3})" | table location

 

Output: 

location
New York, NY
Atlanta, GA
Atlanta, GA
Seattle, WA

 

strehb18
Path Finder

Yes! This worked for my data. Thank you for all the help. 

Get Updates on the Splunk Community!

Announcing Scheduled Export GA for Dashboard Studio

We're excited to announce the general availability of Scheduled Export for Dashboard Studio. Starting in ...

Extending Observability Content to Splunk Cloud

Watch Now!   In this Extending Observability Content to Splunk Cloud Tech Talk, you'll see how to leverage ...

More Control Over Your Monitoring Costs with Archived Metrics GA in US-AWS!

What if there was a way you could keep all the metrics data you need while saving on storage costs?This is now ...