Dashboards & Visualizations

Extracting first and last pattern from a string using regex

cindygibbs_08
Communicator

Hello People I hope everyone is doing just fine,

I have been trying to extract some values from a field without any luck. I work for a hotel company and whenever a customer uses our transportation services a field named "travel_stops" is recorded and updated. This field due to the way is "programmed" will always come in the following format:

 

 

 

.Madrid-plane.taxi$comp$uber.domestic$depart 

 

 

 

please note that the string will always start with a dot (.)

in other words every "stop" is separated by a (.) dot I want to be able to extract for each value in this field the first and last stops which for this case will be Madrid-plane and domestic$depart 

I am trying to use something like this:

 

 

 

| rex field=Stops "(?<First_Stop>[^.]+).(?<Last_Stop>.+)"

 

 

 

I also have another "setback" and it if when a custumer is awaiting his/her first transportation service, dependiong on the type of custumer the system may record something like "awaiting","push","never" ect.. and in those cases I want to be able to leave that record as it is... since I will be doing some extra work on those fiedls..

What I want:

travel_stopsFirst_StopLast_Stop
.Madrid-plane.taxi$comp$uber.domestic$departMadrid-planedomestic$depart
nevernevernull
pullpullnull


but is not givving me the expected result thank you for much to anyone who is willing to help me out! Like I truly appreciate it. also, if you have a have a link to a blog on regex in Splunk that will be so much appreciated as I will be using more of these in the future 

 Kindly,

Cindy!

Labels (1)
0 Karma
1 Solution

gcusello
SplunkTrust
SplunkTrust

Hi @cindygibbs_08,

if you have so different situations, you could try to use two regexes and the eval command, something like this:

| rex field=MenuFinal "(?<First_Stop>\.[^\.]+)\.[^\.]+\.(?<Last_Stop>.+)"
| rex field=MenuFinal "^(?<First_Stop1>\w+)(\.|$)"
| eval 
     First_Stop=coalesce(First_Stop,First_Stop1), 
     Last_Stop=if(isnull(Last_Stop),"N.A:", Last_Stop)

have a good evening

Ciao.

Giuseppe

View solution in original post

s2_splunk
Splunk Employee
Splunk Employee

Many ways to do it, here's another one: 

 

| makeresults 
| eval noTrip = "^never|pull|push$" 
| eval Stops=".madrid-plane.taxi$tour.Home&depart push pull never" 
| makemv delim=" " Stops | mvexpand Stops
| rex field=Stops "(?<First_Stop>\.[^\.]+)\.[^\.]+\.(?<Last_Stop>.+)" 
| eval First=if(match(Stops, noTrip), Stops, First_Stop) 
| eval Last=if(match(Stops, noTrip), "N/A", Last_Stop) 
| fields Stops, First, Last

 

The first four lines are there to generate some test data, so you can run this search without pulling actual data. Here's what it should produce: 

Screen Shot 2021-05-03 at 10.48.43 AM.png

Happy Splunking! 

cindygibbs_08
Communicator

this is amazing thank you so much so so much

0 Karma

gcusello
SplunkTrust
SplunkTrust

Hi @cindygibbs_08,

please try this:

| rx field=MenuFinal "(?<First_Stop>\.[^\.]+)\.[^\.]+\.(?<Last_Stop>.+)"

that you can test at https://regex101.com/r/imkXTy/1

Ciao.

Giuseppe

cindygibbs_08
Communicator

Ciao caro grazie mille per il tuo aiuto!.. excuse my italian... I am using your code but then I have values like "awaiting" or "pull" is giving me nothing....

Thank you for your help I have a question what would change if the delimitator character will be a "?" question mark... 

0 Karma

gcusello
SplunkTrust
SplunkTrust

Hi @cindygibbs_08,

thank you for you greetings in Italian!

Anyway, if you could share a sample for each case I could better analyze your needs.

Ciao.

Giuseppe

0 Karma

cindygibbs_08
Communicator

OMG thank you so much!

I want to be able to extract the first stops and the last stop for each costumer in the hotel agency.

The values look like this: 

.madrid-plane.taxi$tour.Home&depart 

as you can see they are separated by a comma, but I want to be able to extract only the first and last so for my laste example I would get 

first=madrid-plane 
last=Home&depart 

but sometimes the values in the field can look like this:


pull
push
never

since they dont have a comma.. or anything such as I would just like them to remain as they are...s basically I want a table like this

StopsFirstLast
.madrid-plane.taxi$tour.Home&departmadrid-planeHome$depart
pullpullN.A
pushpushN.A

 

Grazie mille mio amato amico

0 Karma

gcusello
SplunkTrust
SplunkTrust

Hi @cindygibbs_08,

if you have so different situations, you could try to use two regexes and the eval command, something like this:

| rex field=MenuFinal "(?<First_Stop>\.[^\.]+)\.[^\.]+\.(?<Last_Stop>.+)"
| rex field=MenuFinal "^(?<First_Stop1>\w+)(\.|$)"
| eval 
     First_Stop=coalesce(First_Stop,First_Stop1), 
     Last_Stop=if(isnull(Last_Stop),"N.A:", Last_Stop)

have a good evening

Ciao.

Giuseppe

Get Updates on the Splunk Community!

Splunk App for Anomaly Detection End of Life Announcment

Q: What is happening to the Splunk App for Anomaly Detection?A: Splunk is officially announcing the ...

Aligning Observability Costs with Business Value: Practical Strategies

 Join us for an engaging Tech Talk on Aligning Observability Costs with Business Value: Practical ...

Mastering Data Pipelines: Unlocking Value with Splunk

 In today's AI-driven world, organizations must balance the challenges of managing the explosion of data with ...