Solved: How to extract bunch of UUIDs from a string using ...

Splunk_321 · ‎12-02-2022

Hi,

I have a string in splunk logs something like below.

msg.message="Matches Logs :: Logger{clientId='hFKfFkF-K7jlp5epzCnZASazoYmXxgUzBLQ8cixb7f23afb8', apiName='Matches', apiStatus='Success', error='NA', locationIdMerchDetail=[6d65fcb6-8885-4f56-93c1-7050c8bef906 :: QUALITY COLLISION 1 LLC :: 1, e5ff5b47-839c-4ed0-86a3-87fc18f4bfda :: P A JOLLY'S LLC :: 2, 2053428f-f6ba-4038-a03e-4dbc8737c37d :: CREATIVE EXCELLENCE SALON LLC :: 3, c3e9e6fc-8388-49fd-ba7b-3b9d76f5f9ea :: QUALITY SERVICES AND APP :: 4, 75ca5712-f7a1-4a63-a69f-d73c8e7d187b :: FREEDOM COMICS LLC :: 5, e87a96e8-de73-47f8-bfbd-6099c83376f7 :: S AND G STORES LLC :: 6, 732f9d61-3916-4664-9601-dd0745b68837 :: QUALITY RESALE :: 7, d666bef7-e2fa-498f-a74f-e80f6d2701e7 :: CAKE ART SUPPLIES LLC :: 8, 23ca4856-5908-4bd6-b90d-cace07036b05 :: INTUIT PAYMENT SOLUTIONS, LLC :: 9, b583405f-bb3d-4dba-9bb3-ee9b3713b8f7 :: LA FIESTA TOLEDO LLC :: 10], numReturnedMatches='10'}"

My string contains locationIdMerchDetail as highlighted above. I need to extract locationId, rank into table first item being locationid and last item being rank in every comma separated item.

Ex: In 6d65fcb6-8885-4f56-93c1-7050c8bef906 :: QUALITY COLLISION 1 LLC :: 1

locationId : 6d65fcb6-8885-4f56-93c1-7050c8bef906

rank: 1

I am able to extract locationIds into table, using below query, but not sure how to include corresponding rank

######################################

index=app_pcf AND cf_app_name="credit-analytics-api" AND message_type=OUT AND msg.logger=c.m.c.d.MatchesApiDelegateImpl | rex field=msg.message "(?<LocationId>[0-9a-f]{8}-([0-9a-f]{4}\-){3}[0-9a-f]{12})" | table LocationId

######################################

I want a table something like below.

LocationId rank

6d65fcb6-8885-4f56-93c1-7050c8bef906 1

e5ff5b47-839c-4ed0-86a3-87fc18f4bfda 2

2053428f-f6ba-4038-a03e-4dbc8737c37d 3

..............................................................................................................

and so on

Any regex to filter these into table.

Please help.

ITWhisperer · ‎12-02-2022

Assuming the "extra" comma after "PAYMENT SOLUTIONS" is just a typo, you could try this:

| rex max_match=0 "locationIdMerchDetail=\[(?<details>[^\]]+?)\]"
| eval locationIdRank=split(details,",")
| mvexpand locationIdRank
| rex field=locationIdRank "(?<locationId>\S+).*(?<rank>\d+)"
| table locationId rank

View solution in original post

yuanliu · ‎12-02-2022

As always, I advise not to use string manipulation to handle structured data like this. If you talk with your developer, they'll tell you that this is a clever way to embed a JSON object into the log. I'm sure you already have a field named msg.messages. If you unwrap the JSON object, you'll get all fields that your developers want you to use. Not only data set is richer, but you'll not worry about future changes that developers may make to their logs.

This is how you can do it.

| eval logs = replace('msg.message', "^[^{]+", "") ``` retain embedded JSON only ```
``` unwrap several levels of escapements below ```
| eval logs = replace(logs, " *([^{=,]+)=", "\"\1\":")
| eval logs = replace(logs, ":'([^']+)'([,}])", ":\"\1\"\2")
| eval logs = replace(logs, " *([^\[:]+::[^:]+::[^,\]]+)([,\]])", "\"\1\"\2")
``` | eval valid = if(json_valid(logs), "yes", "not") ```
| spath input=logs
| mvexpand locationIdMerchDetail{}
| eval locationIdMerchDetail = split('locationIdMerchDetail{}', " :: ")
| eval locationId = mvindex(locationIdMerchDetail, 0)
| eval MerchDetail = mvindex(locationIdMerchDetail, 1)
| eval rank = mvindex(locationIdMerchDetail, 2)

Splunk_321 · ‎12-04-2022

Great approach to filter the required data. Appreciate your help. Btw, I was the developer who is new to splunk😄

yuanliu · ‎12-02-2022

With that, your sample data gives

MerchDetail	apiName	apiStatus	clientId	error	locationId	numReturnedMatches	rank
QUALITY COLLISION 1 LLC	Matches	Success	hFKfFkF-K7jlp5epzCnZASazoYmXxgUzBLQ8cixb7f23afb8	NA	6d65fcb6-8885-4f56-93c1-7050c8bef906	10	1
P A JOLLY'S LLC	Matches	Success	hFKfFkF-K7jlp5epzCnZASazoYmXxgUzBLQ8cixb7f23afb8	NA	e5ff5b47-839c-4ed0-86a3-87fc18f4bfda	10	2
CREATIVE EXCELLENCE SALON LLC	Matches	Success	hFKfFkF-K7jlp5epzCnZASazoYmXxgUzBLQ8cixb7f23afb8	NA	2053428f-f6ba-4038-a03e-4dbc8737c37d	10	3
QUALITY SERVICES AND APP	Matches	Success	hFKfFkF-K7jlp5epzCnZASazoYmXxgUzBLQ8cixb7f23afb8	NA	c3e9e6fc-8388-49fd-ba7b-3b9d76f5f9ea	10	4
FREEDOM COMICS LLC	Matches	Success	hFKfFkF-K7jlp5epzCnZASazoYmXxgUzBLQ8cixb7f23afb8	NA	75ca5712-f7a1-4a63-a69f-d73c8e7d187b	10	5
S AND G STORES LLC	Matches	Success	hFKfFkF-K7jlp5epzCnZASazoYmXxgUzBLQ8cixb7f23afb8	NA	e87a96e8-de73-47f8-bfbd-6099c83376f7	10	6
QUALITY RESALE	Matches	Success	hFKfFkF-K7jlp5epzCnZASazoYmXxgUzBLQ8cixb7f23afb8	NA	732f9d61-3916-4664-9601-dd0745b68837	10	7
CAKE ART SUPPLIES LLC	Matches	Success	hFKfFkF-K7jlp5epzCnZASazoYmXxgUzBLQ8cixb7f23afb8	NA	d666bef7-e2fa-498f-a74f-e80f6d2701e7	10	8
INTUIT PAYMENT SOLUTIONS, LLC	Matches	Success	hFKfFkF-K7jlp5epzCnZASazoYmXxgUzBLQ8cixb7f23afb8	NA	23ca4856-5908-4bd6-b90d-cace07036b05	10	9
LA FIESTA TOLEDO LLC	Matches	Success	hFKfFkF-K7jlp5epzCnZASazoYmXxgUzBLQ8cixb7f23afb8	NA	b583405f-bb3d-4dba-9bb3-ee9b3713b8f7	10	10

ITWhisperer · ‎12-02-2022

Assuming the "extra" comma after "PAYMENT SOLUTIONS" is just a typo, you could try this:

| rex max_match=0 "locationIdMerchDetail=\[(?<details>[^\]]+?)\]"
| eval locationIdRank=split(details,",")
| mvexpand locationIdRank
| rex field=locationIdRank "(?<locationId>\S+).*(?<rank>\d+)"
| table locationId rank

How to extract bunch of UUIDs from a string using regex?

field extraction

regex

Adoption of RUM and APM at Splunk

Routing logs with Splunk OTel Collector for Kubernetes

Welcome to the Splunk Community!