How to match strings that include random numbers with a lookup?

klischatb
Path Finder

Hello Team,
I have the following problem.

In my data I have strings like:
Error in Data | 5432323 from endpoint 543336
Error in Data | 1344214 from endpoint 543446
Error in Data | 1323214 from endpoint 545536

The field in Splunk is called error_message.

The goal is to filter these events out of the search results with a lookup,
so that when I no longer want to see certain messages in further searches, I only need to adapt the lookup.

The idea was something like:
test.csv
check, error_message
true, Error in Data | * from endpoint *

| lookup test.csv error_message output check
| search check!=true

I tried the suggestions from https://community.splunk.com/t5/Splunk-Search/Can-we-use-wildcard-characters-in-a-lookup-table/td-p/....
but they did not work for me.

Thank you all.



 


yuanliu
SplunkTrust

You probably realize that "doesn't work" conveys little information in most discussions, let alone when others do not share your context and insight into your data and use case. Please show your wildcard lookup (anonymized as needed, but with its characteristics intact) and the output of your code, and explain why that output is not what you want.

One more suggestion: wildcard matching is expensive, and producing a table of wildcard patterns like the one you illustrated is quite laborious. There may be better ways to organize the lookup; done properly, this also improves readability (and hence maintainability). Using your example, you can do something like

| rex field=error_message mode=sed "s/Error in Data |\d+\s+(\D+)\s+\d+/\1/"

Then, your lookup table can be simplified to

check,error_message
true,Error in Data |from endpoint

In fact, I bet that "Error in Data" is not of interest either. If that is the case, you can further simplify the lookup to:

check,of_interest
true,from endpoint

And use this search

| rex field=error_message "s/Error in Data |\d+(?<of_interest>\D+)\d+/\1/"
| lookup test.csv of_interest output check
| search check!=true

 


klischatb
Path Finder

Hello @yuanliu,

thank you for your feedback, the tip on writing better questions, and your answer.

One last question regarding the provided solution:

My idea was to create a lookup and enter only text fragments in it, to filter unwanted messages out of the search results. The regex solution only works for these specific events, so I would have to create a regex for each message to be filtered.

Is there another way to match something like this?

Thank you very much again for your answer.

 


yuanliu
SplunkTrust

First, the last example was incorrect.  It should be

| rex field=error_message "^Error in Data\s+\|\s+\d+\s+(?<of_interest>\D+?)\s+\d+$"
| lookup test.csv of_interest OUTPUT check
| search NOT check=true

as there is no substitution.  The first example also should be corrected to

| rex field=error_message mode=sed "s/Error in Data\s+\|\s+\d+\s+(\D+)\s+\d+/\1/"

because the vertical bar is a special character in regex.
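
The effect of the unescaped bar can be checked outside Splunk. The following is a minimal Python sketch, assuming Python's re module handles alternation the same way as the regex engine behind rex for this pattern. The sample message is taken from the question, and (?P<name>...) is Python's spelling of a named group.

```python
import re

message = "Error in Data | 5432323 from endpoint 543336"

# Unescaped bar: read as "^Error in Data " OR "\d+(...)\d+$".
unescaped = re.compile(r"^Error in Data |\d+(?P<of_interest>\D+)\d+$")
# Escaped bar: the pattern must match a literal "|" in the message.
escaped = re.compile(r"^Error in Data\s+\|\s+\d+(?P<of_interest>\D+)\d+$")

unescaped_hit = unescaped.search(message)
escaped_hit = escaped.search(message)

# The first alternative matches at the start, so the group never participates.
print(repr(unescaped_hit.group("of_interest")))  # None
# With the bar escaped, the greedy \D+ group captures the text and its spaces.
print(repr(escaped_hit.group("of_interest")))    # ' from endpoint '
```

A greedy \D+ group also captures the spaces around the text, which matters when the value is used as an exact-match lookup key.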

Note that the expression "^Error in Data\s+\|\s+\d+\s+(?<of_interest>\D+?)\s+\d+$" extracts "from endpoint" from "Error in Data | 5432323 from endpoint 543336", and likewise "not from endpoint" from "Error in Data | 5432323 not from endpoint 543336", "to endpoint" from "Error in Data | 5432323 to endpoint 543336", "is not in this space" from "Error in Data | 5432323 is not in this space 543336", and so on. The point is that you use regex to establish a pattern.
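
To see which lookup keys a single pattern produces, the messages mentioned here can be run through it offline. This is a rough Python sketch, not Splunk itself: the pattern used here places \s+ outside a lazy capture group so that the value comes back without surrounding spaces, and the helper name extract is hypothetical.

```python
import re

# One pattern for the whole family of messages; \s+ sits outside the lazy group.
PATTERN = re.compile(r"^Error in Data\s+\|\s+\d+\s+(?P<of_interest>\D+?)\s+\d+$")

messages = [
    "Error in Data | 5432323 from endpoint 543336",
    "Error in Data | 5432323 not from endpoint 543336",
    "Error in Data | 5432323 to endpoint 543336",
    "Error in Data | 5432323 is not in this space 543336",
]


def extract(message):
    """Return the text between the two numbers, or None if the shape differs."""
    hit = PATTERN.search(message)
    return hit.group("of_interest") if hit else None


extracted = [extract(m) for m in messages]
for message, value in zip(messages, extracted):
    print(f"{value!r} <- {message}")
```

Each distinct extracted value would then be one candidate row in the lookup, and a message that does not fit the shape yields no key at all.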

Regex is very versatile and powerful. You do have to study the data to find the pattern, however. For example, if you also want to include "Error in Data | 5432323 from BD820F to 2F7A63 and back 543336" in a similar solution, where BD820F and 2F7A63 are two hexadecimals you want to replace with wildcards, you can update the solution to

| rex field=error_message mode=sed "s/Error in Data\s+\|\s+\d+\s+(.+)\s+\d+/\1/"
| rex field=error_message mode=sed "s/\s[\dA-F]+/ <hex>/g"
| lookup test.csv of_interest AS error_message OUTPUT check
| search NOT check=true

The lookup entry that matches "Error in Data | 5432323 from BD820F to 2F7A63 and back 543336" would be "from <hex> to <hex> and back", with no change to any other entry in the lookup.
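
The two substitutions and the lookup filter can be traced offline. Below is a minimal Python sketch of that flow, assuming Python's re.sub behaves like rex mode=sed for these expressions. The LOOKUP dict is a hypothetical stand-in for test.csv, and the third event is invented for illustration.

```python
import re

# Hypothetical lookup rows (normalized text -> check), standing in for test.csv.
LOOKUP = {
    "from endpoint": "true",
    "from <hex> to <hex> and back": "true",
}


def normalize(error_message):
    """Mimic the two sed-style substitutions from the search."""
    # Step 1: keep only the text between the leading and trailing numbers.
    text = re.sub(r"Error in Data\s+\|\s+\d+\s+(.+)\s+\d+", r"\1", error_message)
    # Step 2: replace each space-prefixed run of digits/A-F with a placeholder.
    return re.sub(r"\s[\dA-F]+", " <hex>", text)


events = [
    "Error in Data | 5432323 from endpoint 543336",
    "Error in Data | 5432323 from BD820F to 2F7A63 and back 543336",
    "Error in Data | 5432323 to endpoint 543336",
]

# Keep only events whose normalized text has no "true" entry in the lookup.
kept = [e for e in events if LOOKUP.get(normalize(e)) != "true"]
print(kept)
```

One caveat when adapting this: \s[\dA-F]+ also matches the first letter of a capitalized word such as " Data", so real data may call for a tighter expression, for example one ending in \b.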

This is a long way of saying that in most cases, a wildcard lookup can be replaced with regex. Of course, wildcards exist for a reason, and wildcard lookups do work. You just need to define the lookup correctly. See "Create a CSV lookup definition".
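
For comparison, the wildcard row from the original question can be tried offline. This is a rough sketch that uses Python's fnmatch as a stand-in for Splunk's wildcard matching (an assumption: only * is used here, and fnmatch gives ? and [ extra meanings that Splunk may not share). In Splunk itself, wildcard matching is a property of the lookup definition, set with match_type = WILDCARD(error_message) in transforms.conf or the equivalent advanced option in the UI. As far as I know, a search that names the bare CSV file, as in | lookup test.csv ..., does not pick up that setting. The rows and the helper name lookup_check are hypothetical.

```python
from fnmatch import fnmatchcase

# Hypothetical lookup rows: (wildcard pattern for error_message, check value).
WILDCARD_ROWS = [("Error in Data | * from endpoint *", "true")]


def lookup_check(error_message):
    """Return the check value of the first matching wildcard row, else None."""
    for pattern, check in WILDCARD_ROWS:
        if fnmatchcase(error_message, pattern):
            return check
    return None


events = [
    "Error in Data | 5432323 from endpoint 543336",
    "Error in Data | 1344214 from endpoint 543446",
    "Error in Data | 5432323 to endpoint 543336",
]

# Keep only events that no wildcard row marks as "true".
kept = [e for e in events if lookup_check(e) != "true"]
print(kept)
```

With this approach every new message shape needs its own wildcard row, whereas the regex approach needs one row per normalized text.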
