Splunk Search

Capture similar strings in the logs

poojak2579
Path Finder

Is there any way to search for similar strings dynamically across different logs?
I want to group unique error strings coming from different logs.
The events come from different applications, each with its own logging format.
I am creating a report that shows a count of events for each unique error string.


Sample Events:
error events found for key a1
Invalid requestTimestamp abc
error event found for key a2
Invalid requestTimestamp def
correlationID - 1234 Exception while calling some API ...java.util.concurrent.TimeoutException
correlationID - 2345 Exception while calling some API ...java.util.concurrent.TimeoutException

Required results:
I am looking for the following stats from the above error log statements
1) Invalid requestTimestamp - 2
2) error events found for key - 2
3) Exception while calling some API ...java.util.concurrent.TimeoutException - 2


bowesmana
SplunkTrust

You can do this in a number of ways.

1. Use a lookup definition based on a lookup file with the messages you want to match.

  • Create a CSV lookup with the matches you are interested in and prefix/suffix them with *, e.g.
| makeresults format=csv data="match
*error events found for key*
*Invalid requestTimestamp*
*Exception while calling some API ...java.util.concurrent.TimeoutException*"
| outputlookup matches.csv
  • Set up a lookup definition based on that CSV and, in Advanced options, define the match type as WILDCARD(match) (see the transforms.conf sketch below).
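
If you would rather define the lookup in configuration than through the UI, here is a minimal transforms.conf sketch - the stanza name matches and the file name matches.csv are just the names assumed from the example above:

# transforms.conf - sketch only; stanza and file names are assumptions
[matches]
filename = matches.csv
match_type = WILDCARD(match)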

Then in your search do

your search...
| lookup matches match as message OUTPUT match
| where isnotnull(match)
| stats count by match

So, you can see this actually working like this

| makeresults format=csv data="message
error events found for key a1
Invalid requestTimestamp abc
error event found for key a2
Invalid requestTimestamp def
correlationID - 1234 Exception while calling some API ...java.util.concurrent.TimeoutException
correlationID - 2345 Exception while calling some API ...java.util.concurrent.TimeoutException"
| lookup matches match as message OUTPUT match
| where isnotnull(match)
| stats count by match

Note that this actually produces only a SINGLE match for the "error events found" message, because your second sample event says "event", not "events".

There are other ways to do the same thing, but it depends on what you're trying to do.

Note that your lookup can contain additional fields you could output, e.g. a description, which you could OUTPUT instead to report on.

Also note that the wildcard character is *, so put the wildcard where you want it and it will match anything in between.


poojak2579
Path Finder

Thanks for looking into it.
I think your solution would work if there were a specific set of errors, but in my case there is no specific list of errors. The errors come from different logs with different logging formats.


ITWhisperer
SplunkTrust

You could try something like this

| stats count(eval(match(_raw, "Invalid requestTimestamp"))) as IrT
        count(eval(match(_raw, "error events found for key"))) as eeffk
        count(eval(match(_raw, "Exception while calling some API ...java.util.concurrent.TimeoutException"))) as toe

poojak2579
Path Finder

Thanks for the response, but this will not work as I am not searching for any specific text.
I just shared a sample; it could be anything.


ITWhisperer
SplunkTrust

So you want to match any "string" in any event against any other event and count the number of matches? Apart from this being extremely vague, what is it that you are attempting to determine? What are the boundary conditions for determining which strings to try and match? What do you want to do if an event has more than one "string" which matches strings in other events - do you double-count the events?


poojak2579
Path Finder

I want to group unique error strings coming from different logs.
The events come from different applications, each with its own logging format.
I am creating a report that shows a count of events for each unique error string.

Boundary condition for determining which strings to match:
all events that have the "error" keyword in the log statement.


bowesmana
SplunkTrust

So, you want to find any event that has the word error in the _raw event and then somehow create some kind of grouping of those events.

Your requirement is way too vague to be able to do grouping, because there is no way for any of us to tell you how to group your messages without some knowledge of your data.

Other than the basic 

index=* error
| stats count by _raw

which is probably next to useless, as you will get a count of 1 for almost every distinct error message.

You could try using the cluster command, e.g.

index=Your_Indexes error 
| cluster showcount=t 
| table cluster_count _raw 
| sort -cluster_count

which will attempt to cluster your data - see here for the command description

https://docs.splunk.com/Documentation/Splunk/9.3.2/SearchReference/Cluster
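
If you want one row per cluster rather than the raw events, a variation on the above is to label each event with its cluster and then aggregate - purely a sketch, and the index name and the threshold t=0.8 (the command's default) are assumptions:

index=Your_Indexes error
| cluster labelonly=t labelfield=cluster_label t=0.8
| stats count as cluster_count first(_raw) as sample_event by cluster_label
| sort -cluster_count

Lowering t makes the clustering looser, so less similar events end up grouped together; raising it makes the groups tighter.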

 

poojak2579
Path Finder

@bowesmana Sorry for the delay in response, I was on vacation.
Thanks for sharing the cluster command. I tried it, but it is not giving me the required result, or I am not using it correctly. I shared only one part of the requirement. The actual requirement is to compare two days' logs (today and yesterday) coming from different apps and trigger an alert whenever there is a new error. There is no specific error pattern or field to identify errors; we need to look for the keywords "Error/Fail/Timeout" in the logs. I am trying to identify similar phrases in the error logs, store the unique error text in a lookup file, and then match it against the next day's data to identify new error logs.

Query:
index="a" OR index="b" (ERROR OR TIMEOUT OR FAIL OR EXCEPTION)


bowesmana
SplunkTrust

You are trying to get some kind of unstructured learning going on - the cluster command will give you what it thinks are common repetitions of events; how you then take its output to meet your requirement is really beyond the scope of this type of forum.

If you run the search you specified and then cluster the results accordingly, you should be getting something back - I assume you are not getting zero results.

So, without seeing that output, it's really impossible for us to say how you can massage what you are getting into some kind of lookup for the following day.

But if you search at midnight for your error/warn/timeout events, cluster the results, massage that data, and store it as a lookup file, then in the searches you subsequently run you will also have to cluster the results, massage them in the same way, and then perform a lookup against the previously generated lookup. A rough sketch of that workflow is below.
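
Purely as a sketch of that workflow - the indexes and keywords are taken from your earlier post, the lookup file name error_baseline.csv is an assumption, and the "massage" step (stripping timestamps, correlation IDs and other variable parts from the representative event) is left out because it depends entirely on your data:

Scheduled search, run just after midnight, storing yesterday's clustered errors:

index="a" OR index="b" (ERROR OR TIMEOUT OR FAIL OR EXCEPTION) earliest=-1d@d latest=@d
| cluster labelonly=t labelfield=cluster_label
| stats first(_raw) as sample_error by cluster_label
| fields sample_error
| outputlookup error_baseline.csv

Alert search over today's data, keeping only clusters whose representative event is not in yesterday's baseline:

index="a" OR index="b" (ERROR OR TIMEOUT OR FAIL OR EXCEPTION) earliest=@d
| cluster labelonly=t labelfield=cluster_label
| stats count first(_raw) as sample_error by cluster_label
| lookup error_baseline.csv sample_error OUTPUT sample_error as known_error
| where isnull(known_error)

Matching on the raw representative event will only be reliable once the variable parts have been normalised away, which is the "massage" step mentioned above.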

Without knowing why the results are not what you require - perhaps you could give an example of what you are getting and why it does not match your requirements - it's hard to say where to go from here.

Anyway, if you can make some concrete progress with the suggestions so far given, I am sure we can continue to help you get to where you are trying to get to.

 


isoutamo
SplunkTrust

Hi

You could try to play with the punct field. I'm quite sure it's not exactly what you are looking for, but maybe it helps you find those similarities so you can take things forward from there in some other way.

See: punct
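
As a rough sketch of the idea (the indexes and keywords are taken from your earlier search), grouping by punct puts events with the same punctuation pattern together, which often approximates "same message template":

index="a" OR index="b" (ERROR OR TIMEOUT OR FAIL OR EXCEPTION)
| stats count first(_raw) as sample_event by punct
| sort -count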

r. Ismo


poojak2579
Path Finder

Thanks for the reply.
I don't think punct will work for my requirement, as I am creating an alert so it's not a one-time thing, but thanks for looking into it.


PickleRick
SplunkTrust

Not out of the box. Maybe you could do something like that with the Machine Learning Toolkit (MLTK), but I've never tried it.


poojak2579
Path Finder

Sure, thank you.
