Splunk Search

How to find highly recurring events

sabbas
Explorer

Hello!

We use Splunk cloud platform for logging.

We wanted to know how we can find highly recurring events.

We have many different types of logs with varying values for the logged properties, so creating a regex for a search query isn't feasible.

There are several types of these logs and we want to group similar logs together to get a count for which ones occur the most, and what the content of the logs look like.

We tried using the cluster command in our search queries, but even with a low threshold of t=0.2, every event ends up with its own cluster label.

Any suggestions?

Labels (2)
0 Karma

ITWhisperer
SplunkTrust

Are you looking for the same "event" i.e. event with the same values for particular fields all within the same sourcetype, or could the same value set occur across multiple sourcetypes?

Can you create a hash of the values and look for the same hash value across your dataset(s)?
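As a sketch of that idea (field_a and field_b here are hypothetical placeholders for whichever fields define an "event" in your data, and this assumes a Splunk version that supports the sha256() eval function):

index=your_index
| eval sig=sha256(coalesce(field_a,"") . "|" . coalesce(field_b,""))
| stats count by sig
| sort -count

Events with identical values for the chosen fields hash to the same sig, regardless of sourcetype, so the stats count surfaces the most frequent value sets.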

0 Karma

sabbas
Explorer

So as an example, we have a log with this content

Missing a resource in Resources for key <key>.

Where <key> is a different value (several dozen options). The source, host, and sourcetype also differ because we have 80+ hosts, 100+ sources, and several sourcetypes.

Logs for one particular key occur about 5 million times.

Another example of logs would be one that contains the content

Ticket-XXXX: getAdmins

and then has different hosts, sources, and sourcetypes.

So we're looking for something that can get us the most frequent logs, disregarding the source/host/sourcetype.
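One way to do this (not from the thread, just a sketch based on the example logs above; the index name and regexes are assumptions) is to normalize the variable parts of each message into a template and then count by template:

index=your_index ("Missing a resource in Resources for key" OR "getAdmins")
| eval template=replace(_raw, "key \S+", "key <key>")
| eval template=replace(template, "Ticket-\S+", "Ticket-<id>")
| stats count by template
| sort -count

Because stats groups only on the normalized template, the count ignores source, host, and sourcetype entirely.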

0 Karma

livehybrid
SplunkTrust

Hi @sabbas 

It sounds like the "cluster" command might help you here, check out https://docs.splunk.com/Documentation/Splunk/9.4.2/SearchReference/Cluster

Here's an example to get you started:

index=_internal source=*splunkd.log* log_level!=info | cluster showcount=t | table cluster_count _raw | sort -cluster_count


You can set t=<decimal> to a value between 0.1 and 1.0, which controls the sensitivity of the clustering: the closer t is to 1, the more similar events must be to land in the same cluster.
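For instance, to tune the threshold and see one sample event per cluster, something like this should work (t=0.3 is just an illustrative value; labelonly=t keeps every event and tags it with a cluster_label so stats can group them):

index=_internal source=*splunkd.log* log_level!=info
| cluster t=0.3 labelonly=t
| stats count, first(_raw) as sample by cluster_label
| sort -count

If the default term-based matching still splits everything into separate clusters, it may also be worth experimenting with the cluster command's match option (termlist, termset, or ngramset), which changes how events are compared.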

🌟 Did this answer help you? If so, please consider:

  • Adding karma to show it was useful
  • Marking it as the solution if it resolved your issue
  • Commenting if you need any clarification

Your feedback encourages the volunteers in this community to continue contributing

 

0 Karma