Hello!
We use Splunk cloud platform for logging.
We wanted to know how we can find highly recurring events.
We have many different types of logs, each with varying values for the logged properties, so creating a regex for a search query isn't feasible.
There are several types of these logs, and we want to group similar logs together to count which ones occur most often and to see what their content looks like.
We tried using the cluster command in our search queries, but even with a low threshold of t=0.2, all the events end up with their own cluster label.
Any suggestions?
Are you looking for the same "event", i.e. an event with the same values for particular fields, all within the same sourcetype? Or could the same value set occur across multiple sourcetypes?
Can you create a hash of the values and look for the same hash value across your dataset(s)?
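As a rough sketch of that hashing idea in SPL (the index and field names here are placeholders you would swap for your own; recent Splunk versions provide the sha256() eval function):

```
index=your_index
| eval sig=sha256(coalesce(field1,"") . "|" . coalesce(field2,""))
| stats count by sig
| sort -count
```

Events with identical values for the chosen fields get the same hash, so the stats count surfaces the most frequent value sets regardless of host, source, or sourcetype.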
So as an example, we have a log with this content:
Missing a resource in Resources for key <key>.
Where <key> is a different value (several dozen options), and the source, host, and sourcetype also differ, because we have 80+ hosts, 100+ sources, and several sourcetypes.
Logs for one particular key can occur around 5 million times.
Another example would be logs that contain the content:
Ticket-XXXX: getAdmins
and then has different hosts, sources, and sourcetypes.
So we're looking for something that can get us the most frequent logs, disregarding the source/host/sourcetype.
Hi @sabbas
It sounds like the "cluster" command might help you here. Check out https://docs.splunk.com/Documentation/Splunk/9.4.2/SearchReference/Cluster
Here's an example to get you started:

index=_internal source=*splunkd.log* log_level!=info
| cluster showcount=t
| table cluster_count _raw
| sort -cluster_count

You can set t=<decimal> with a value between 0.1 and 1.0, which controls the sensitivity of the clustering.
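If cluster keeps splitting events apart because of embedded variable values (like the <key> in your example), another option is to normalize the variable parts out of _raw into a message template before counting. A rough sketch, where the index and the replacement pattern are assumptions you'd adapt to each message type:

```
index=your_index "Missing a resource in Resources for key"
| eval template=replace(_raw, "for key \S+", "for key <KEY>")
| stats count by template
| sort -count
```

Since all variants of the message collapse onto one template string, a plain stats count then gives you the frequency per message shape, independent of host, source, and sourcetype.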