I want my event patterns to be recognized automatically. The patterns are not uniform, but Splunk should identify even small differences between events and give the trend or count of each pattern over time. How can I achieve this?
There is a very simple way of doing this. In every event, there is a default field called punct.
It may look like some alien language at first glance, but it is a very helpful one. Here is how it works: for each event, Splunk strips out all letters and numbers and replaces whitespace with an underscore, leaving just the punctuation.
The best part is that this field is extracted by Splunk automatically.
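To see what punct looks like for your own data, a minimal search along these lines should work (the index name and time range are placeholders):

```spl
index=your_index earliest=-1h
| head 5
| table _raw punct
```
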
We can directly separate out events that belong to a specific pattern, and we can use the punct field to find anomalies in the data.
For example, if 99% of your events look like this:
____::__:________...___
and 1% look like this:
..._-_-_[//:::]_"_//.?=__."___"://../.?=&=-"_"/._(
then we can easily find the odd ones out (the undesired ones) using this field.
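A minimal search to group and count events by this field might look like the following (index name is a placeholder):

```spl
index=your_index
| stats count by punct
| sort - count
```

To see the trend of each pattern over time instead of a flat count, you could replace the stats line with something like | timechart count by punct limit=10.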
This will show the count of each pattern among your events; all events with the same pattern are grouped together.
It is a fantastic way to quickly point you to the outliers that didn't match the pattern you expected.
It is very helpful for finding anomalous events in a large data set, or for writing complex regexes for field extraction to make sure all events are covered.
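For the anomaly-hunting case, the rare command is a quick way to surface the least common punct values; a sketch, again with a placeholder index and limit:

```spl
index=your_index
| rare limit=5 punct
```
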
More information about punct is here. I hope this answers your question. 🙂 Thank you - Saurabh
@MousumiChowdhury - I hope this answers your question, as this way you don't have to write a custom search and can use a default field to get the pattern matching. If it answers your question, please accept this answer.
I have used the query below for pattern recognition, and it is working fine for me:
index=<index> | cluster t=0.7 labelonly=t | findkeywords labelfield=cluster_label | table sampleEvent percentInInputGroup | sort - percentInInputGroup
I have tried using cluster. Below is my query:
index=<index> | cluster showcount=t t=0.7 labelonly=t | table _time cluster_count cluster_label _raw | dedup 1 cluster_label | sort - cluster_count cluster_label _time | chart values(cluster_count) as count by _raw | sort limit=20 - count
Is this a correct approach to find the latest patterns that have occurred the most?