I'm new to Splunk. Just wanted to ask for advice. 🙂 Currently I have 11,000 ticket records, and I'm trying to find the most common events/issues/words in them. I am trying cluster, regex, and lookup.
What do you think is the best approach for this?
Thank you in advance everyone. 🙂
Hi Skalli. You might want to have a play around with these two apps.
NLP Text Analytics - https://splunkbase.splunk.com/app/4066/ - A collection of bits and pieces for text analysis, built around NLTK 3.3 and Splunk's MLTK.
NLP Natural Language Toolkit - NLTK wrapper - https://splunkbase.splunk.com/app/4057/ - Another wrapper for some of the same python libraries for Natural Language Processing.
These should be able to get the job done. I'm not sure how well they perform at large scale, but 11k records is not much.
Hey and welcome to the Splunk community. 🙂
First of all, the answer to your question comes with an "it depends". If your data has a structure that's easy to onboard, you might want to start by reading and working through the docs: getting data in. After the data is onboarded correctly, the next step would be to build field extractions based on the events. For this, you can use the field extractor. Once you have built your fields, you can easily filter on them with something simple like
index=yourIndex sourcetype=yourSourcetype | top your_desired_field1, field2 ....
Hi Skalli! Thank you for your answer. Unfortunately, it's not that simple. 😞 I'll give some sample data below:
Can you please reset my password?
Password Reset request
Unable to open my account
Please help! Can't access my account.
Can't connect to Wifi
Reset my Password
... and so on.
I want to automatically find the most frequent words across the 11,000 tickets. Thanks 🙂
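For the word-counting part, the basic idea can be sketched in plain Python before reaching for the NLTK apps: lowercase each ticket, split it into words, drop common stopwords, and count what remains. This is only an illustration, not Splunk itself; the ticket list and the tiny stopword set below are assumptions (NLTK ships a much fuller stopword list).

```python
from collections import Counter
import re

# Sample ticket subjects, taken from the thread above
tickets = [
    "Can you please reset my password?",
    "Password Reset request",
    "Unable to open my account",
    "Please help! Can't access my account.",
    "Can't connect to Wifi",
    "Reset my Password",
]

# Minimal stopword set (an assumption; use NLTK's stopwords corpus in practice)
stopwords = {"my", "to", "can", "you", "please", "the", "a", "can't"}

def word_counts(texts):
    """Count non-stopword words across all texts, case-insensitively."""
    counts = Counter()
    for text in texts:
        # Lowercase, then pull out runs of letters (keeping apostrophes)
        for word in re.findall(r"[a-z']+", text.lower()):
            if word not in stopwords:
                counts[word] += 1
    return counts

print(word_counts(tickets).most_common(3))
```

On this sample, "reset" and "password" come out on top, which matches what you'd expect from eyeballing the tickets. A Splunk-native equivalent would use `rex`/`mvexpand` plus `top`, or the clustering commands from the apps suggested above.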