How to filter a field by common words

parkz
Explorer

I have a field of titles, each a sentence describing why a test failed in a security audit, recorded separately for each asset. Two different assets can therefore list the same underlying reason in different words. For example, one might say "Login Password is empty" while another says "Login password did not meet requirements". If I could aggregate them on shared words like "password", I could get more value from the data. I can't hardcode the groupings because I don't know all the possible categories in advance.

Here is what I have so far, and I'm open to any feedback:

earliest=-1d@d latest=@d index=cdb_summary sourcetype=cfg_summary source=CDM_*_Daily_Summary
| search hva=*
| eval FailedSTIGs=mvsort(split(FailedSTIGs,","))
| stats values(fismaid) as fismaid dc(asset_id) as Affected by FailedSTIGs,hva
| lookup DHS_Expected_Checks "STIG ID" as FailedSTIGs output "Rule Title"
| fit TFIDF "Rule Title" as rule_tfidf ngram_range=1-12 max_df=0.6 min_df=0.2 stop_words=english
| fit KMeans rule_tfidf* k=8
| fields cluster "Rule Title"
| sample 6 by cluster
| sort by cluster

yuanliu
SplunkTrust

This is similar to dealing with varied logs from different applications that share common business and technology domains, just more "freehand". We tried to "encourage" standardization, but that only went so far. I still couldn't predict what the developers would throw at me, so I had to tune my aggregation strategies manually and update them from time to time.

Ideally, you'll have a natural language model to deal with them.  Failing that, you can use ML to do some clustering and start tuning from there.  In all cases, this is going to be dynamic.
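
As a possible starting point for that tuning, here is a rough sketch that counts how many distinct failed rules share each word in "Rule Title", so the candidate keywords come from the data instead of being hardcoded. It reuses the base search and lookup from the question. The field names word, rule_count and titles are made up for illustration, and the length filter is only a crude stand-in for a real stop-word list, so treat this as an assumption-laden sketch, not a finished solution.

```
earliest=-1d@d latest=@d index=cdb_summary sourcetype=cfg_summary source=CDM_*_Daily_Summary
| search hva=*
| eval FailedSTIGs=mvsort(split(FailedSTIGs,","))
| stats values(fismaid) as fismaid dc(asset_id) as Affected by FailedSTIGs,hva
| lookup DHS_Expected_Checks "STIG ID" as FailedSTIGs output "Rule Title"
``` lowercase the title, turn punctuation into spaces, split into unique words (word is a hypothetical field) ```
| eval word=mvdedup(split(lower(replace('Rule Title', "[^A-Za-z0-9 ]", " ")), " "))
| mvexpand word
``` crude stop-word filter (assumption): drop empty tokens and words of 3 characters or fewer ```
| where len(word) > 3
``` count distinct failed rules per word and keep the titles for review ```
| stats dc(FailedSTIGs) as rule_count values("Rule Title") as titles by word
| where rule_count > 1
| sort - rule_count
```

Words near the top of the result (for example "password", if it really is common in your titles) are candidates for grouping. You could compare them against what the KMeans clusters produce and adjust either one from there.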
