I'm not entirely certain how search optimization works in Splunk. Certainly, if I search only for a rare indexed word, then all entries that contain that word will be found quickly. But what if I want to search for a substring of a rare indexed word, where the substring is itself rare? Say, for the sake of argument, that this rare substring occurs in only one indexed word. I can search for the substring bracketed by asterisks, but that seems to take significantly longer than the search for the rare indexed word that the substring is part of.
Is there an efficient way to do a search like this directly? Failing that, is there a way to list all indexed words that contain a common substring? If I had that list for a given substring, I could simply search for all instances of the indexed words that contain the substring.
Whenever a search term begins with a wildcard, the search will be particularly slow. A leading wildcard forces Splunk to serially scan the entire lexicon of each bucket to find any matching keywords. Search terms that end with a wildcard are not as slow as search terms that begin with one, because Splunk can still seek to the matching prefix in the lexicon.
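As a rough illustration (the index, sourcetype, and term here are hypothetical placeholders), the three cases look like this:

```
index=myindex sourcetype=mylogs error*     <- trailing wildcard: prefix seek in the lexicon
index=myindex sourcetype=mylogs *error     <- leading wildcard: full lexicon scan per bucket
index=myindex sourcetype=mylogs *error*    <- leading + trailing: full lexicon scan per bucket
```

Only the first form lets Splunk jump to the right place in the sorted lexicon; the other two must examine every keyword in every bucket that survives the other filters.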
If Splunk knows the exact search term, it can use the index to find it directly. It can also use bloom filters to eliminate many buckets from the search. Bloom filters do not work with wildcards.
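If you do know the exact indexed token, you can say so explicitly with the standard SPL `TERM` directive, which tells Splunk to match the token as a whole rather than run field extraction or wildcard matching (the token and index name here are just examples):

```
index=myindex TERM(abcdefghijklmnopq)
```

Because this is an exact-match lookup, it can use both the lexicon and the bloom filters, which is exactly what a wildcard search gives up.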
There is no way to use a wildcard while avoiding the performance penalty of using a wildcard.
However, if you can narrow the search by including additional terms, that will help. For example, be sure to specify the index and the sourcetype. Also, use as narrow a time range as possible for your search. Anything that helps Splunk reduce the number of buckets to scan will help.
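For example (the index, sourcetype, and wildcard term below are placeholders), a narrowed version of a substring search might look like:

```
index=myindex sourcetype=access_combined earliest=-4h latest=now *bcdefghijklmnop*
```

The index, sourcetype, and time-range restrictions all prune buckets before the expensive per-bucket lexicon scan runs, so the wildcard only pays its penalty on the buckets that remain.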
So... Say I'm searching for the string "bcdefghijklmnop", which occurs exactly once in my entire (large) dataset. The one time it occurs, it does so as " abcdefghijklmnopq " (but, of course, I don't know what the leading and trailing characters are). Are you saying that the only way to find this instance is to search for *bcdefghijklmnop*?