Will asterisks in the value of a field be treated as wild cards and affect automatic lookup performance?

RJ_Grayson
Path Finder

I have a field in one of my datasets labelled user. We perform automatic lookups globally on the user field to return a variety of information about the user identified. Recently I noticed that searching this particular index in anything other than Fast Mode takes an extremely long time to return results. Upon further investigation, I believe the cause is a combination of the automatic lookup and the fact that some of the user fields in the dataset have the value *****.

The device that we're receiving logs from masks the user field value with asterisks. When the Splunk search returns results, it appears to be attempting to look up ***** through the automatic lookup, and this is severely affecting the performance of the search. It's as if Splunk is interpreting the asterisks as wildcards and iterating over the entire lookup file (which is quite large).

For example, a five-second window in which none of the events include user = ***** returns in 2.648 seconds when searching in Smart Mode with the automatic lookup enabled. A similar five-second window that includes a single user = ***** field/value pair takes 8.549 seconds. Widen the search time frame and the performance difference becomes much greater.

A 1 hour search with Fast Mode and no field extractions or lookups: "This search has completed and has returned 3,162 results by scanning 3,162 events in 2.822 seconds"
The same 1 hour search with Smart Mode utilizing automatic lookups: "This search has completed and has returned 3,162 results by scanning 3,162 events in 256.991 seconds"
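
Putting those two timings side by side, here is a quick back-of-the-envelope calculation in Python. The variable names are mine, and the per-event figure is only an average over all 3,162 events. It is not a measurement of any single lookup, and Fast Mode also skips field discovery, so not all of the difference is necessarily due to the lookup.

```python
# Figures quoted from the two 1-hour searches above.
events = 3162
fast_mode_seconds = 2.822     # Fast Mode, no field extractions or lookups
smart_mode_seconds = 256.991  # Smart Mode, automatic lookups enabled

# Overall slowdown factor between the two runs.
slowdown = smart_mode_seconds / fast_mode_seconds

# Average extra wall-clock time per event in the slower run.
extra_per_event = (smart_mode_seconds - fast_mode_seconds) / events

print(f"slowdown: {slowdown:.1f}x")
print(f"average extra time per event: {extra_per_event * 1000:.1f} ms")
```

That works out to roughly a 91x slowdown, or about 80 ms of extra time per event on average.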

For the second test search, with Smart Mode enabled and automatic lookups, the job inspector shows the duration of command.search.lookups as 1,413.19 seconds.

Interestingly enough, all of the events with user = ***** are given the same lookup value for user, even though the original value was all asterisks. This makes me think that the automatic lookup is interpreting the asterisks as wildcards and defaulting to some seemingly random value from the lookup table. It also appears to be iterating over the entire lookup table whenever it encounters one of these asterisk-filled fields.
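
To make that hypothesis concrete, here is a minimal Python sketch of the difference between a literal lookup and one that treats the event value as a wildcard pattern. This is not Splunk's implementation. The table contents, the function names, and the assumption that a wildcard match scans every row and keeps the first hit are all mine, purely for illustration.

```python
from fnmatch import fnmatchcase

# Hypothetical lookup table: user -> department (illustrative data only).
lookup_rows = [
    ("alice", "finance"),
    ("bob", "engineering"),
    ("carol", "operations"),
]
lookup_index = dict(lookup_rows)


def exact_lookup(value):
    """Literal match: a single hash probe, no table scan."""
    return lookup_index.get(value)


def wildcard_lookup(pattern):
    """Assumed behavior: treat the event value as a glob pattern,
    compare it against every row, and keep the first match.

    Returns (output of first match or None, number of rows compared).
    """
    matches = []
    compared = 0
    for key, output in lookup_rows:
        compared += 1
        if fnmatchcase(key, pattern):
            matches.append(output)
    return (matches[0] if matches else None), compared


masked = "*****"
print(exact_lookup(masked))     # None: no row is literally "*****"
print(wildcard_lookup(masked))  # ('finance', 3): every row matches, first one wins
```

Under these assumptions, a masked value would both force a full pass over the table and resolve to whichever row happens to come first, which would be consistent with the two symptoms described above.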

Has anyone else seen something like this? Should Splunk be interpreting field values that contain asterisks as wildcards?

RJ_Grayson
Path Finder

I submitted a ticket with Splunk support and, based on their preliminary examination of this issue, they believe it may be a bug.

DalJeanis
Legend

Interesting. If you post the actual search, then we can optimize the solution.

It sounds like your lookup is returning various values which are then used to initiate a further search. You will need to clear those values with code like the snippet woodcock has posted, but without the search itself we can't be sure which of the following forms it needs to take:

| eval user=if(like(user,"*****"), "#####", user)

or

| eval user=if(like(user,"*****"), "*", user)

From what you have described, the first should result in no returned events and the second (potentially) in the proper subset of the data, but that conclusion is highly speculative on my part.

woodcock
Esteemed Legend

This is a good question, and I would open a ticket with Splunk support. In the meantime, you should be able to bypass the problem by adding this code to your search:

| eval user=if(like(user,"*****"), "#####", user)