
Performance of using wildcard in query

imosquera
Explorer

I was wondering how using a wildcard in a query affects performance, specifically for the following:

source="/mnt/logs/*/debug.log"

or a query containing a custom field:

uri_path="/v1/*"


sideview
SplunkTrust

For the uri_path field, the server basically has to get every event off disk, run the field extraction for uri_path, and then check whether the value begins with "/v1/".

source, on the other hand, along with sourcetype and host, behaves a little differently. Because these are not just indexed fields but special fields backed by special metadata stores, the server actually checks the metadata for all the values of source and ends up requesting only the events whose source values match the wildcard expression.
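
To get a feel for the kind of metadata involved, something like the following should list the known source values and filter them by the wildcard without pulling raw events off disk. This is only a sketch: index=main is an assumed index name, and the metadata command is just a way to peek at that information, not necessarily the exact code path a normal search uses internally.

```
| metadata type=sources index=main
| search source="/mnt/logs/*/debug.log"
```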

You can verify both of these with the following technique. Here, "scanCount" is the number of events that splunkd retrieved from disk, and "eventCount" is the number of those retrieved events that ended up matching the search terms.

1) Find a search that just uses host or sourcetype values with no wildcards and tack | head 100000 on the end.

2) Click the little "i" icon to open the Job Inspector and scroll down until you see "scanCount" and "eventCount". Confirm that they are both about 100,000. (scanCount will usually be a couple thousand higher; this is normal.)

3) Now add your search term with the wildcard to the initial search clause, keeping the | head 100000 on the end, and look at the Job Inspector again. You'll see that for the wildcarded source term, scanCount and eventCount are about the same. For the wildcarded uri_path field, scanCount will be significantly greater, because the server had to keep getting all the possible matches off disk until it reached 100,000 rows that matched "/v1/".
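
As a rough sketch, the three searches for the steps above might look like the following. sourcetype=access_combined is only a placeholder I picked for illustration; substitute a sourcetype or host that actually exists in your data. Run each search separately and compare scanCount and eventCount in the Job Inspector after each one.

```
sourcetype=access_combined | head 100000

sourcetype=access_combined source="/mnt/logs/*/debug.log" | head 100000

sourcetype=access_combined uri_path="/v1/*" | head 100000
```

The first search is the baseline from step 1, the second adds the wildcarded source term, and the third adds the wildcarded uri_path field.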


imosquera
Explorer

Thanks for your quick and thorough response!
