What is the best way to find searches without sourcetype or index defined?

Ultra Champion

I know that indexed fields accelerate search performance. Many searches take advantage of this with host, source, and _time, but users new to Splunk often forget to specify index or sourcetype.

What is the best way you've found for identifying existing searches lacking such an index or sourcetype definition?

Re: What is the best way to find searches without sourcetype or index defined?

Ultra Champion

The Splunk Product Best Practices team provided this response. Read more about How Crowdsourcing is Shaping the Future of Splunk Best Practices.

I've had luck running a search for searches that don't already refer to a sourcetype or index.

index=_audit sourcetype=audittrail action=search search="search *" ( search!="* sourcetype*" OR search!="* index*" ) search_type="ad hoc"
| stats first(search) AS search BY search_id

Or, you can explore your saved searches just the same with:

| rest /services/saved/searches
| search ( search!="*sourcetype=*" OR search!="*index=*" ) search!="|*"
| table search

In either approach, remember that sometimes the sourcetype or index IS defined, but is abstracted because it is defined within a macro or as part of an eventtype or tag.
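To see what the OR in these filters does, here is a minimal sketch that runs the wildcard filter from the ad hoc example against made-up search strings. The three sample strings and the helper field n are assumptions for illustration only, not data from a real audit log.

```spl
| makeresults count=3
| streamstats count AS n
| eval search=case(n=1, "search index=main sourcetype=syslog error", n=2, "search index=main error", n=3, "search error")
| search search!="* sourcetype*" OR search!="* index*"
| table search
```

This should keep the second and third rows, since each lacks at least one of the two terms. To list only searches that lack both, replace OR with AND.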

Re: What is the best way to find searches without sourcetype or index defined?

SplunkTrust

In the Alerts for SplunkAdmins app (also available on GitHub) I have a few alerts for this under Search Head Level - Non Best-Practices.

In particular
SearchHeadLevel - User - Dashboards searching all indexes
SearchHeadLevel - Scheduled searches not specifying an index

They could be adapted to check for a missing sourcetype. The audit logs that SloshBurch (Burch) mentioned will work for ad hoc searches as well.

SearchHeadLevel - User - Dashboards searching all indexes

| rest /servicesNS/-/-/data/ui/views 
| search `comment("A dashboard searching all indexes is an issue just like a scheduled search querying all indexes or using the index=* trick")` eai:data=*query*
| regex eai:data="<search.*" 
| rex field=eai:data "(?P<theSearch><search(?!String)[^>]*>[^<]*<query>.*?)<\/query>" max_match=200 
| mvexpand theSearch 
| rex field=theSearch "<search(?P<searchInfo>[^>]*)>[^<]*<query>(?P<theQuery>.*)" 
| search `comment("If we are seeing post process search then we don't want to check if it has index= because that is likely only in the base query. These are also various exclusions for legitimate searches that will not involve scanning all indexes, such as rest or a savedsearch or similar")` searchInfo!="*base*"
| rename eai:appName AS application, eai:acl.sharing AS sharing, eai:acl.owner AS owner, label AS name
| table theQuery, application, owner, sharing, name, splunk_server, title
| regex theQuery!="index\s*=(?!\s*\*)" 
| regex theQuery!="^(\()?\s*(\`|\$[^|]+\$|eventtype=|<!\[CDATA\[\s*\|\s*((acl)?inputlookup|rest) |\|)"
| rex field=theQuery "^(?P<exampleQueryToDetermineIndexes>[^\|]+)"
| eval exampleQueryToDetermineIndexes=exampleQueryToDetermineIndexes . "| stats values(index) AS index | format | fields search | eval search=replace(search,\"\\)\",\"\"), search=replace(search,\"\\(\",\"\"), search=if(search==\"NOT \",\"No indexes found\",search)"

SearchHeadLevel - Scheduled searches not specifying an index

| rest /servicesNS/-/-/saved/searches
| search `comment("Look over all scheduled searches and find those not specifying/narrowing down to an index, or using the index=* trick")`
| table title, eai:acl.owner, description, eai:acl.app, qualifiedSearch, next_scheduled_time
| search next_scheduled_time!="" 
| regex qualifiedSearch!=".*index\s*(!?)=\s*([^*]|\*\S+)" 
| regex qualifiedSearch="^\s*search "
| regex qualifiedSearch!="^\s*search\s*\[\s*\|\s*inputlookup"
| rex field=qualifiedSearch "^(?P<exampleQueryToDetermineIndexes>[^\|]+)"
| regex exampleQueryToDetermineIndexes!="\`"
| eval exampleQueryToDetermineIndexes=exampleQueryToDetermineIndexes . "| stats values(index) AS index | format | fields search | eval search=replace(search,\"\\)\",\"\"), search=replace(search,\"\\(\",\"\"), search=if(search==\"NOT \",\"No indexes found\",search)"
| rename eai:acl.owner AS owner, eai:acl.app AS Application
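The core of both alerts is the negative lookahead in the index regex, which treats index=* the same as no index at all. Here is a small sketch of that behavior, using the dashboard alert's pattern index\s*=(?!\s*\*) on invented query strings. The sample strings and the specifies_index field are assumptions for illustration.

```spl
| makeresults count=4
| streamstats count AS n
| eval theQuery=case(n=1, "index=main error", n=2, "index=* error", n=3, "index = web* status=500", n=4, "sourcetype=syslog error")
| eval specifies_index=if(match(theQuery, "index\s*=(?!\s*\*)"), "yes", "no")
| table theQuery, specifies_index
```

Only the first and third rows should come back as yes. The lookahead rejects index=*, and the last string never mentions an index. Note that index = web* counts as specifying an index under this pattern, because the wildcard does not come first.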

Re: What is the best way to find searches without sourcetype or index defined?

Ultra Champion

Oh nice! And great plug for your app!

Would you be OK with sharing either of the specific searches here so we can learn from your expertise? I ask before I go and copy/paste them myself, as I wouldn't want to overstep...

Re: What is the best way to find searches without sourcetype or index defined?

SplunkTrust

Updated to include them. Copying and pasting is fine; the app was designed to be shared.
One of the original goals was to get some of the searches back into the monitoring console, but I think they have gone beyond that level of complexity!

I have stripped some of the macros from the copy and paste, but kept the comment macro.

Re: What is the best way to find searches without sourcetype or index defined?

Ultra Champion

Great addition! I think the most critical thing you captured is that, in reality, trying to pin down ad hoc searches might not be an effective use of your time compared to the saved items that surely will be run, like scheduled searches and dashboards.

Furthermore, your proper use of /servicesNS/-/-/saved/searches does right by working around the namespace constraints - I lazily overlooked that.

I'll likely come back with a new answer that takes this to the next level: index OR sourcetype.

But for now, switching the accepted answer to what you provided! Great job!

Re: What is the best way to find searches without sourcetype or index defined?

Ultra Champion

How's this revision for the Scheduled Searches part?

| rest /servicesNS/-/-/saved/searches
| fields qualifiedSearch, next_scheduled_time, title, eai:acl.owner, eai:acl.app
| where match( qualifiedSearch , "^\s*search\s*" )
| rex field=qualifiedSearch "^(?<base_search>search[^\|\[]+)"
| eval 
    check-sourcetype = if( match( base_search , "\s+sourcetype\s*=" ) , "defined" , "missing" ) ,
    check-index = if( match( base_search , "\s+index\s*=" ) , "defined" , "missing" ) ,
    check-hidden = if( match( base_search , "\s+((tag|eventtype)\s*=|\`)" ) , "defined" , "missing" ) ,
    check-scheduled = if( match( next_scheduled_time , ".+" ) , "defined" , "missing" )
| rename eai:acl.* AS namespace-*
| search ( check-sourcetype="missing" OR check-index="missing" ) check-hidden="missing" check-scheduled="missing" namespace-owner!="nobody"
| table check-index, check-sourcetype, base_search, namespace-*

The end of the search is where folks can tweak to add things back in. But the way it's written,

  • namespace-owner!="nobody" means it's limited to items saved by real users since it isn't effective to update 'nobody' owned searches that came with the product.
  • check-scheduled="missing" filters out unscheduled searches. This could be toggled since dashboards might reference a saved, but unscheduled, search. Albeit rare.
  • check-hidden="missing" filters out searches using tag, eventtypes, and macros - things where index or sourcetype might be defined

As of now, this doesn't do much for subsearches but I saw that logic in what you posted.
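The subsearch limitation comes from the rex that builds base_search: it stops at the first pipe or opening bracket, so anything inside a subsearch is never inspected. Here is a minimal sketch on invented qualifiedSearch strings. The sample strings are assumptions, and the check field is named check_index with an underscore purely for illustration.

```spl
| makeresults count=3
| streamstats count AS n
| eval qualifiedSearch=case(n=1, "search index=main sourcetype=syslog | stats count", n=2, "search error [ search index=main | fields host ]", n=3, "search sourcetype=syslog error")
| rex field=qualifiedSearch "^(?<base_search>search[^\|\[]+)"
| eval check_index=if(match(base_search, "\s+index\s*="), "defined", "missing")
| table qualifiedSearch, base_search, check_index
```

The second row should be reported as missing even though its subsearch names an index, because base_search is cut off at the bracket and contains only "search error".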

Whaddya think? Any suggestions to change?

Re: What is the best way to find searches without sourcetype or index defined?

SplunkTrust

Perhaps something like:

| rest /servicesNS/-/-/saved/searches
 | fields qualifiedSearch, next_scheduled_time, title, eai:acl.owner, eai:acl.app
 | where match( qualifiedSearch , "^\s*search\s*" )
 | rex field=qualifiedSearch "^(?<base_search>search[^\|\[]+)"
 | eval 
     check-sourcetype = if( match( base_search , "\s+sourcetype\s*=" ) , "defined" , "missing" ) ,
     check-index = if( match( base_search , "\s+index\s*(=|IN)" ) , "defined" , "missing" ) ,
     check-index-contains-wildcard = if( match( base_search , "\s+index\s*(=\s*[^\*]+(\s|$)|IN\s*\([^\)\*]+\s*\))" ) , "missing" , "defined" ) ,
     check-index-starts-wildcard = if( match( base_search , "\s+index\s*(=\s*\*|IN\s*\(\s*\*)" ) , "defined" , "missing" ) ,
     check-hidden = if( match( base_search , "\s+((tag|eventtype)\s*=|\`)" ) , "defined" , "missing" ) ,
     check-scheduled = if( match( next_scheduled_time , ".+" ) , "defined" , "missing" )
 | rename eai:acl.* AS namespace-*
 | search ( check-sourcetype="missing" OR check-index="missing" ) check-hidden="missing" check-scheduled="missing" namespace-owner!="nobody"
 | table title, check-index, check-sourcetype, base_search, namespace-*, check-index-contains-wildcard, check-index-starts-wildcard

This version includes the title (which is really useful) and works with the IN clause. I was unsure whether you wanted to check for wildcards in indexes, so I added two checks for that; I found them useful in my last environment, and they will be useful in my current one.
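As a quick way to sanity-check the IN handling and the leading-wildcard test, the two patterns can be run against invented base_search strings. The sample strings are assumptions, and the field names use underscores here only for illustration.

```spl
| makeresults count=4
| streamstats count AS n
| eval base_search=case(n=1, "search index=main error", n=2, "search index IN (web, app) error", n=3, "search index=* error", n=4, "search index=web* error")
| eval check_index=if(match(base_search, "\s+index\s*(=|IN)"), "defined", "missing")
| eval starts_wildcard=if(match(base_search, "\s+index\s*(=\s*\*|IN\s*\(\s*\*)"), "yes", "no")
| table base_search, check_index, starts_wildcard
```

All four rows should show check_index as defined, and only the index=* row should show starts_wildcard as yes. Because match() is case-sensitive, a lowercase "in" would not be picked up by this pattern.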

Let me know what you think

Regarding your comments:
"namespace-owner!="nobody" means it's limited to items saved by real users since it isn't effective to update 'nobody' owned searches that came with the product."

This is a nice place to toggle the setting as it can be useful for identifying poorly built addons.

"check-scheduled="missing" filters out unscheduled searches. This could be toggled since dashboards might reference a saved, but unscheduled, search. Albeit rare."

Yes, except that the example you posted actually keeps only the unscheduled searches (check-scheduled="missing"). My example does the same, but that can be changed easily...
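A minimal sketch of the toggle, using two invented rows (the timestamp value is hypothetical, and check_scheduled uses an underscore only for illustration). Filtering on "defined" keeps the scheduled search, while "missing" would keep the unscheduled one instead.

```spl
| makeresults count=2
| streamstats count AS n
| eval title=if(n=1, "scheduled example", "unscheduled example")
| eval next_scheduled_time=if(n=1, "2024-01-01 00:00:00 UTC", "")
| eval check_scheduled=if(match(next_scheduled_time, ".+"), "defined", "missing")
| search check_scheduled="defined"
| table title, next_scheduled_time, check_scheduled
```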

"check-hidden="missing" filters out searches using tag, eventtypes, and macros - things where index or sourcetype might be defined"

I have another alert called "SearchHeadLevel - Scheduled searches not specifying an index macro version"; however, it's more complicated and needs a bit more work to get running compared to the version I've pasted already. Note that my current searches don't always cater for "IN", as the app was started on 6.5.x and IN was added later, so only some of my searches have been updated. (After this discussion I might update the ones for finding the wildcarded index :))

Re: What is the best way to find searches without sourcetype or index defined?

Ultra Champion

Ha ha. I feel silly for forgetting the title and filtering to unscheduled searches. Good catches!

Re: What is the best way to find searches without sourcetype or index defined?

Ultra Champion

@gjanders - I'm hooking you up with some karma or something. You taught me about the IN Operator! That slipped past me so thank you for teaching me!
