Splunk Search

Looking for feedback on SPL design using map command (concerns about performance and safety)

Kimiko
New Member

Hi Splunk Community,

I have created the following SPL for scheduled alerts. Some parts are masked for confidentiality, but the structure is as follows:

| inputlookup my_lookup_table
| eval chk_ignore_from=if(Ignore_From!="", strptime(Ignore_From,"%Y/%m/%d %H:%M:%S"), null())
| eval chk_ignore_to =if(Ignore_To!="", strptime(Ignore_To,"%Y/%m/%d %H:%M:%S"), null())
| where isnull(chk_ignore_from) OR isnull(chk_ignore_to)
OR (now() < chk_ignore_from OR now() >= chk_ignore_to)
| where isnotnull(Macro) AND Macro!=""
AND isnotnull(Alert_Mail_Send_To) AND Alert_Mail_Send_To!=""
AND isnotnull(Alert_Mail_Title) AND Alert_Mail_Title!=""
AND isnotnull(Alert_Mail_Body) AND Alert_Mail_Body!=""
| eval Macro=trim(Macro)
| eval Macro=case(match(Macro,"^[\"'].*[\"']$"), substr(Macro,2,len(Macro)-2), true(), Macro)
| map maxsearches=500 search="search `$Macro$` | fields _time _raw host
| eval target_host=\"...\", Alert_Mail_Send_To=\"...\", Alert_Mail_Title=\"...\", Alert_Mail_Body=\"...\", Macro=\"$Macro$\"
| fields _time _raw host Alert_Mail_Send_To Alert_Mail_Title Alert_Mail_Body Macro target_host"
| dedup Alert_Mail_Body

My main concern is the use of the map command. I know map can be risky because it runs multiple searches and could cause performance issues if not controlled properly.
Questions:

  1. Are there best practices for using map in scheduled searches to avoid excessive search executions?
  2. Besides maxsearches, is there any recommended way to limit or safeguard map usage in Splunk Cloud?
  3. From a long-term operation perspective, do you see any potential issues with this SPL design?

Any feedback or suggestions would be greatly appreciated!


bowesmana
SplunkTrust

Aside from the observation that map is generally not a go-to solution for most problems: given that all your macro searches are expected to return _time, _raw, and host, could you condense all the macros into a single statement and make the whole thing a subsearch, e.g.

[
  | inputlookup my_lookup_table
  | eval chk_ignore_from=if(Ignore_From!="", strptime(Ignore_From,"%Y/%m/%d %H:%M:%S"), null())
  | eval chk_ignore_to =if(Ignore_To!="", strptime(Ignore_To,"%Y/%m/%d %H:%M:%S"), null())
  | where isnull(chk_ignore_from) OR isnull(chk_ignore_to)
OR (now() < chk_ignore_from OR now() >= chk_ignore_to)
  | where isnotnull(Macro) AND Macro!=""
AND isnotnull(Alert_Mail_Send_To) AND Alert_Mail_Send_To!=""
AND isnotnull(Alert_Mail_Title) AND Alert_Mail_Title!=""
AND isnotnull(Alert_Mail_Body) AND Alert_Mail_Body!=""
  | eval Macro=trim(Macro)
  | eval Macro=case(match(Macro,"^[\"'].*[\"']$"), substr(Macro,2,len(Macro)-2), true(), Macro)
``` Combine them into multiple OR search conditions ```
  | stats values(Macro) as search 
  | eval search="(".mvjoin(search, ") OR (").")" 
]
...

That assumes that the macros represent simple search criteria and do not include any kind of pipes or other processing in the search, as they would then not combine to a single set of OR conditions.

Then you don't have to use map at all.
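To make that concrete, here is a sketch of how the subsearch plugs into an outer search (the index= scope is a placeholder assumption, and the filtering steps are abbreviated). Because the subsearch returns a field literally named search, Splunk expands its value into the outer search as search terms:

index=your_index
    [ | inputlookup my_lookup_table
      ``` same ignore-window and non-empty-field filtering as above ```
      | stats values(Macro) as search
      | eval search="(".mvjoin(search, ") OR (").")" ]
| fields _time _raw host

One scheduled search then covers all macros in a single pass over the data, instead of one search per lookup row.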

 

yuanliu
SplunkTrust

As hinted by @PickleRick, the best practice for using map - anywhere - is to not use it. Performance or safety is perhaps the least of the concerns.

From your convoluted illustration, it seems that my_lookup_table contains a field named Macro, plus a number of fields that you also have to manipulate in SPL, including a pair of formatted date+time values and several text values. As with any such use of a lookup, the first question is: who produces this lookup? Is the content under your control (by design), or is it some sacred input you cannot change? (Most of the time, even if it comes from an external input, you can substitute a final lookup that is under your control.)

Then, the next question is: Why map a lookup?  It seems to be an effort to reduce maintenance cost.  But the way code works, it is more obscure, hence more difficult to maintain.  Why not just write a giant macro with all the conditions?  Is it really cheaper to maintain a somewhat obscure lookup plus a very obscure map structure? (You can invoke macros inside a macro.)
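To illustrate the nested-macro idea (a sketch; the macro names below are hypothetical, and real definitions live in macros.conf or under Settings > Advanced search > Search macros):

# macros.conf -- names are hypothetical examples
[error_condition_a]
definition = index=app_a sourcetype=app_log "ERROR"

[error_condition_b]
definition = index=app_b sourcetype=svc_log "FATAL"

[all_error_conditions]
definition = (`error_condition_a`) OR (`error_condition_b`)

The scheduled search then reduces to search `all_error_conditions` | ... with no lookup and no map.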

BTW, the beginning of the search can be simplified to

| inputlookup my_lookup_table where Macro=* Alert_Mail_Send_To=* Alert_Mail_Title=* Alert_Mail_Body=*
| eval chk_ignore_from=strptime(Ignore_From,"%Y/%m/%d %H:%M:%S")
| eval chk_ignore_to =strptime(Ignore_To,"%Y/%m/%d %H:%M:%S")
| where isnull(chk_ignore_from) OR isnull(chk_ignore_to)
  OR (now() < chk_ignore_from OR now() >= chk_ignore_to)

 


PickleRick
SplunkTrust

I'm not sure what it is supposed to do; I'd assume it's meant as some form of additional control over predefined searches. Apart from the practical/performance issues (yes, you are right, map is usually not the way to go), there is also the question of accountability for the searches actually being run, and of permissions for those searches.

As for the search itself - the use of dedup will _probably_ (I don't know the intended logic of your search) not yield the results you expect.


Kimiko
New Member

Thank you for your feedback!
Let me clarify the purpose of this SPL:

  • Goal: The lookup CSV contains about 100 rows, each defining one macro. Each macro includes a predefined SPL search. When the alert runs, it needs to execute all these macros and collect the results.
  • Why map is used: To run multiple macros in a single scheduled search. Each row from the lookup expands into a macro call, and the results are combined.
  • Why dedup is used: If the same error log occurs repeatedly within the last 5 minutes (configured in the alert schedule), we want to consolidate those into one alert to avoid sending multiple identical emails.
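For reference, the consolidation goal above could also be expressed with stats instead of dedup (a sketch using field names from the search above); stats keeps a count of the suppressed duplicates rather than silently discarding them:

| stats count AS occurrences latest(_time) AS _time values(host) AS host
        BY Alert_Mail_Send_To Alert_Mail_Title Alert_Mail_Body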

My question:

  • Are there best practices or recommended alternatives for using map in scheduled searches?
    (Especially to mitigate the risk of excessive search executions.)
  • Could you point out any areas in the overall design that might not follow Splunk best practices or could be improved?
    For example, potential performance issues with large data sets or long-term operation, whether the use of map is appropriate, and if this approach aligns with Splunk’s recommended methods.

Any advice would be greatly appreciated!
