I'm having issues where the map command returns an error when there are no results from the main query. In my use case, this is a perfectly reasonable scenario.
If the initial search returns no results, the map command shouldn't need to execute at all; instead it tries to run its subsearch with no token values substituted.
index=myIndex value1!=True | stats count by siteID | map [search index=myIndex earliest=-2d value2!=True siteID=$siteID$ | stats latest(_time) as lastContact by siteID, siteName, region, siteType]
So if the main search doesn't return anything, I get an error:
Error in 'map': Did not find value for required attribute 'siteID'.
This wouldn't be too bad, but the results from this search are being appended to another search, and this error causes both searches to fail, regardless of whether the first search was successful or not.
Any pointers on getting this map command to play nice would be greatly appreciated.
index=myIndex value1!=True | stats count by siteID | map search="search index=myIndex earliest=-2d value2!=True siteID=\"$siteID$\" | stats latest(_time) as lastContact by siteID, siteName, region, siteType"
Note: this applies when siteID is a character string.
Thanks for the response, although this doesn't seem to fix my issue. siteID will always be a 5-digit integer (or nothing, if there are no results in the main search).
I inserted a dummy row; when siteID is "*", all sites are matched.
index=myIndex value1!=True| append [search |noop|stats count|eval siteID="*"] | stats count by siteID | map search="search index=myIndex earliest=-2d value2!=True siteID=$siteID$ | stats latest(_time) as lastContact by siteID, siteName, region, siteType" | eventstats count as all | where (all=1 and siteID="*") OR (all>1 and siteID!="*")
Well, you can solve the current problem with a simple fillnull:
index=myIndex value1!=True | stats count by siteID | fillnull value="" siteID | map [search index=myIndex earliest=-2d value2!=True siteID=$siteID$ | stats latest(_time) as lastContact by siteID, siteName, region, siteType]
That will eliminate the errors for the search code as it is currently written.
However, I strongly suspect there is a better way to structure this code so that it's not using map at all, especially if (as the snippet suggests) your map subsearch is iterating over the same indexed data as the primary search that feeds it. Without any other context, my intuition is that you're finding siteID values in the primary search from a different time window than the one you're using in your mapped subsearch. I can totally relate to the desire to structure code like this, but it's actually not the efficient way to do things in Splunk. If you'd like help rewriting this so that your search is more efficient and less brittle, feel free to post more details (either here in response to my post, or in a new "How do I make this search more efficient?"-type post).
But, as an FYI, if you decide to stick with map here, you'll want to add maxsearches=x, where x is the maximum number of iterations you want that map command to run. If you want to live dangerously and allow it to run for as many siteID values as the primary search finds, you can use maxsearches=0. If you don't specify a value for this attribute, the map command will max out at 10 iterations of the subsearch.
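As a sketch, using the search from earlier in this thread, the attribute goes directly on the map command itself. (maxsearches=0 removes the 10-iteration cap, so only use it when the primary search is guaranteed to return a bounded number of siteID values.)

```
index=myIndex value1!=True
| stats count by siteID
| map maxsearches=0 search="search index=myIndex earliest=-2d value2!=True siteID=$siteID$
    | stats latest(_time) as lastContact by siteID, siteName, region, siteType"
```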
Such a simple solution, thanks so much. And yes, you're right: I'm using the map command primarily because I don't want the main search to run over two days' worth of data, because in that time there are almost a million events. It seemed faster to wait until the search had been narrowed down to the point where it was only searching for one or two sites rather than all of them. That said, the search does feel slow and inefficient, and I would like to improve it.
Basically, this search is looking for devices that have gone offline, and then the map command searches a maximum of two days back to find out how long it has been since the device was last online. So I only want recent events for the first part of the search, and historical data for the second part, in order to create a downtime column in the final results. Any ideas?
Thanks again for your answer.
Glad to help! Here's what makes map a terribly inefficient way to do almost anything: when Splunk runs a search, it allocates a decent number of distinct resources on the search head, including a processor core, to that particular search. When you use map, the primary search produces some number of results to feed into the map subsearches, and then Splunk creates a whole new search for each of those results. So if the primary search returns four values for siteID, it's NOT as if Splunk is running:
index=myIndex earliest=-2d value2!=True siteID=value1 OR siteID=value2 OR siteID=value3 OR siteID=value4
Rather, Splunk is running four individual searches:
index=myIndex earliest=-2d value2!=True siteID=value1
index=myIndex earliest=-2d value2!=True siteID=value2
index=myIndex earliest=-2d value2!=True siteID=value3
index=myIndex earliest=-2d value2!=True siteID=value4
...with all the resource allocation that those searches imply.
So one way to rewrite the search would be to reverse the order of primary search vs. subsearch. The way your post is written, I know what the intended timeframe was for the original subsearch but not for the original primary search, so I'll structure this as though your original primary search was looking at the last hour. Adjust to your actual needs accordingly.
index=myIndex earliest=-2d value2!=True [ search index=myIndex value1!=True earliest=-1h | fields siteID | format ] | stats latest(_time) as lastContact by siteID, siteName, region, siteType
This approach will still search the smaller time window first, but (assuming the subsearch returns 4 values), it will expand out to:
index=myIndex earliest=-2d value2!=True siteID=value1 OR siteID=value2 OR siteID=value3 OR siteID=value4 | stats latest(_time) as lastContact by siteID, siteName, region, siteType
So that's already a big improvement. If the subsearch always completes quickly and returns a small number of values, that should work reasonably well.
While you're trimming, you might also try replacing those instances of value1!=True and value2!=True. It is almost always faster to run a Splunk search for something positive (e.g. value1=False) than to search for the negation of something. If the list of possible values for value1 and value2 is short, you can try running your search the way it's written and then run it with value1=False OR value1=KindOfFalse OR value1=NotTotallyTrue (or whatever values value1 might take that preclude value1=True), and see if this speeds up the search. It doesn't always, but it often does.
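Applied to the rewritten search above, the idea looks roughly like this. (The False and KindOfFalse values are hypothetical placeholders; substitute whatever values value1 and value2 actually take in your data.)

```
index=myIndex earliest=-2d (value2=False OR value2=KindOfFalse)
    [ search index=myIndex (value1=False OR value1=KindOfFalse) earliest=-1h
      | fields siteID
      | format ]
| stats latest(_time) as lastContact by siteID, siteName, region, siteType
```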
I had no idea you could expand searches like that. That is incredibly useful! I implemented your suggestions and the search runs much faster (also with less headache).
Thank you again for your help. I'll definitely be using this more in the future.
This actually also fixed another issue I was having where some sites wouldn't appear if they hadn't been online at some point during the last 2 days.
Really glad to help! If you have other searches that are taking longer (or taking more resources) than you think makes sense, feel free to ask on here. There are lots of us who enjoy helping/explaining and making searches run more efficiently!
In a similar vein, if you are not using a stats command, you can simply append a makeresults subsearch to create a dummy result to feed to map:
| append [| makeresults | eval siteID="DUMMY"]