I have two sets of IIS data (two sourcetypes) in a single index. One sourcetype logs web service requests, the other is a once-daily dump of the available services and their base URIs (app pools and apps). I have two working queries as shown below, and I'm trying to figure out how to combine them -- or if that is even possible. (Trouble is, I'm new to Splunk but I've been a programmer for decades -- trying to wrap my head around "the Splunk way"...)
Returning servers and services with 25+ errors:
index=iis sourcetype=requests sc_status=503 | rex field=cs_uri_stem "(?i>(?<ServiceURI>^\S*\.svc\/)" | stats count by host,ServiceURI | where count > 25 | eval ErrCount=count,ErrHost=host | table ErrCount,ErrHost,ServiceURI
Results look like this:
Querying the once-daily service data is a bit more tricky -- each server can host multiple services, and each service can expose multiple base URIs which are stored as a comma-delimited list. So retrieving the service data for a specific server looks like this:
index=iis sourcetype=services earliest=-24h host=server456 | makemv delim="," RootURIs | mvexpand RootURIs | table AppName,RootURIs | where RootURIs != ""
Results look like this:
Both searches produce relatively low row-counts. I've seen the first produce maybe 15 or 20 hits max on a really bad 15 minute period (these will become alerts). The app info for a really large server might produce a maximum of maybe 200 root URIs. Both execute very quickly (a few seconds, tops).
So my goals are:
1. feed the server names from the first search (503 errors) into the second search,
2. match the start of the first search's ServiceURIs to specific AppNames and RootURIs from the second search,
3. output the fields listed in the table statements in both searches
Is this even remotely possible or realistic? I tried to do this via subsearches, but I think the subsearch has to produce just one result. With my developer and SQL backgrounds, a join was the obvious solution but I couldn't figure out how to get other fields (like the ServiceURI) into the joined search -- I thought I could use $ServiceURI$ but it didn't seem to work. I saw an intriguing comment that the splunk-way is to "collect all the events in one pass and then sort it out in later pipes with eval/stats and friends" but even if that's a good approach, I suspect the volume of data makes this a bad idea (millions of requests logged during a 15 minute window).
Help an old code-monkey discover The Splunk Way!
Thanks for the pointers. I ended up using map, which I've used in the past to iterate over array data. It was the reason I was thinking I could use $xxx$ references in later searches. I'd appreciate any feedback on these end results.
index=iis sourcetype=requests sc_status=503 | rex field=cs_uri_stem "(?i>(?<ServiceURI>^\S*\.svc\/)" | stats count by host,ServiceURI | where count > 25 | eval ErrCount=count,ErrHost=host | fields ErrCount,ErrHost,ServiceURI | map search=" search index=iis sourcetype=services earliest=-24h host=$ErrHost$ | makemv delim=\",\" RootURIs | mvexpand RootURIs | where RootURIs != \"\" | eval restart=if(like($ServiceURI$,RootURIs.\"%\"),1,0) | where restart=1 | eval ErrCount=$ErrCount$,ErrHost=\"$ErrHost$\",ServiceURI=\"$ServiceURI$\" | table ErrCount,ErrHost,ServiceURI,AppName,RootURIs" maxsearches=200
I spent a lot of time trying to figure out how to do a (sourcetype=X or sourcetype=Y) type of thing, but I have the feeling the need to use earliest=-24h in that second part of the search introduces some limitations as far as my options go for how to accomplish this.
It also "feels wrong" to need that last eval to re-create the desired fields from the first search, but it was the only thing I tried that works.
I really hate the way map requires a string representation of the secondary search -- my real secondary search has to apply some nasty regex to clean up that RootURIs list and escaping the already-escaped quotes and slashes is somewhat nightmarish and hardly yields anything human-readable.
The "map" command is even worse than join 😉
But seriously, the optimizer sometimes simplify it but in general, the map command will spawn a new search for every row of results in your main search. So that's a big performance no-no.
But the problem, which I didn't notice at the beginning is that that might not be that easy altogether because as I understand, you want a partial match, right? So you want to match AppName "pool2", which has among its RootURIs a value of "/app2/" with ErrHost server456 which has ServiceURI starting with "/app2/", right? That's not that easy - simple exact match would be a pretty easy thing to do - just append both searches if you can't generate the data any prettier way, rename some fields (for example ServiceURIs -> RootURIs) to get a common point and do stats values(*) by RootURIs. But if you want a "starts with" match... that's not that easy.
Thanks, that's correct, I need partial matching on top of everything else.
In this case I think map-spawned searches won't be an issue, fortunately we rarely have more than one or two services getting into a bad state at the same time, but I will certainly keep that in mind.
Good to know it's at least a hard problem. 🙂
I appreciate all the help.
One way to look performance of your query https://conf.splunk.com/files/2020/slides/TRU1143C.pdf
Even you are asking how to combine these queries together can you think that you schedule that 24h part and put its results to lookup and then use that lookup on another query?
The "pipe" character used in splunk searches is a very good choice because indeed it works very similarily to shell pipelines. If you've ever done some scripting, that's the same approach. You modify the data "on the fly" and whenever you pass past a pipe sign, you only get what you deliberately put through that pipe, and forget about what happened before.
So if you do
something1 | something2 | something3
At something3 you only have what you got from something2. You have absolutely no idea what that "something1" was, what it returned, in what order and whatnot.
The presentation @isoutamo pointed you to is a great source of knowledge and it's a great step in splunk searching proficiency to internalize the approach of joining data sets without actually using join.
Usually it boils down to clever stats-ing.
this should be a doable task.
Here is some links how you can try to do it