It's also worth explaining why the where command is usually much slower than adding another condition to the original search (or adding another search command to the pipeline).

Firstly, Splunk is relatively smart: when it sees

    search condition1 | search condition2

it internally optimizes it and treats it as

    search condition1 AND condition2

But that's a minor point here. The major point (and it's really important for understanding why some things work faster in Splunk than others) is _how_ Splunk searches the indexes for data.

Your typical "other solution" (like an RDBMS or some object database which indexes documents) splits the data into discrete fields on ingestion and holds each of those fields in a separate "compartment" (we can call them columns in a database table, we can call them object properties, it doesn't matter here). So when you have to look for a key=value pair, the solution looks into the "drawer" called "key" and looks for "value".

Splunk (mostly; the exception being indexed fields) works the other way around. It splits the input data into tokens and stores the "values" in the form of those tokens. When you search for a key=value condition, it finds all events containing the "value" token and parses all of them to see whether the value is in the proper place within the event to match the defined extraction for key.

Of course, the more values you're looking for (because you have separate conditions for many fields containing separate values, like key1=value1 AND key2=value2 AND key3=value3 and so on), the lower the count of events containing all those values at the same time, and the fewer events Splunk has to actually parse to see if those field definitions match what you're searching for. So if you're adding more conditions to your search with AND, you're telling Splunk to consider fewer and fewer events.

But where does not work like that.
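Before getting to where, here is the token lookup described above as a rough sketch in Python. It is only a toy model of the idea (look up tokens first, parse only the candidates afterwards), not Splunk's actual index format or segmentation rules. The events, the tokenizer and all function names are made up for illustration.

```python
import re
from collections import defaultdict

# Hypothetical toy events; each is a run of key=value pairs separated by spaces.
EVENTS = [
    "EventCode=4799 EventRecordID=461117 user=alice",
    "EventCode=4624 EventRecordID=4799 user=bob",
    "EventCode=4799 EventRecordID=461118 user=carol",
    "EventCode=4625 EventRecordID=461119 user=dave",
]


def tokenize(raw):
    """Split a raw event into tokens on anything that is not a letter or digit."""
    return set(re.findall(r"[A-Za-z0-9]+", raw))


def build_index(events):
    """Map each token to the set of event ids that contain it."""
    index = defaultdict(set)
    for event_id, raw in enumerate(events):
        for token in tokenize(raw):
            index[token].add(event_id)
    return index


def extract_fields(raw):
    """Search-time field extraction: parse key=value pairs out of the raw text."""
    return dict(pair.split("=", 1) for pair in raw.split())


def search(events, index, conditions):
    """Return (matching event ids, number of events that had to be parsed)."""
    # Step 1: intersect the token sets for every requested value.
    candidates = set(range(len(events)))
    for value in conditions.values():
        candidates &= index.get(value, set())
    # Step 2: parse only the candidates and check the value sits in the right field.
    matches = [
        event_id
        for event_id in sorted(candidates)
        if all(extract_fields(events[event_id]).get(k) == v for k, v in conditions.items())
    ]
    return matches, len(candidates)


INDEX = build_index(EVENTS)
print(search(EVENTS, INDEX, {"EventCode": "4799"}))
# ([0, 2], 3)
print(search(EVENTS, INDEX, {"EventCode": "4799", "EventRecordID": "461117"}))
# ([0], 1)
```

With one condition, three events contain the token 4799 and all three have to be parsed, even though in one of them the value sits in EventRecordID and not in EventCode. With the second condition added, the token lookup alone narrows it down to a single event before any parsing happens.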
Where works only as a streaming command and has to process all the events that come from the preceding command(s).

So for example, if you have 100 thousand events in your index, of which 10000 contain "value1" and 10000 contain "value2" (1000 of them overlap and contain both of those values), and you're searching for

    index=myindex key1=value1 key2=value2

Splunk only has to parse the 1000 events which contain both values at the same time to find out whether they contain them in the places corresponding to key1 and key2 respectively.

But if you do

    index=myindex key1=value1 | where key2="value2"

Splunk has to parse all 10000 events containing value1 to see if they match key1. Then the where command has to go through the whole resulting set to pick out the events where key2="value2".

Even worse, if you just did

    index=myindex | where key1="value1" AND key2="value2"

Splunk would have to read all 100k events from your index and parse those two fields out of them to later compare their values with the given condition.

To show you what difference that can make, here's an example from my home lab box.

    index=winevents EventCode=4799 EventRecordID=461117

I ran this search over the last 30 days.

    This search has completed and has returned 1 results by scanning 1 events in 0.278 seconds

EventRecordID is a pretty unique identifier, so Splunk only had a single record to check.

If we move this condition to the where part

    index=winevents EventCode=4799 | where EventRecordID=461117

we get

    This search has completed and has returned 1 results by scanning 9,768 events in 1.045 seconds

As you can see, Splunk had to do much more work because I had 9768 events which matched the value 4799 (and from the further job inspection, which I'm not pasting here, I see that all of them were in the EventCode field), and all those events had to be processed further by the where command.
It's still relatively fast, because 10k events is not that much, but it's about 4 times slower (the difference on bigger sets would be more noticeable; here a big part of the time is spent just spawning the search).

If we move both conditions to the where part:

    index=winevents | where EventCode=4799 AND EventRecordID=461117

we still get the same 1 result, which is not surprising, but...

    This search has completed and has returned 1 results by scanning 63,740 events in 6.017 seconds

I have exactly 63740 events in the winevents index, and they all had to be parsed and processed further down the pipeline by the where command. And it's no wonder that, since there are about 6 times more events to process than in the previous variant, it took about 6 times as much time.

So yes, where is a fairly sophisticated and flexible command letting you do many things that the ordinary search command won't, but the tighter you can "squeeze" your indexes with the initial search, the better the overall performance.
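If you want to play with the numbers from the hypothetical 100k-event index earlier in this post, the three search shapes boil down to simple set arithmetic. The sketch below is just that arithmetic in Python, assuming the only thing that matters is how many events survive the token lookup of the base search. Which event ids contain which token is made up so that the counts match the example, and events_to_parse is a hypothetical helper, not anything Splunk exposes.

```python
# Hypothetical event ids 0..99,999; the id ranges are chosen only so that the
# counts match the example: 10,000 with value1, 10,000 with value2, 1,000 with both.
ALL_EVENTS = set(range(100_000))
HAS_VALUE1 = set(range(0, 10_000))
HAS_VALUE2 = set(range(9_000, 19_000))


def events_to_parse(base_search_tokens):
    """Count events left after the token lookup of the base search.

    Every one of them has to be parsed (and, if a where follows, handed
    down the pipeline), so this is a rough proxy for the work done.
    """
    candidates = set(ALL_EVENTS)
    for token_events in base_search_tokens:
        candidates &= token_events
    return len(candidates)


# index=myindex key1=value1 key2=value2
both_in_search = events_to_parse([HAS_VALUE1, HAS_VALUE2])
# index=myindex key1=value1 | where key2="value2"
one_in_where = events_to_parse([HAS_VALUE1])
# index=myindex | where key1="value1" AND key2="value2"
all_in_where = events_to_parse([])

print(both_in_search, one_in_where, all_in_where)
# 1000 10000 100000
```

Each condition moved from the base search into where removes one intersection step, and the number of events to parse jumps by a factor of ten in this made-up data set. Real timings won't scale that cleanly, as the lab numbers above show.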