Difference between WHERE and SEARCH commands

tsunamii
Path Finder

What are the differences between "where" and "search"? I read somewhere that "search" tends to cause more overhead. The search below, run over one day of netflow data, takes more than 24 hours to complete.


index=proxy* s_op=GET | lookup geoip clientip as d_ip | where client_country="Russian Federation" OR client_country="Ukraine" OR client_country="Romania" OR client_country="Bulgaria" OR client_country="Latvia" OR client_country="Azerbaijan" OR client_country="Kazakstan" OR client_country="Macedonia" OR client_country="Serbia" | table _time c_ip d_ip r_host client_country client_city cs_bytes d_port cs_uri referer c_agent

1 Solution

martin_mueller
SplunkTrust

I'm going to guess that the search takes that long because it's reading a boatload of events off disk and performing the lookup on every one of them, only to then possibly throw most of them out. The where (or search) after that isn't going to add much to the runtime of that pipeline.
What kind of lookup is that, scripted? How many events are you loading? What are you actually looking for as a result? Could you possibly pre-aggregate the data before looking up the location? Have you considered using the Splunk 6 iplocation command to maybe speed up the lookup process?
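To illustrate the pre-aggregation idea, here is a rough sketch rather than a tested fix. It assumes a per-destination summary is an acceptable result, since stats drops per-event fields such as cs_uri and referer. The output names requests and total_cs_bytes are made up for the example, and the geoip lookup is assumed to behave as in the original search. Only two of the countries are shown to keep it short.

```
index=proxy* s_op=GET
| stats count AS requests, sum(cs_bytes) AS total_cs_bytes BY d_ip
| lookup geoip clientip AS d_ip
| where client_country="Russian Federation" OR client_country="Ukraine"
```

This way the lookup runs once per distinct d_ip instead of once per event. A variant using iplocation might look like the following. iplocation writes its result to a field named Country, and the country strings it returns may not match the ones the geoip lookup produces, so treat the values here as placeholders to check against your own data.

```
index=proxy* s_op=GET
| stats count AS requests BY d_ip
| iplocation d_ip
| search Country="Russia" OR Country="Ukraine"
```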

As for the question from the title, search and where as a filter further down the pipeline mostly differ in what they can do, and how. where only evaluates boolean expressions, so to do a wildcard filter you have to explicitly call match(), while search can just do field=value*. I doubt there's a significant performance difference between the two when they do the same filtering, especially compared to the cost of loading the events at the start of the pipeline.
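As a small illustration of that difference, the filters below all try to keep events whose r_host ends in ".ru". The pattern is just an example value, not something from the original question. With search the wildcard goes straight into the comparison:

```
index=proxy* s_op=GET | search r_host=*.ru
```

With where you need an eval function, either match() with a regular expression or like() with SQL-style wildcards:

```
index=proxy* s_op=GET | where match(r_host, "\.ru$")
index=proxy* s_op=GET | where like(r_host, "%.ru")
```

One thing to watch for: search compares field values case-insensitively, while where comparisons are case-sensitive, so the versions above are not guaranteed to return identical results on mixed-case data.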
