Splunk Search

Using lookups to augment searches with additional filters

zapping575
Path Finder

I sometimes need to make some changes to my eventtype definitions.

However, I do not actually want to edit the query in the eventtype definition directly. 

It appears that the following is a viable solution:

  • Create a new lookup (eventtype-filter.csv) with only two columns: eventtype, qry
  • In the search that uses the eventtypes, add this
index=etc tag::eventtype=some_types
| lookup eventtype-filter.csv eventtype OUTPUT qry
| eval filtered = if(searchmatch("qry"), "true", "false") 
| where filtered = "true"

The value for qry would look something like this: NOT src=asdf AND NOT DST=qwer (I know this could be simplified)
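
For concreteness, a minimal eventtype-filter.csv under this scheme could look like the following (the eventtype names and the second row are placeholders, not my real definitions):

eventtype,qry
some_eventtype,NOT src=asdf AND NOT DST=qwer
another_eventtype,NOT src=qwer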

Now this seems to be working fine but I noticed a few things:

  • The OUTPUT field from my lookup has to be put into quotes when passing it to searchmatch(); otherwise searchmatch() complains that its input is invalid
  • The actual value for qry (stored in the lookup) should also not contain any quotes, as these also caused errors

My questions would be:

  • Is anyone else using this approach, and can you confirm that it is viable and not "some hack"?
  • In case I need to use quotes in my qry value, are there any pitfalls to look out for?

yuanliu
SplunkTrust

Reading searchmatch, it is clear that it only accepts a literal string as its argument.  To dereference your qry field into a literal string, you need a different mechanism.  Like most people in this forum, I usually tell people not to use map, but this may be one occasion where it solves multiple potential issues related to escaping quotation marks.

| inputlookup eventtype-filter.csv
| map search="search index=etc tag::eventtype=some_types tag::eventtype=$eventtype$
  | where searchmatch($qry$)"
  1. The dual use of tag::eventtype is to consider that "some_types" may not include all eventtypes in that lookup.
  2. The searchmatch function returns a boolean, so there is no need to construct a filtered field (see the one-line sketch below).
  3. Consider @PickleRick 's warning about performance.
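
As a one-line illustration of point 2, reusing the literal qry value from the original post, the eval/where pair in the original search could collapse to:

| where searchmatch("NOT src=asdf AND NOT DST=qwer")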

To test this technique against varied escape conditions, I use a dummy lookup with name eventtype-filter.csv and the following content:

search, "info IN (completed, granted)"
system, info != "granted"
splunk_instrumentation, info=completed

The qry values are designed for index=_audit.  Then, use the following emulated search:

| inputlookup eventtype-filter.csv
| map search="search index=_audit app=$eventtype$ | where searchmatch($qry$)"
| rename app as eventtype
| stats count by eventtype info

A sample output is

eventtype               info       count
search                  completed  127
search                  granted    37
splunk_instrumentation  completed  66
system                  completed  52

 Without the filters, distribution is like

eventtype               info         count
*                       granted      41
search                  bad_request  8
search                  canceled     24
search                  completed    138
search                  failed       1
search                  granted      40
search                  success      3
splunk_instrumentation  completed    66
splunk_instrumentation  granted      46
system                  completed    54
system                  granted      55

tscroggins
Influencer

Hi @zapping575 ,

It could be made to work correctly with the searchmatch function, but it's slower than dynamically modifying the base search.

Although the eventtype is duplicated in both the base search and the subsearch, this is faster:

index=etc eventtype=some_types 
    [| inputlookup eventtype-filter.csv where eventtype=some_types 
    | fields qry 
    | return $qry ]

You can add syntactic sugar by using a macro:

index=etc `eventtype-filter(some_types)`
# macros.conf
[eventtype-filter(1)]
args = eventtype
definition = eventtype=$eventtype$ [| inputlookup eventtype-filter.csv where eventtype=$eventtype$ | fields qry | return $qry ]
iseval = 0

 

zapping575
Path Finder

You are assuming that I pass an actual eventtype into the base query (eventtype=some_types).

I am afraid that I cannot "economically" do that. In a nutshell, this is why:

  • I have a large amount of eventtypes
  • It is a maintenance nightmare to have a savedsearch for every eventtype
  • Especially because the eventtypes are used in multiple apps and indices, which are access controlled for different user groups

Regardless, thank you very much!


tscroggins
Influencer

Hi @zapping575 

Your base search included "tag::eventtype=...," so yes, my assumption was you would be specifying event types, albeit indirectly. You won't be updating event types after they're created, so your original event type definitions will always be your baseline, and anything you define in a lookup will be additive.

Here's a real world example using the err0r event type:

# eventtypes.conf

[err0r]
color =
description =
disabled = 0
priority = 1
search = NOT sourcetype=stash (error OR failure OR fail OR failed OR fatal) NOT "not an error"
tags =

We don't want to modify the err0r event type, but we want to "un-type" events with the phrase "Incorrect function" because we have a legacy Windows COM application that interprets and logs S_FALSE as an error in the Application event log, which we index with source type WinEventLog:

11/08/2025 12:00:00 PM
LogName=Application
EventCode=1000
EventType=0
ComputerName=EXAMPLE
SourceName=MyComApp
Type=Error
RecordNumber=1234
Keywords=Classic
TaskCategory=None
OpCode=Info
Message=Incorrect function.

Your eventtype-filter.csv might look like this:

# eventtype-filter.csv

eventtype,qry
err0r,NOT "Incorrect function"

Given the following base search:

index=windows tag::eventtype=error

our desired translation based on the current eventtype-filter.csv content is:

index=windows tag::eventtype=error NOT ( eventtype=err0r NOT ( NOT "Incorrect function" ) )

This search will exclude events with event type err0r that contain the phrase "Incorrect function," but how do we construct the search? We can use a subsearch and inputlookup:

index=windows tag::eventtype=error NOT
    [| inputlookup eventtype-filter.csv 
    | eval search="eventtype=".eventtype." NOT ( ".qry." )" 
    | fields search 
    | mvcombine search 
    | eval search="( ( ".mvjoin(search, " ) OR ( ")." ) )" 
    | return $search ]

This preserves your original eventtype-filter.csv structure but generates search predicates that work within the base search. Granted, you must execute a subsearch to generate the predicates, but it's much faster than the where command for large result sets.

As before, you can wrap all or some of the NOT [| inputlookup ... ] logic in a search macro.
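
For example, a sketch of that wrapping, following the macro pattern above (the macro name eventtype-exclusions is illustrative):

index=windows tag::eventtype=error `eventtype-exclusions`
# macros.conf
[eventtype-exclusions]
definition = NOT [| inputlookup eventtype-filter.csv | eval search="eventtype=".eventtype." NOT ( ".qry." )" | fields search | mvcombine search | eval search="( ( ".mvjoin(search, " ) OR ( ")." ) )" | return $search ]
iseval = 0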

tscroggins
Influencer

If it's not clear above, tag::eventtype=error is just an example. You can use tag::eventtype=tag1,tag2,tag3,tagN and include any number of event types in the lookup file. Your search string and lispy will grow as your lookup grows, but that should be fine. Test, test, and test again.


PickleRick
SplunkTrust

It is a very bad idea. For at least two reasons.

One is maintenance and accountability: changes to the lookup contents will not be audited.

Another, probably more important for you at this point, is performance. If you have a set of search terms in the initial search, Splunk tries very hard to limit the number of events it actually has to read from the index, parse fields out of, and process down the pipeline.

With a search like this, Splunk has to read every single event from the index, parse it, and try to match it against the given search condition.

An example from my home lab. If I do

index=windows EventCode=4662

over the last 24 hours, the search ends almost immediately (after the 0.5s it takes to spawn the search) and returns nothing, because I don't have any event with such an EventCode. And since Splunk hasn't indexed anything with the term "4662", it just checks the tsidx file, sees there is no such term, and doesn't have to read a single raw event from the index file.

But if I do it "your way"

index=winevents 
| eval match=if(searchmatch("EventCode=4662"),1,0)
| where match=1

I still get no results because the events haven't suddenly magically spawned, but to come up with that, Splunk took about 2.5 seconds (on my mostly empty lab) and had to read all 23851 events in the search time range.

zapping575
Path Finder

Thank you very much for your feedback.

Regarding the maintenance:

Access to the lookup file in question is controlled and only possible for users with a specific role. It is true that tracking each change to a lookup file is probably difficult (I didn't check whether the Splunk app for lookup file editing logs these things), but since it's only accessible to a handful of users, I think this is a tradeoff that is still acceptable.

Regarding the performance:

You are 100% right that this performs way worse than your example. (I presume in your example, EventCode is an indexed field, making the search much faster.)

90% of my eventtypes are based on string matching definitions (sometimes with wildcards in them). We are not doing any field extraction at index time. 

In the query in the OP, I am passing tag::eventtype=some_types; this will effectively result in a large normalizedSearch with all the eventtypes associated with the tag concatenated together.

I am definitely not a fan of this solution either, but it beats having to update every single definition in every single app when changes occur (referring to my answer to @tscroggins ).

To limit the "footprint" of this solution, we are including this in the base search:

_index_earliest=-1h _index_latest=now

This then runs as a scheduled report every hour and writes the results to a summary index. It takes about 10 seconds to run.
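
For reference, the scheduled report is roughly equivalent to a savedsearches.conf stanza like the sketch below; the stanza name, summary index name, and dispatch window are assumptions, and the search body is simply the one from the original post with the index-time constraint added:

# savedsearches.conf
[hourly_eventtype_filter]
search = index=etc tag::eventtype=some_types _index_earliest=-1h _index_latest=now | lookup eventtype-filter.csv eventtype OUTPUT qry | eval filtered = if(searchmatch("qry"), "true", "false") | where filtered = "true"
enableSched = 1
cron_schedule = 0 * * * *
# the dispatch window must be wide enough to cover events indexed in the last hour
dispatch.earliest_time = -24h
dispatch.latest_time = now
action.summary_index = 1
action.summary_index._name = summary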

Now, since we already have all of this overhead, adding the additional filter is, from what I can see, not impacting performance any further. When I remove the filter, the search still takes about 10 seconds to run.

 

I guess that what is becoming apparent is that there is a fundamental "issue" with the way our searches (reports) are constructed, and the approach discussed in the OP is just one side effect of that.

In the past, I had to change eventtypes, savedsearches and lookups across multiple apps even though their content was all the same. This is very tedious and error prone. Coming up with a solution that allows us to keep and edit a single source of truth for knowledge objects is proving difficult to implement in Splunk without making some kind of tradeoff.


PickleRick
SplunkTrust

No. EventCode is _not_ an indexed field. That's just it. That's how Splunk search works. If Splunk sees a search condition like "EventCode=4662", it selects only those events which contain the value "4662" for further searching, and only those events are checked to see whether that value happens to be parsed out as the EventCode field. So if you have multiple search terms, the intersection of the sets of events resulting from limiting the initial data set by those sought terms can be really small, even though you're searching a broad time range (and that's also the reason why negative matching is not as effective performance-wise).

If you do filtering with searchmatch Splunk cannot use this magic to limit the scanned events. It has to parse (at least to the point when it has all needed fields) each event. It can be a humongous number of events.

Anyway, if you have _the same_ definition across different apps and eventtypes, that means something is way off in your knowledge management process. Don't try to solve policy problems with technical means and vice versa.

@tscroggins 's solution with using a subsearch to generate the search dynamically is... well, while it's technically correct and should be working, it does make your life more complicated because you have more items to manage, auditing is more complicated (if possible at all) and debugging gets a lot more frustrating.

If you need dynamic elements in your searches/eventtypes/dashboards and so on, you typically just use macros. Those macros can be made editable by a defined group of people and can contain variable elements of the search. Typical use cases here are - whitelisting some results or dynamic thresholding. Generally - that's the way of externalizing configuration of the search from its basic logic.
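
As a minimal sketch of the thresholding case, in the same style as the macro example earlier in the thread (the macro name, index, and field names here are made up, not from this thread):

index=auth action=failure | stats count by user | where count > `failed_login_threshold`
# macros.conf
[failed_login_threshold]
definition = 20
iseval = 0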

But again: if you have the same search in various places across your environment, that might call for refactoring your configuration as a whole and redesigning your maintenance process. It can benefit everyone in the end and save a lot of time and effort instead of introducing more maintenance nightmares.

bowesmana
SplunkTrust

There are some subtleties around your first paragraph. Splunk does indeed, in the EventCode case, only look at events where 4662 exists in the tsidx, but that's because EventCode is generally an alias for EventID for Windows event logs. The search therefore will have scanCount = resultCount, assuming no other search conditions.

However, if the field you are matching is a calculated field, then it cannot do that optimisation. Say for example you have a calculated field

| eval MyEventCode=random() % 10 + 9990

and in your search you do 

index=bla MyEventCode=9990

then the search will have to read every event, because 9990 does not exist in the tsidx files, so in this case resultCount will be approximately 10% of scanCount.

You can see the effect of this in search.log, the first line below being from the search for MyEventCode and the second from the search for EventCode:

base lispy: [ AND index::bla [ OR ...
vs
base lispy: [ AND 4662 index::bla [ OR...

and you can see that there is some Calculated field processing happening before this for MyEventCode.

So, like all things Splunk, you need to understand the data and, in this case, how the "fields" you are searching for are defined in any TAs that are relevant to that data.
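
For context, a calculated field like MyEventCode above would typically live in props.conf as an EVAL- setting; a sketch with a made-up sourcetype name:

# props.conf
[my_sourcetype]
EVAL-MyEventCode = random() % 10 + 9990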


PickleRick
SplunkTrust

Yup. There are border cases where there is more to life than this 😉 But generally, filtering as early as possible is the way to go. And that's the main takeaway. 🙂


zapping575
Path Finder

I think your remark regarding the knowledge management process is at the core of this problem.

The motivation, initially,  was to do things "the smart way":

  • Keep eventtype definitions in one place only
  • Favor one report that covers many similar eventtypes over one report per eventtype
  • Be able to make changes without requiring re-deployment or restarts (as noted by  @tscroggins )
  • Make changes in one place only (if required)

Trying to package all of these together (using whichever method) would, however, always end up in more complicated queries and/or definitions, which become harder to reason about. If you consider continuity, a more straightforward approach is often preferable.

Which is why we took a step back and changed the approach to something that is easier to understand but requires a lot of manual work (I am really missing the option to create/manage knowledge objects in bulk). We now have a single, dedicated report for every eventtype we are interested in, using the REQUIRED_EVENTTYPES directive to speed up the search. Any extra changes we can make in the report definition, leaving the original eventtype untouched (and in its own app).

I really appreciate everybody's thoughts and feedback. Thank you.


tscroggins
Influencer

Hi @zapping575,

The optimization consensus is (almost) always a base search with predicates, i.e., do this:

index=foo bar=baz

not this:

index=foo
| where bar=="baz"

Event types and tags are shortcuts for the former. Pay close attention to the search job inspector. The ratio of returned results to scanned results should be, generally speaking, as close to 1:1 as possible.

Writing creative SPL is fun or else we would not be here, but I will echo @PickleRick: If you find yourself modifying event types frequently, reevaluate your knowledge management practices. It may be time to pivot and adjust your source type, field extraction, event type, tag, and (hopefully) data model funnel.

But if you need to be creative, be creative! Is it easier to maintain one lookup instead of a global set of eventtypes.conf and tags.conf overrides in a lexicographically "last" app, e.g., zzz_global_types? The answer is arguable, but the primary benefit of the lookup is a restart-free update. If you manage both lookup and configuration file deployment outside Splunk, however, the operational difference may be minimal. In either case, you will need to rebuild data model summaries if you use them, and the impact to searches is semantically the same. The impact to search performance is significantly worse with the combination of the lookup and where commands.
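
For comparison, the override alternative would look something like the following sketch, reusing the err0r example from earlier in the thread (the app path zzz_global_types is illustrative):

# zzz_global_types/default/eventtypes.conf
[err0r]
search = NOT sourcetype=stash (error OR failure OR fail OR failed OR fatal) NOT "not an error" NOT "Incorrect function"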
