Solved: Pros and Cons: External lookup script vs custom search command?

Lowell · 06-22-2010

What are the pros and cons of using an external lookup script vs a custom search command when the purpose is simply to augment your results with additional fields based on a given field?

Here is my scenario: I have a field containing a hexadecimal value that encodes several bit-level fields (5 single-bit flags, some multi-bit lookups, and a multi-bit value). I've written a Python function that takes in the hex field and returns a dictionary of new fields, and now I'm wondering which approach is better.
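To make the scenario concrete, here is a rough sketch of what such a decoding function might look like. The bit layout, field names, and lookup table below are all made up for illustration; the real layout is not given in this post.

```python
# Hypothetical 16-bit layout (assumed for illustration only):
#   bits 0-4  : five single-bit flags
#   bits 5-6  : 2-bit code translated through a lookup table
#   bits 7-15 : 9-bit numeric value
FLAG_NAMES = ["flag_a", "flag_b", "flag_c", "flag_d", "flag_e"]
MODE_LOOKUP = {0: "off", 1: "standby", 2: "active", 3: "fault"}


def decode_hex_field(hex_value):
    """Return a dict of new fields decoded from a hex string such as '0x1A5'."""
    raw = int(hex_value, 16)  # accepts the value with or without a '0x' prefix
    fields = {}
    # Single-bit flags: shift each bit down to position 0 and mask it.
    for bit, name in enumerate(FLAG_NAMES):
        fields[name] = (raw >> bit) & 1
    # Multi-bit lookup: extract the code, then translate it to a label.
    mode_code = (raw >> 5) & 0b11
    fields["mode"] = MODE_LOOKUP.get(mode_code, "unknown")
    # Multi-bit value: everything from bit 7 upward, masked to 9 bits.
    fields["level"] = (raw >> 7) & 0x1FF
    return fields


print(decode_hex_field("0x1A5"))
```

Either approach discussed below would end up wrapping a function of this shape; only the plumbing around it differs.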

Lowell · 06-22-2010

I've tried both approaches and found the following:

External lookups:

Pro: Seems to be slightly faster.
Pro: Fewer inputs for the script to handle (since only unique values are passed to the script).
Con: Can't natively handle multi-valued fields. (You can return the values joined with ";" and then split them using eval, but that may not always work.)
Con: Less flexible. For example, the mapping from input to output must be deterministic (or static), which is fine for my scenario.
Con: No way to pass in authentication, which makes it difficult to make REST calls to look up configuration settings stored in Splunk, or username/password info for remote resources, for example.
Pro: You can set up a new external lookup script without restarting Splunk. (I had to trick Splunk into reloading the metadata file to pick up my [searchscripts/my_lookup.py] entry, since I don't think you can set up these permissions via the UI yet.)
Pro: The lookup can be set up to run automatically based on source/sourcetype/...
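For anyone who hasn't written one, an external lookup script is essentially a CSV filter: Splunk writes rows (with a header) to the script's stdin and reads the filled-in rows back from stdout. Below is a minimal sketch of that loop. The field names (hex_field, flag_a, level) and the stand-in decoder are assumptions for illustration, not my actual script.

```python
import csv
import io


def decode(hex_value):
    """Stand-in decoder (hypothetical layout): low bit is a flag, the rest a value."""
    raw = int(hex_value, 16)
    return {"flag_a": raw & 1, "level": raw >> 1}


def run_lookup(infile, outfile, input_field="hex_field"):
    """Read CSV rows, fill in the output columns, and write the rows back out."""
    reader = csv.DictReader(infile)
    if reader.fieldnames is None:  # empty input: nothing to do
        return
    writer = csv.DictWriter(outfile, fieldnames=reader.fieldnames,
                            extrasaction="ignore")
    writer.writeheader()
    for row in reader:
        try:
            decoded = decode(row[input_field])
        except (KeyError, ValueError, TypeError):
            decoded = {}  # leave the output columns empty if the input is unusable
        for name, value in decoded.items():
            if name in row:  # only fill columns that were present in the header
                row[name] = value
        writer.writerow(row)


# Quick local check with in-memory streams; as a lookup script this would
# instead be called with sys.stdin and sys.stdout.
demo_in = io.StringIO("hex_field,flag_a,level\r\n0x7,,\r\n")
demo_out = io.StringIO()
run_lookup(demo_in, demo_out)
print(demo_out.getvalue())
```

Note that the output is a pure function of the input column, which is the "deterministic" limitation mentioned above.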

Custom search command:

Pro: Full flexibility. Access to all fields.
Pro: Can return multi-value fields.
Con: Speed is a tad slower. (I found that enabling "streaming" improved performance by 6x on my test query, but it's still slightly slower than the "lookup" approach.) Also take a look at the v2 interface and the Python SDK for example scripts.
Con: Can't be set up to run automatically.
Con: You have to deal with getting everything set up properly via config files. Enabling getinfo does let you do more of this without as many restarts.
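By contrast, the core of a streaming custom search command is a per-event transform that sees the whole event and can attach list values. The sketch below is plain Python with no Splunk imports, so it only illustrates the shape of the logic. In the Python SDK I would expect the same loop to live in the stream method of a StreamingCommand subclass, with list values coming back as multi-value fields. The field names and flag layout are hypothetical.

```python
FLAG_NAMES = ["flag_a", "flag_b", "flag_c", "flag_d", "flag_e"]  # assumed names


def decode_flags(hex_value):
    """Return the names of the flags whose bits are set (hypothetical layout)."""
    raw = int(hex_value, 16)
    return [name for bit, name in enumerate(FLAG_NAMES) if (raw >> bit) & 1]


def stream(records, input_field="hex_field"):
    """Yield each record with a list-valued 'flags_set' field added.

    Records are handled one at a time, so nothing is buffered; this is the
    property that makes a command "streaming".
    """
    for record in records:
        try:
            record["flags_set"] = decode_flags(record[input_field])
        except (KeyError, ValueError, TypeError):
            record["flags_set"] = []  # missing or malformed input field
        yield record


events = [{"hex_field": "0x5", "host": "web01"}, {"host": "web02"}]
for event in stream(events):
    print(event)
```

Unlike the lookup script, this loop can read any other field on the event (host, in this example) when deciding what to emit.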

Please let me know if you have additional thoughts or if you find any mistakes in either of these lists.

Practically speaking, it's a good idea to wrap all of this in a macro. That way, if you ever change your mind about which approach to use, no existing searches need to change. And if your new approach breaks, you can switch back quickly.

