Splunk Search

Is it possible to use a lookup at index time, not search time?

mathiask
Communicator

Hi Splunkers

To be clear, I mean at parsing (index) time, not search time.

Is there a way to use a lookup during the parsing phase and write the result directly into the event?
For example, DHCP information, which is volatile: look up "src_ip" in a CSV (or similar) and add the result to the event as "src_name".

And can I subsequently filter out events based on that?
For example, drop the event if "src_name" = "unknown".

0 Karma

vsilchev
Explorer

In general, resolving hostnames to IPs with DHCP logs is the "reference" example of a Splunk time-based lookup. However...

Assume that, for some reason, you cannot dump and store the data required for your lookup (e.g. DHCP leases), but you can perform the lookup "at the moment" (for example, by running an external command such as "netsh"). In that case, I believe you should create a scripted input that collects/receives your events, performs the lookup (and probably additional event-processing routines), and returns the extended event records to Splunk.

The "Scripted inputs overview" section of the "Developing Views and Apps for Splunk Web" manual could be a good starting point.
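
As a rough illustration of that idea, here is a minimal sketch of such a script, assuming events arrive as lines on stdin, carry a "src_ip=<address>" key-value pair, and that the lookup is a CSV with "src_ip" and "src_name" columns. The function names, the event format, and the CSV layout are all assumptions made for the example, not anything Splunk prescribes. Splunk would index whatever the script prints to stdout.

```python
import csv
import re
import sys

# Assumed event format: a "src_ip=<IPv4>" pair somewhere in the line.
SRC_IP_RE = re.compile(r"\bsrc_ip=(\d{1,3}(?:\.\d{1,3}){3})\b")


def load_lookup(csv_lines):
    """Build {src_ip: src_name} from CSV lines with those two column headers."""
    return {row["src_ip"]: row["src_name"] for row in csv.DictReader(csv_lines)}


def enrich_and_filter(lines, lookup):
    """Yield events with src_name appended; drop events that resolve to 'unknown'."""
    for line in lines:
        event = line.rstrip("\n")
        if not event:
            continue
        match = SRC_IP_RE.search(event)
        src_name = lookup.get(match.group(1), "unknown") if match else "unknown"
        if src_name == "unknown":
            continue  # dropped before it ever reaches Splunk
        yield f"{event} src_name={src_name}"


# Hypothetical invocation: script.py /path/to/lookup.csv < events
if __name__ == "__main__" and len(sys.argv) == 2:
    with open(sys.argv[1], newline="") as csv_file:
        table = load_lookup(csv_file)
    for enriched in enrich_and_filter(sys.stdin, table):
        print(enriched)
```

Because the drop decision happens in the script, unwanted events never count against indexing volume. The trade-off is that the script itself has to keep up with the incoming event rate.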

0 Karma

jeffland
SplunkTrust

If by "volatile" you mean information that changes over time, you might want to have a look at time-based lookups. They can be used to look up different DHCP info for the same src_ip at different times.
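
To make the idea concrete, here is a small sketch of what "time-based" means, written in plain Python rather than Splunk configuration. It is not how Splunk implements the feature. It only shows the principle of picking the lease that was valid at the event's timestamp. The tuple layout and function names are assumptions for the example.

```python
from bisect import bisect_right


def build_index(leases):
    """leases: iterable of (src_ip, lease_start_epoch, src_name) tuples."""
    index = {}
    for ip, start, name in sorted(leases, key=lambda lease: (lease[0], lease[1])):
        starts, names = index.setdefault(ip, ([], []))
        starts.append(start)
        names.append(name)
    return index


def lookup_at(index, src_ip, event_time):
    """Return the name from the latest lease starting at or before event_time."""
    starts, names = index.get(src_ip, ([], []))
    pos = bisect_right(starts, event_time)
    return names[pos - 1] if pos else None


leases = [
    ("10.0.0.5", 100, "alice-laptop"),
    ("10.0.0.5", 200, "bob-phone"),
]
index = build_index(leases)
print(lookup_at(index, "10.0.0.5", 150))
print(lookup_at(index, "10.0.0.5", 250))
```

The same src_ip resolves to different names depending on the event time, which is the behavior described above.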

0 Karma

mathiask
Communicator

Thanks for that input

These solutions would work well if I could first dump everything into Splunk and then query it. That usually works within one enterprise/organisation and/or if you have sufficient storage.
But it does not work for us. Depending on the use case, we are quickly talking about multiple TB/day, only to drop 99.999% of it later ...
While this would be really fun to do ... the cost-benefit ratio is not in my favor 🙂

Therefore, I want to look up some information based on what is in the event, and then use the result to decide whether to keep or drop the event.
DHCP is just a simple example because everyone knows the problem ...

0 Karma

jeffland
SplunkTrust

What you can do with Splunk is use regular expressions to decide whether to index an event or not. If your events do not contain something you can detect with regular expressions (maybe certain subnets?), then that info needs to be added before the data reaches Splunk.

0 Karma

mathiask
Communicator

Yeah, I thought of that ... it would be like "hardcoding" the list(s) as a REGEX rather than defining a config (i.e. a lookup) and updating the list, but it could do the job ...
... provided that it scales to several hundred entries and well beyond 100k events/s
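
For illustration, this is roughly what I mean by generating the REGEX from a list instead of maintaining it by hand. It is only a sketch: the function name and sample addresses are made up, and whether such a pattern holds up at that event rate is exactly the open question.

```python
import re


def build_keep_regex(ips):
    """Return one pattern that matches any listed IPv4 address as a whole address."""
    unique = sorted(set(ips))
    if not unique:
        raise ValueError("need at least one address")
    alternatives = "|".join(re.escape(ip) for ip in unique)
    # The lookarounds stop 10.0.0.5 from matching inside 110.0.0.5 or 10.0.0.50.
    return rf"(?<![\d.])(?:{alternatives})(?!\.?\d)"


pattern = build_keep_regex(["10.0.0.5", "10.0.0.17"])
print(pattern)
print(bool(re.search(pattern, "src_ip=10.0.0.5 action=allow")))
```

The generated pattern would then go into the REGEX setting of a transforms.conf stanza that sets the queue to nullQueue or indexQueue, and it would have to be regenerated whenever the list changes.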

0 Karma

jeffland
SplunkTrust

I would not recommend that. Splunk's parsing is certainly built to handle that amount of data, and sending events to the nullQueue instead of indexing them will not cause any problems. But running such regexes over each and every individual event is not what you want to throw at this problem.
If you can, I'd suggest using some other system in front of Splunk to determine the relevant hosts and route their output to Splunk. Splunk itself is made for ingesting data and dropping individual events; it is not a platform for permanently and dynamically enabling and disabling inputs.

0 Karma

koshyk
Super Champion

I would never touch the raw event; I would retain its purity.

For volatile information, what I tend to do is index it on a daily basis into another index. In your case, I would index the DHCP information into a separate index with that day's timestamp as _time.
You can always correlate your events with this indexed volatile data at any later time.

0 Karma

lguinn2
Legend

Not as far as I know.

0 Karma