What would cause a search-time CSV lookup to only return results intermittently?

malat_UoM
Explorer

OK; this one's odd... what might cause a lookup in a search to only return results some of the time...?

Brief description:

I have a search for tracking Windows user authentications in a high-flux DHCP environment, implemented as follows:

  1. A scheduled report on the DHCP log runs every 2 minutes and outputs to a lookup table containing DHCP IP, client hostname, MAC, DHCP lease issue time and DHCP lease release time. The report search is rather convoluted: it returns DHCP ACK and RELEASE events, then marries them up with what's already in the lookup table to maintain a historic record of which IPs were issued to which client devices and when.

  2. Search Windows Security log for Kerberos and NTLM authentication events, then look up the IP (Kerberos) or hostname (NTLM) in the DHCP lookup table, and evaluate the authentication event timestamp against the lease-issued and lease-released timestamps in the lookup table to return a nice, traceable set of properties for the source of the authentication event.
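For readers following along, the matching logic in step 2 can be sketched outside Splunk. This is a minimal Python illustration, not the poster's actual search: the `Lease` fields, the `find_lease` helper, and the sample rows are all hypothetical, and it assumes timestamps are epoch seconds and that a lease with no release time is still active.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Lease:
    ip: str
    hostname: str
    mac: str
    issued: float              # lease issue time, epoch seconds (assumed)
    released: Optional[float]  # None while the lease is still active (assumed)


def find_lease(leases, event_time, ip=None, hostname=None):
    """Return the first lease matching the IP or hostname whose
    [issued, released) window contains event_time, else None."""
    for lease in leases:
        if ip is not None and lease.ip != ip:
            continue
        if hostname is not None and lease.hostname.lower() != hostname.lower():
            continue
        # An unreleased lease is treated as open-ended.
        end = lease.released if lease.released is not None else float("inf")
        if lease.issued <= event_time < end:
            return lease
    return None


# Hypothetical lookup-table contents: the same IP issued to two devices.
leases = [
    Lease("10.0.0.5", "LAPTOP-A", "aa:bb:cc:00:00:01", 1000.0, 2000.0),
    Lease("10.0.0.5", "LAPTOP-B", "aa:bb:cc:00:00:02", 2100.0, None),
]
```

A Kerberos event would be matched with `find_lease(leases, event_time, ip=...)` and an NTLM event with `hostname=...`. An event that falls in the gap between two leases returns `None`, which is one ordinary way a lookup can come back empty even though the IP is present in the table.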

The problem is that the DHCP lease table lookup only works when it feels like it: some of the time the search returns fully-populated results like it's supposed to, but most of the time it doesn't, despite the data it should hit on being present in the lookup table.

What could be causing this, and how would I even go about troubleshooting it, let alone fixing it?


woodcock
Esteemed Legend

This sounds like a race condition where you are looking into the lookup table before it is available from the last update. This is exactly the kind of thing (latency from disk writes) that the new KV Store was designed to address. I would immediately switch from using outputlookup to disk and move to KV Store; your troubles will probably go away.

The other thing that you might do is inspect your jobs, particularly the Normalized Search string, and compare ones that fail to ones that don't. Splunk does some crazy optimizations when normalizing and I have seen it do the wrong thing, especially with lookups. One way to bypass most of Splunk's reverse-lookup optimizations during normalization is to add a superfluous | search as far left in your search as you can. See if this makes any difference.
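Comparing normalized search strings by eye is error-prone, so a small diff can help. The following is a rough Python sketch rather than any Splunk tooling: it assumes you have copied the normalized search strings from two jobs by hand, the `diff_searches` helper is hypothetical, and the naive split on `|` will mis-handle pipes inside quoted strings or subsearches.

```python
import difflib


def diff_searches(good, bad):
    """Return unified-diff lines between two search strings,
    comparing them one pipeline stage at a time."""
    def stages(search):
        # Naive split on the pipe character; good enough for eyeballing.
        return [part.strip() for part in search.split("|") if part.strip()]

    return list(
        difflib.unified_diff(
            stages(good), stages(bad),
            fromfile="good", tofile="bad", lineterm="",
        )
    )
```

Calling `diff_searches(working_search, failing_search)` returns an empty list when the two strings have identical stages; otherwise, lines starting with `-` or `+` show stages that appear in only one of the two jobs, for example a lookup stage that was rewritten or dropped.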


malat_UoM
Explorer

That actually sounds none too implausible... especially since the search includes two lookups to the DHCP table: one to help pin down the DHCP lease during which the auth event was generated, based on the lease issue and release timestamps, the other to actually retrieve the client properties...

Checking the jobs log shows the scheduled search of the DHCP log is fairly speedy (runtime's always less than a minute), but the main search, especially if it covers a decent stretch of time, can take over 5 minutes...

Haven't had to use KV Store yet; off I go to do some reading.
