Splunk Search

What would cause a search-time CSV lookup to only return results intermittently?

malat_UoM
Explorer

OK; this one's odd... what might cause a lookup in a search to only return results some of the time...?

Brief description:

I have a search for tracking Windows user authentications in a high-flux DHCP environment, which I implement like so:

  1. Scheduled report on the DHCP log runs every 2 minutes and outputs to a lookup table containing DHCP IP, client hostname, MAC, DHCP lease issue time and DHCP lease release time - the report search is rather convoluted; it returns DHCP ACK and RELEASE events, then marries them up with what's already in the lookup table to maintain a historical record of which IPs were issued to which client devices and when.

  2. Search Windows Security log for Kerberos and NTLM authentication events, then look up the IP (Kerberos) or hostname (NTLM) in the DHCP lookup table, and evaluate the authentication event timestamp against the lease-issued and lease-released timestamps in the lookup table to return a nice, traceable set of properties for the source of the authentication event.
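To make the matching in step 2 concrete, here is a minimal Python sketch of the logic, not the actual SPL. The `Lease` class, the `find_lease` helper, the sample rows and the choice to treat a missing release time as "lease still active" are all assumptions made for illustration; the real lookup table may differ.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Lease:
    """One row of the (hypothetical) DHCP lease lookup table."""
    ip: str
    hostname: str
    mac: str
    issued: float               # lease issue time, epoch seconds
    released: Optional[float]   # lease release time; None = assumed still active


def find_lease(leases: List[Lease], key_field: str, key_value: str,
               event_time: float) -> Optional[Lease]:
    """Return the first lease whose key matches (case-insensitively) and
    whose issue/release interval contains event_time, else None."""
    for lease in leases:
        if getattr(lease, key_field).lower() != key_value.lower():
            continue
        # Assumption: an unreleased lease is treated as open-ended.
        end = lease.released if lease.released is not None else float("inf")
        if lease.issued <= event_time <= end:
            return lease
    return None


# Made-up sample rows: the same IP handed to two devices at different times.
leases = [
    Lease("10.0.0.5", "LAPTOP-A", "aa:bb:cc:00:00:01", 1000.0, 2000.0),
    Lease("10.0.0.5", "LAPTOP-B", "aa:bb:cc:00:00:02", 2500.0, None),
]

kerberos_hit = find_lease(leases, "ip", "10.0.0.5", 1500.0)       # keyed on IP
ntlm_hit = find_lease(leases, "hostname", "laptop-b", 3000.0)     # keyed on hostname
gap = find_lease(leases, "ip", "10.0.0.5", 2200.0)                # between leases
```

An event that falls between two leases for the same IP legitimately matches nothing, so a check like this against an exported copy of the table can help separate "no matching lease exists" from "the lookup failed to return a lease that does exist".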

The problem is that the DHCP lease table lookup only works when it feels like it - some of the time, the search returns fully-populated results like it's supposed to, but most of the time, it doesn't, despite the data it should hit on being present in the lookup table.

What could be causing this, and how would I even go about troubleshooting it, let alone fixing it?

0 Karma

woodcock
Esteemed Legend

This sounds like a race condition where you are reading the lookup table before it is available again after its last update. This is exactly the kind of thing (latency from disk writes) that the new KV Store was designed to address. I would immediately stop using outputlookup to write to disk and move to KV Store; your troubles will probably go away.

The other thing you might do is inspect your jobs, particularly the Normalized Search string, and compare the ones that fail to the ones that don't. Splunk does some crazy optimizations when normalizing and I have seen it do the wrong thing, especially with lookups. One way to bypass most of the reverse-lookup optimizations that Splunk applies during normalization is to add a superfluous | search as far left in your search as you can. See if this makes any difference.

0 Karma

malat_UoM
Explorer

That actually sounds none too implausible... especially since the search includes two lookups to the DHCP table - one to help pin down the DHCP lease during which the auth event was generated, based on the lease issue and release timestamps, the other to actually retrieve the client properties...

Checking the jobs log shows the scheduled search of the DHCP log is fairly speedy - runtime's always less than a minute - but the main search, especially if it covers a decent stretch of time, can take over 5 minutes...
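A quick back-of-the-envelope check of those numbers, as a Python sketch. The helper `report_runs_during` is hypothetical and assumes the scheduled report starts on exact multiples of its 2-minute interval, which the real scheduler may not do. It only counts how many table rewrites could begin while the main search is still running; it does not show that a rewrite actually causes the empty lookups.

```python
import math


def report_runs_during(search_start: int, search_duration: int,
                       report_interval: int = 120) -> int:
    """Count report start times t = k * report_interval (seconds) such that
    search_start <= t < search_start + search_duration."""
    first = math.ceil(search_start / report_interval)
    last = math.ceil((search_start + search_duration) / report_interval) - 1
    return max(0, last - first + 1)


# A 5-minute main search starting 10 s after a report run.
long_search = report_runs_during(search_start=10, search_duration=300)

# A 30-second main search starting at the same offset.
short_search = report_runs_during(search_start=10, search_duration=30)
```

Under these assumptions a 5-minute search always overlaps at least two report runs, while a short search can slip between them entirely, which would at least be consistent with results that come and go.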

Haven't had to use a kvstore yet; off I go to do some reading.

0 Karma