Unfortunately, our proxy data does not have user information. However, I do have access to AV data that can map a client IP to user information.
Via the "Lookup Definitions" link in the Splunk Manager I can set up the Max and Min Offset for my "enrichment data". As I understand it, these settings would be used if my "enrichment" data is in the future relative to the event. Unfortunately, my enrichment data is usually one day behind. This prevents the correct enrichment data from being applied to the events.
[av_lookup]
filename = av_lookup.csv
time_field = savreportcheckin
lookup_table = av_lookup ip_address AS c_ip OUTPUTNEW clientuser computer savreportcheckin
Enrichment data below:
clientuser,ip_address,savreportcheckin,computer
u000000,10.0.0.0,2010-07-26 22:24:00,WP103702A740532
z000000,10.0.0.0,2010-07-27 22:23:00,WP103702A740532
If I search on this event data:
2010-07-26 22:55:09 3 10.0.0.0 200 TCP_RESCAN_HIT 2597 851 GET http www.newyorklife.com 80 - - 126.96.36.199 - "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.1.4322; .NET CLR 2.0.50727; .NET CLR 3.0.04506.30; .NET CLR 3.0.04506.648; .NET CLR 3.0.4506.2152; .NET CLR 3.5.30729)" Financial%20Services - 127.0.0.1 -
2010-07-26 22:55:09 1 10.0.0.0 200 TCP_HIT 7268 866 GET http www.newyorklife.com 80 - - 188.8.131.52 - "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.1.4322; .NET CLR 2.0.50727; .NET CLR 3.0.04506.30; .NET CLR 3.0.04506.648; .NET CLR 3.0.4506.2152; .NET CLR 3.5.30729)" Financial%20Services - 127.0.0.1 -
The z000000 user ID is returned. This is the incorrect user ID, since the event occurred on 07-26, not 07-27, and the person updated their AV signatures approximately 30 minutes before this event happened.
Is there a setting within lookup definitions to match "enrichment" data only to events that occurred within a given number of seconds before or after the enrichment timestamp?
Thanks for the help,
I believe that you can specify negative values for max_offset_secs and min_offset_secs to restrict times to the past.
However, I'm not sure that this matters too much. I think that the documentation of these is bad because the use of "ahead" and "behind" is entirely ambiguous. I think the actual meaning (and default behavior) of these settings will work just fine for you. In other words, if the timestamp on the IP/user mapping is the time when it becomes effective, and you should use it until you see a more recent (newer) mapping (which would be the normal case), then it should work fine for you. I guess all you might need to do is set the min to some number less than zero to account for timestamp discrepancies?
I am about to file a bug on the ambiguity of the docs on this point.
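To make the intended matching rule concrete, here is a minimal Python sketch of "use the newest mapping that took effect at or before the event". It is not Splunk's implementation, only an illustration of the behaviour described above. The function name pick_user and the max_age and slack parameters are hypothetical stand-ins for the offset settings; the rows are copied from the question.

```python
from datetime import datetime, timedelta

FMT = "%Y-%m-%d %H:%M:%S"

# Rows copied from the enrichment data in the question: (clientuser, ip_address, savreportcheckin).
ROWS = [
    ("u000000", "10.0.0.0", "2010-07-26 22:24:00"),
    ("z000000", "10.0.0.0", "2010-07-27 22:23:00"),
]

def pick_user(event_time, ip, rows, max_age=timedelta(days=1), slack=timedelta(0)):
    """Return the user from the newest row for `ip` whose check-in is no more than
    `max_age` before the event and no more than `slack` after it (hypothetical
    parameters, loosely analogous to the max and min offsets)."""
    event = datetime.strptime(event_time, FMT)
    best = None
    for user, row_ip, checkin in rows:
        if row_ip != ip:
            continue
        seen = datetime.strptime(checkin, FMT)
        age = event - seen  # positive when the mapping is older than the event
        if -slack <= age <= max_age:
            if best is None or seen > best[0]:
                best = (seen, user)
    return best[1] if best else None

# The proxy event from the question, 31 minutes after the u000000 check-in.
print(pick_user("2010-07-26 22:55:09", "10.0.0.0", ROWS))
```

With these rows, the 07-27 mapping is rejected for the 07-26 event because it lies in the event's future, which is the outcome the question is asking for. A small non-zero slack would tolerate minor clock differences between the AV and proxy timestamps.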
I changed my settings to:
filename = av_lookup.csv
max_offset_secs = 86400
min_offset_secs = -86400
time_field = savreportcheckin
and I am unable to perform any lookups at all. I thought it might have been a time_format issue, so I changed my time_format to:
time_format = %Y-%m-%d %H:%M:%S
and it is still broken.
When I removed the min and max settings, the lookups started working again. Any thoughts on a setting that I might be missing or have wrong?
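One way to rule out the format string itself is to check it against the sample savreportcheckin values outside Splunk. The sketch below uses Python's datetime.strptime, which is only an approximation of Splunk's own time parsing, although these particular directives (%Y, %m, %d, %H, %M, %S) are standard strptime ones. The helper name parses is hypothetical.

```python
from datetime import datetime

TIME_FORMAT = "%Y-%m-%d %H:%M:%S"

# Check-in values copied from the enrichment data in the question.
SAMPLES = ["2010-07-26 22:24:00", "2010-07-27 22:23:00"]

def parses(value, fmt=TIME_FORMAT):
    """Return True if `value` matches `fmt` exactly, False otherwise."""
    try:
        datetime.strptime(value, fmt)
        return True
    except ValueError:
        return False

for sample in SAMPLES:
    print(sample, parses(sample))
```

If the samples parse here, the format string is probably not what breaks the lookup, and the offset settings are the more likely place to look.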