We are attempting to replicate ArcSight's 'active list' functionality in Splunk. Is there a straightforward means of populating a lookup table with real-time search results? We would also like the lookup table entries to expire after a given amount of time.
You can set up a scheduled search that runs at an interval and overwrites the lookup table file with its results.
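A minimal sketch of such a scheduled search (the index, sourcetype, and field names here are placeholders, not from any real deployment):

```
index=security sourcetype=auth action=failure
| stats latest(_time) as last_seen by src_ip, user_name
| outputlookup mylist.csv
```

Each run replaces mylist.csv wholesale with the current results, so the list stays as fresh as the search's time range.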
That is a great suggestion.
Since the data in the tables would be coming from a real-time search I think we might be able to achieve the results we are looking for by simply throttling the alert. I'm not an ArcSight content expert but it seems to me that throttling in ArcSight is facilitated by populating these tables and setting an expiration period.
These days I miss ArcSight's active lists and rules constantly running in memory. The general approach would be to add a date field set to now() to any data you are going to add to the list, and then at the end of your query, before you do outputlookup, add a where clause comparing that date field against your TTL (greater than or less than, depending on how you write that part).
Active Lists (aka most recent event) with a TTL of 1 day
Schedule the search to run every 5/10/x minutes.
... | eval last_seen = now() | inputlookup append=t mylist.csv | table src_ip user_name last_seen | stats max(last_seen) as last_seen by src_ip user_name | eval ttl = 3600 * 24 | where (now() - last_seen) < ttl | table src_ip user_name last_seen | outputlookup mylist.csv
If you want to have more of a Session List you could remove the stats max() part, but then you also wouldn't really be removing items from the list /shrug. At any rate, you now have one search that adds items to the list, keeps it current, and removes stale items.
We used this today so thanks for this answer!
One quick note if you use this: be sure to supply field names after the "by" in the stats max() section. We used the names of the fields that we're putting into the CSV.
In the end, though, we saved this version for any searches that run through a lot of data, and went simpler: we just use the time picker for the last 90 days on our base search (which runs twice daily) and then overwrite the CSV (instead of appending, as we'd done before). The search we're running isn't that intensive, resource-wise. If we see resource issues, we'll use this version.
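For reference, the simpler overwrite-only variant looks roughly like this (index, sourcetype, and field names are placeholders, and earliest=-90d stands in for the 90-day time picker):

```
index=security sourcetype=auth earliest=-90d
| stats latest(_time) as last_seen by src_ip, user_name
| outputlookup mylist.csv
```

Since the time range itself bounds the data, overwriting the CSV handles expiry for free; no inputlookup append or TTL math is needed.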