I've read the docs in the Splunk manual on parse-time indexed fields: http://docs.splunk.com/Documentation/Splunk/6.1.3/Data/Configureindex-timefieldextraction
But I still have a question. We're going to be searching 15 months' worth of authentication data to see whether users have logged in within the previous 15 months. We'll have to run this search for 700,000 different user IDs, so the speed of each individual search is very important.
We've already decided to create a summary index that extracts the auth information from the main LDAP and Active Directory logs and creates a new, reduced data set. However, I'm still concerned that searching 15 months' worth of data will take a LONG time when repeated 700,000 times. For example, if each search requires an average of 0.5 seconds, the full run will take about 4 days (700,000 × 0.5 s = 350,000 s).
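To make that estimate easy to redo with other numbers, here is a quick back-of-the-envelope sketch. The 0.5 second figure is only my assumed average, not a measurement, and the function name is just something I made up for illustration. It also assumes the searches run one after another rather than in parallel.

```python
# Rough estimate only: 0.5 s per search is an assumed average, not a measured value.
SECONDS_PER_DAY = 24 * 60 * 60


def total_days(num_searches: int, seconds_per_search: float) -> float:
    """Wall-clock days needed if the searches run strictly one after another."""
    return num_searches * seconds_per_search / SECONDS_PER_DAY


estimate = total_days(700_000, 0.5)  # 350,000 seconds in total
print(f"{estimate:.2f} days")
```

Halving the per-search time halves the total, which is why I care so much about the cost of a single search.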
I'm wondering whether creating an index-time field for the user ID would speed things up dramatically. This is what we'd do with a database table, but I'm not sure if "indexed" means the same thing in Splunk.
Basically, each search would need to go back and look for the first successful auth event for its user ID, and could stop there. Unfortunately, we expect a significant number of these searches to find nothing, and each of those would have to scan the entire data set.
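Here is a small sketch of the behaviour I'm worried about, written as a plain linear scan in Python rather than as an actual Splunk search. The event layout, the sample data, and the function name are all hypothetical. It only illustrates that a search with a match can exit early, while a search with no match has to examine every event.

```python
from typing import Iterable, Optional, Tuple

# Hypothetical event layout: (timestamp, user_id, success)
Event = Tuple[int, str, bool]


def first_successful_auth(
    events: Iterable[Event], user_id: str
) -> Tuple[Optional[int], int]:
    """Return (timestamp of the first successful auth or None, events examined)."""
    examined = 0
    for timestamp, event_user, success in events:
        examined += 1
        if event_user == user_id and success:
            return timestamp, examined  # early exit on the first match
    return None, examined  # no match: every event was examined


# Made-up sample data, ordered by timestamp.
events = [
    (1, "alice", False),
    (2, "alice", True),
    (3, "bob", True),
    (4, "carol", False),
]

hit = first_successful_auth(events, "alice")  # stops at the second event
miss = first_successful_auth(events, "dave")  # scans all four events
print(hit, miss)
```

Without some kind of index on the user ID, every "miss" costs a full pass over the data, and that is the case I expect to hit a lot.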
Does this sound like a good use case for creating an index-time field?
Thanks,
Andrew