Does it make sense to turn on data model acceleration for the Incident Management data model (the default summary range is "None")? My concern is the Expired Entity Activity search in Splunk Enterprise Security (ES). I enabled acceleration for this model, and the build took a very long time to complete. After it showed 100% complete, the size of the model was only 0.5 MB. One of our longest-running and most frequently skipped searches is the correlation search that looks for expired user activity from this model.
When I run the | datamodel ... search manually, I see it iterating through thousands of events. I expected the acceleration to do that work in the background, so that my correlation search would return immediately (since there aren't any matches). Is this a case where the search falls back to _raw because the data model is empty? Would accelerating the correlation search be better than accelerating the data model?
The main question is "why is the correlation search showing that it is searching through a bunch of events when the data model is essentially empty?" This is an accelerated search, so it should be searching the accelerated index. Is it due to the fact that the correlation search looks to "future" events (+5m@m) that are not in the accelerated index?
The default search for this correlation uses | datamodel, which does not read accelerated data.
| tstats would look at both accelerated and raw data.
| tstats summariesonly=true would look at accelerated data only.
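For comparison, here is a rough sketch of the three forms side by side. The data model and dataset names (Identity_Management, Expired_Identity_Activity) are assumptions based on a typical ES install, and the tstats forms assume Expired_Identity_Activity is a root dataset; check the names in your own environment before running anything. In order: the | datamodel form, which scans events and ignores summaries; plain | tstats, which reads summaries where they exist and falls back to raw events elsewhere; and | tstats summariesonly=true, which reads summaries only.

```
| datamodel Identity_Management Expired_Identity_Activity search
| stats count

| tstats count from datamodel=Identity_Management.Expired_Identity_Activity

| tstats summariesonly=true count from datamodel=Identity_Management.Expired_Identity_Activity
```

If the third search returns quickly with a count of 0 while the first one grinds through thousands of events, that would be consistent with the summary being (almost) empty and the | datamodel search not using it at all.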
The datamodel you are referring to is actually looking mostly at the CSV files that we have built off of your source assets and identities files.
We are looking for Expired Identity Accounts that have activity. So in reality the search would need to look at all events coming in.
I don't think this DM is even a candidate for acceleration based on what I know.
As per jwelch's comment, have you tried the "Change to scheduled" option on the correlation search's row?
Activity from Expired User Identity ... 15:20:00 EDT | Enabled | Disable | Change to scheduled
Perhaps a scheduled search would be more efficient?
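As a hedged sketch of what a scheduled version could look like in savedsearches.conf: the stanza name, the cron interval, and the time window below are all assumptions for illustration, not values taken from your system. The setting names themselves (enableSched, cron_schedule, dispatch.earliest_time, dispatch.latest_time) are standard savedsearches.conf keys.

```
# local/savedsearches.conf in the app that owns the search
# ASSUMPTION: the stanza name is a guess; copy the exact name from your own config.
[Identity - Activity from Expired User Identity - Rule]
# Run on a schedule instead of as a real-time search.
enableSched = 1
# ASSUMPTION: illustrative 5-minute interval; tune to your environment.
cron_schedule = */5 * * * *
# Non-real-time window (no "rt" prefix), matching the schedule interval.
dispatch.earliest_time = -5m@m
dispatch.latest_time = now
```

A real-time search holds a search slot open continuously, whereas a scheduled one only runs at each interval, so this may reduce the skipped-search count, at the cost of up to one interval of detection delay.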
As I currently understand it, an accelerated data model works by searching through the original data to build summaries; once those exist, you can use tstats to read from the accelerated data model.
In the ES version I am running, the macro used by this data model does an index=notable, and that index is tiny, so the runtime is minimal!
However, I just realised that the search you're referring to uses the Identity Management data model, also known as the Assets And Identities data model. Two of the datasets in it are lookups, and one of them searches without any index limits. Perhaps you could add index= to it to limit which indexes it searches and reduce the impact?
Is this search set like so?
Start time: rt-5m@m
End time: rt+5m@m
Or do you see this:
Activity from Expired User Identity Correlation Search SA-IdentityManagement 2017-03-16 15:20:00 EDT Enabled | Disable | Change to scheduled
And what versions of Splunk Core and ES are you running? Also, is this a search head cluster (SHC)?