About ranurag

ranurag · 09-30-2019

We have a data model with the following fields:

Source  IpAddress  FileName  FileVersion   Flag   _time
S1      IP1        File1     FileVersion1  Flag1  _time1
S1      IP1        File1     FileVersion1  Flag2  _time2
S1      IP1        File1     FileVersion1  Flag3  _time3
S1      IP1        File1     FileVersion1  Flag4  _time4

The data contains more than 10 million FileVersion values. Assuming 2 Flags for each, that gives ~20 million events in the data model.

The requirement is to get the latest Flag for each FileVersion and then show a count of FileVersions by Flag. The output looks something like this:

Flag   Count   Other columns
Flag1  11,232  ...
Flag2  67,764  ...
...

We are using a query similar to this (execution time ~600 sec):

    | tstats latest(Flag) as Flag where datamodel=xxx by Source, IpAddress, FileName, FileVersion
    | stats count by Flag, Source, IpAddress, FileName

The problem is that tstats takes a long time because of the high data cardinality. We even tried prestats="t", but it does not help much (~10% performance increase).

Another caveat is that a new Flag for a FileVersion can arrive at any time, and we need to show the counts based on the latest Flag, so creating a summary index is not feasible (we would have to run the summary-index-generating search very frequently and scan the full index).

Is there any way to improve the performance of the query, or a better way to achieve the requirement?
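To make the requirement concrete, here is a minimal Python sketch of the intended logic. It is not the Splunk query itself and says nothing about performance; it only shows "latest Flag per FileVersion, then count by Flag". The sample events, the version names (V1 to V3), the integer timestamps, and the function name `latest_flag_counts` are all made up for illustration.

```python
from collections import Counter

# Hypothetical sample events (assumed values, not real data):
# (Source, IpAddress, FileName, FileVersion, Flag, _time)
events = [
    ("S1", "IP1", "File1", "V1", "Flag1", 1),
    ("S1", "IP1", "File1", "V1", "Flag2", 2),
    ("S1", "IP1", "File1", "V2", "Flag1", 3),
    ("S1", "IP1", "File1", "V3", "Flag2", 1),
    ("S1", "IP1", "File1", "V3", "Flag1", 4),
]


def latest_flag_counts(events):
    """Keep the most recent Flag per FileVersion, then count versions by Flag."""
    latest = {}  # (Source, IpAddress, FileName, FileVersion) -> (_time, Flag)
    for source, ip, name, version, flag, t in events:
        key = (source, ip, name, version)
        # Overwrite only when this event is newer than the one stored so far.
        if key not in latest or t > latest[key][0]:
            latest[key] = (t, flag)

    # Second step: drop FileVersion from the key and count by Flag.
    counts = Counter()
    for (source, ip, name, _version), (_t, flag) in latest.items():
        counts[(flag, source, ip, name)] += 1
    return counts


counts = latest_flag_counts(events)
for (flag, source, ip, name), n in sorted(counts.items()):
    print(flag, source, ip, name, n)
```

With this sample, V1 ends up under Flag2 and V2 and V3 under Flag1, so a version is counted only under its most recent Flag, which is the behaviour the dashboard needs.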

ranurag · 09-12-2019

We have an accelerated data model on Splunk Enterprise for which the scheduled searches are getting skipped. The scheduler logs show that the searches are skipped because of concurrency limits.

Query:

    index=_internal sourcetype=scheduler savedsearch_name=*_ACCELERATE_DM_* app="app-name"

Result:

    search_type="datamodel_acceleration", user="nobody", app="app-name", savedsearch_name="ACCELERATE_DM_app_name_data_model_name.object_name_ACCELERATE",priority=default, status=skipped, reason="The maximum number of concurrent historical scheduled searches on this instance has been reached", concurrency_category="historical_scheduled", concurrency_context="saved-search_instance-wide", concurrency_limit=5, scheduled_time=1568278800, window_time=0

We faced a similar issue earlier with scheduled saved searches and fixed it by assigning "admin" as the owner of the saved searches, increasing the concurrency limits for "admin", and running the saved searches as "owner". For data model acceleration we have set the owner to "admin", but the search still runs as "nobody".

Is there a way to ensure that data model acceleration searches run as a particular user rather than "nobody"?

The data model acceleration does complete even though searches get skipped, and there is no problem with the data, but we would like to avoid the skipped searches.

Posts	2
Solutions	0
Karma Given	1
Karma Received	1
Member Since	07-04-2019

Online Status	Offline
Date Last Visited	06-05-2020 02:04 AM

Is there a better way to improve performance using...

Run data model acceleration search as user instead...
