I have a panel which loads data for
last 3 months and it takes approx 120 secs to load the single panel value - showing the count of advanced users in percentage. Currently, we have implemented the summary index and data model to improve the search performance, but still the query takes approx 45 seconds to show the value in the panel. Need your suggestions on further improving the performance of the query. Below are the details of the same :
1) Following saved search is running
every 5 minutes collecting
last 5 minutes of data into summary index. Please note that we need to show value of last 3 months.
index=box event_type!="*login*" |fields - raw|dedup event_id |table _time,created_by_login,event_type,source_item_name,event_id
2) Data model is created from the summary index.
3) Below is the query which is taking data from datamodel and loads the panel, but still it takes approx 45 seconds to load the value in the panel :
|tstats dc(event_type) as count from datamodel=box_activity by created_by_login |dedup created_by_login |replace "Unknown User" with "Shared Links" in created_by_login |rex field=created_by_login ".@(?<sourcedomain>[a-zA-Z0-9][\w\.-]*[a-zA-Z0-9]\.[a-zA-Z][a-zA-Z\.]*[a-zA-Z])" |search sourcedomain="dadomain.com" |rename created_by_login as User, event_type as Activity |where User!="BOX Admin" |sort - count|where count >=7| stats count|appendcols [search index=dctest sourcetype=box-users earliest=-24h latest=now|table S_No|search S_No!="S.No"|sort - S_No|head 1 |eval S_No=S_No] |eval Advanced=round((count*100)/S_No,0)|fields - count|table Advanced|rename Advanced as "%"
How can we further improve the search performance. Whether we can use
loadjob command here ? I think we can use it, but
loadjob will only load the static values based on constant time picker option. If we change the time picker range, then
loadjob will not work accordingly. Below link has some workaround for
loadjob issue, but I think this requires
_time to be part of the output in the above query. How can I add
_time to the above query ? Please help resolve this issue ?
OK, let's start completely over. First of all, realize that these 2 methods are 100% mutually-exclusive, but not incompatibly so.
summary index you are scheduled to run
Every 5 minutes for
The last 5 minutes. This is a TERRIBLE plan because typically, events take 2-3 minutes to get into splunk which means that the events that arrive 2-3 minutes late get missed forever. Instead, I would schedule your search to run
Every 5 minutes for
earliest=-10m latest=-5m and this is assuming that your event latency is typical (you should measure it with
| eval latency = _index_time - _time | stats avg(latency) max(latency) BY sourcetype to be sure. That should cover the
summary index use case.
accelerated data model, you have not shown us your datamodel definition so we really cannot help there accept to say that the
summary index option should work whether you are doing this option or not. We can optimize your exploitation search a bit, though; try this:
|tstats summariesonly=t dc(event_type) AS count FROM datamodel=box_activity WHERE index=* BY created_by_login | rex field=created_by_login ".@(?<sourcedomain>[a-zA-Z0-9][\w\.-]*[a-zA-Z0-9]\.[a-zA-Z][a-zA-Z\.]*[a-zA-Z])" | search sourcedomain = "dadomain.com" | where count >= 7 AND created_by_login != "BOX Admin" | stats count | appendcols [search index=dctest sourcetype=box-users earliest=-24h latest=now | search S_No!="S.No" | sort - S_No | head 1 | table S_No] | eval Advanced = round((count * 100) / S_No, 0) | table Advanced | rename Advanced as "%"
Sure. I will change the schedule for my saved search, per your suggestion.
Regarding datamodel definition, I have uploaded the images in my above comments, as I cannot directly upload image here. Please have a look at those images, to understand my datamodel definition.
I will try to run the modified query to check, whether there is any improvement in search performance.
You have your bases covered here. If you need to have the
timepicker and not a static panel, then
loadjob will not work. Did you accelerate your datamodel and try adding
summariesonly=true? If not, then try that. Your only other option beyond this to speed this up is to convert to a
metrics index; see here:
Somehow our discussed old comments got deleted. Not sure what happened, I cannot see my old comments !!!
I asked some questions :
You are confusing the 2 questions. All of our comments are in the old question. I am resetting here.
1: If you are using summary index or accelerated data model, then TTL makes no difference; that is only if we are using
3: See #4.
| tstats summariesonly=t ... or switch to metrics store.
I accelerated the datamodel and tried using summariesonly=true, but it is not returning any events and any values now ? returning zero. What can be the issue here ?
|tstats summariesonly=true dc(event_type) as count from datamodel=box_activity by created_by_login |dedup created_by_login |replace "Unknown User" with "Shared Links" in created_by_login |rex field=created_by_login ".@(?<sourcedomain>[a-zA-Z0-9][\w\.-]*[a-zA-Z0-9]\.[a-zA-Z][a-zA-Z\.]*[a-zA-Z])" |search sourcedomain="dadomain.com" |rename created_by_login as User, event_type as Activity |where User!="BOX Admin" |sort - count|where count >=7| stats count|appendcols [search index=dctest sourcetype=box-users earliest=-6h latest=now|table S_No|search S_No!="S.No"|sort - S_No|head 1 |eval S_No=S_No+13 ] |eval Advanced=round((count*100)/S_No,0)|fields - count|table Advanced|rename Advanced as "%"
When I click on datamodel. In the Acceleration status it shows as 100% completed. Then I am using the same query with summariesonly=true, but still I am not getting any events. The output of the query is zero.
while enabling acceleration, I gave last 3 months in the acceleration time.
I think there is some issue in my summary index configuration and datamodel, which I am not able to figure it out. Now, I have disabled "Acceleration", but still I am not able to get the events, zero result. Can you go through my question once again and suggest whether my configuration sequence is proper ? I have configured the summary index for the first time, hence not sure if something is wrong there ?
My Scenario - I want to show %value of users whose event_type is more than 12 activities such as download, upload, preview etc. For that I need 2 fields created_by_name and event_type. So to start with :
Please advise, as I am still confused on my sequence which I have put into question ?
Images of datamodel are uploaded at below links
Are you certain that the data range of the query you are using as a test falls within the date/time range of your acceleration? If you query outside of that range then your query will be as slow as if it was not accelerated.
Lastly, I already tried accelerating the datamodel, but still the query runs slow and takes time to load the result in the panel and also, after accelerating, it is not showing proper value.. showing reduced value in the panel.
Yes, accelerating the datamodel means that you will be running a few minutes behind because it takes time to accelerate. That is why most people use
earliest=-65m latest=-5m when doing an hourly report (shifted back 5 minutes).