We are Using Hunk 6.3 in my application which connects to 3 node Cassandra cluster to build dashboard charts through Virtual Index.
From the Hunk documentation we learnt that Querying and filtering Operation will be done in the nodes we retrive data
from cassandra.
In that case in order to scale up my Hunk to accomodate to 1000 concurrent users search per second what needs to be done
from Hunk side regardless of increasing cassandra nodes.
You can optimize it on several fronts
1) Hardware - Make sure you have enough cores on your Search Heads
2) Splunk Settings - Enable Saved Searches and dashboards, also store info in a Summary Index. All these options will allow you to prevent your users from running long queries against Cassandra and get the data from the local cached data.
3) Access Control - Allow only select users access (or edit) the Cassandra VIX
4) Optimize these Cassandra ERP settings - collect only the data you need
vix.cassandra.cql.cmd
vix.cassandra.datetime.field
vix.cassandra.datetime.format
vix.cassandra.max.days.hence
You can optimize it on several fronts
1) Hardware - Make sure you have enough cores on your Search Heads
2) Splunk Settings - Enable Saved Searches and dashboards, also store info in a Summary Index. All these options will allow you to prevent your users from running long queries against Cassandra and get the data from the local cached data.
3) Access Control - Allow only select users access (or edit) the Cassandra VIX
4) Optimize these Cassandra ERP settings - collect only the data you need
vix.cassandra.cql.cmd
vix.cassandra.datetime.field
vix.cassandra.datetime.format
vix.cassandra.max.days.hence
Is there any limitations on Summary Index with respect to memory or storage? because our One hour data might be in few Gb's and the we require to hold 48 hours record in Summary Index to avoid concurrent user hits to DB. Is there any way we can optimize this?
From a hardware point of view - There is nothing fundamentally different about a summary index vs a regular index.