We have several summary searches that collect data into metric indexes. They run nightly, and some of them create quite a large number of events (~100k). As a result we sometimes see warnings that the metric indexes cannot be optimized fast enough. A typical query looks like:

index=uhdbox sourcetype="tvclients:log:analytics" name="app*" name="*Play*" OR name="*Open*" earliest=-1d@d+3h latest=-0d@d+3h
| bin _time AS day span=24h aligntime=@d+3h
| stats count as eventCount earliest(_time) as _time by day, eventName, releaseTrack, partnerId, deviceId
| fields - day
| mcollect index=uhdbox_summary_metrics split=true marker="name=UHD_AppsDetails, version=1.1.0" eventName, releaseTrack, partnerId, deviceId

The main contributor to the large number of events is the cardinality of deviceId (~100k), which is effectively a MAC address with a common prefix and a defined length. I could create 4 / 8 / 16 reports, each selecting a subset of deviceIds, and schedule them at different times, but maintaining those basically identical copies would be quite a burden. So... I wonder if there is a mechanism to shard the search results and feed them into many separate mcollects that are spaced apart by some delay. Something like:

index=uhdbox sourcetype="tvclients:log:analytics" name="app*" name="*Play*" OR name="*Open*" earliest=-1d@d+3h latest=-0d@d+3h
| shard by deviceId bins=10 sleep=60s
| stats count as eventCount earliest(_time) as _time by day, eventName, releaseTrack, partnerId, deviceId
| fields - day
| mcollect index=uhdbox_summary_metrics split=true marker="name=UHD_AppsDetails, version=1.1.0" eventName, releaseTrack, partnerId, deviceId

Maybe my pseudo code above is not so clear. What I would like to achieve is that instead of one huge mcollect I get 10 mcollects (each covering approximately 1/10th of the events), scheduled approximately 60s apart from each other.
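To make the intended grouping concrete: the shard number itself could be computed with standard eval functions (md5, substr, tonumber), so that each scheduled copy keeps only its own bucket. A sketch (the hash-modulo arithmetic is just my illustration, and this variant would still need one scheduled copy per shard, which is exactly the duplication I'd like to avoid):

index=uhdbox sourcetype="tvclients:log:analytics" name="app*" name="*Play*" OR name="*Open*" earliest=-1d@d+3h latest=-0d@d+3h
| eval shard = tonumber(substr(md5(deviceId), 1, 4), 16) % 10
| where shard = 0
| bin _time AS day span=24h aligntime=@d+3h
| stats count as eventCount earliest(_time) as _time by day, eventName, releaseTrack, partnerId, deviceId
| fields - day
| mcollect index=uhdbox_summary_metrics split=true marker="name=UHD_AppsDetails, version=1.1.0" eventName, releaseTrack, partnerId, deviceId

What I'm after is a way to get this per-shard behavior, plus the time spacing, from a single scheduled search.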