I'm trying to work out a design for metric indexing with the following constraints:
- keep the original raw data
- availability of the metrics (a delay of 15/30 min is OK)
- a high number of indexes and TBs of data per day
- a lot of data manipulation to align metric names and formats (factor in the volume)
- high search complexity (across many indexes...)
In that case, what do you suggest? The options I have in mind are not really good:
- lookup: given the volume, the data manipulation, and the search complexity, good performance is not guaranteed
- Kafka: adds design complexity (ms, infra...) and implies rewriting the current transformation rules
- transformation at index time: it's not recommended, and it conflicts with the need to keep the original raw data
- reindexing: writing the data to new indexes will duplicate the cost (infra, but Splunk license too?) and increase the delay before the metrics are available
Thanks in advance for your help and enjoy your weekend 🙂
Ok and thanks for the advice.
Does it increase the cost, since the mcollect command will "convert events into metric data to be stored in a metric index on the search head"?
Someone has suggested that I use ES + data model / SIEM to do the job, but I'm not sure it will meet my expectations. From my understanding, it's meant more for metric analytics than for cleaning and formatting metrics. What do you think?
ES doesn't really use metrics. You could build your own accelerated data model and use it for custom dashboards, but ES requires an additional license. If you don't already have ES running, you don't need it just for some metrics.
Splunk Enterprise also lets you build data models and accelerate them. Docs: About data models