I know it's easy to get ingestion metrics (how much data was ingested) by sourcetype or index by searching index=_internal source=*metrics.log.
Unfortunately, we have a number of different services that log to the same indexes and sourcetypes, and we now want to calculate ingestion per service based on a specific field; let's call it service. Something like this would do the trick:
index=foo earliest=-1d@d latest=@d | eval bytes=len(_raw) | stats sum(bytes) by service
This is very slow, though (we ingest more than 1 TB per day). Is there a more elegant and faster way to achieve this?
Well thanks, if summary indexing is the only or best solution, then summary indexing it will be. 🙂
I'll go with a 5m summary index as the finest-grained source and may build a 1d summary on top of that. This is fast enough that I can also do the stats sum by some additional fields that may help us with future analysis.
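For anyone finding this later, here is a rough sketch of what the 5m populating search could look like. It assumes the search is scheduled every 5 minutes and that a summary index exists; the name summary_ingest is made up for illustration, as are the output field names bytes and events. Note that len(_raw) counts characters rather than bytes, so it only approximates ingested volume.

```spl
index=foo earliest=-5m@m latest=@m
| eval bytes=len(_raw)
| bin _time span=5m
| stats sum(bytes) AS bytes count AS events BY _time service
| collect index=summary_ingest
```

The daily numbers can then be read from the much smaller summary instead of the raw events, for example with index=summary_ingest earliest=-1d@d latest=@d | stats sum(bytes) AS bytes BY service. If events arrive late, shifting the window back a few minutes (say earliest=-10m@m latest=-5m@m) may be worth considering so the summary does not miss them.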