Solved: How to optimize the performance of my search reporting on how much data was indexed for arbitrary fields?

knielsen · 10-26-2015

Hello,

I know it's straightforward to get ingestion metrics (how much data was ingested) by sourcetype or index by searching index=_internal source=*metrics.log

Unfortunately, we do have a bunch of different services that log to the same indexes and sourcetypes, but now we want to calculate their ingestion based on a specific field, let's call it service. So something like this would do the trick:

index=foo earliest=-1d@d latest=@d | eval bytes=len(_raw) | stats sum(bytes) by service

This is very slow, though (we ingest more than 1 TB per day). Is there a more elegant and faster way to achieve this?

Regards,
Kai.

woodcock · 10-26-2015

You could set this up as an hourly summary index.
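
For illustration, a minimal sketch of what an hourly summary could look like, assuming a scheduled search that runs once per hour and a summary index named summary_ingest (a hypothetical name; the index has to exist before the search runs). The field name bytes comes from the original search, and len(_raw) counts characters, so the totals only approximate bytes on disk or license usage.

```spl
index=foo earliest=-1h@h latest=@h
| eval bytes=len(_raw)
| bin _time span=1h
| stats sum(bytes) AS bytes BY _time, service
| collect index=summary_ingest
```

The daily report would then read the small summary instead of the raw events:

```spl
index=summary_ingest earliest=-1d@d latest=@d
| stats sum(bytes) AS bytes BY service
```

Because the per-hour sums are additive, summing them again over the day should give the same totals as the slow search over raw data, as long as every hourly run completed.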

knielsen · 10-28-2015

Well thanks, if summary indexing is the only or best solution, then summary indexing it will be. 🙂

I'll go with a 5-minute summary index as the finest-grained source and may build a daily summary on top of it. The 5-minute search is fast enough that I can also split the stats sum by some additional fields that may help us with future analysis.
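
A rough sketch of that two-tier setup, assuming a search scheduled every 5 minutes, hypothetical summary index names summary_ingest_5m and summary_ingest_1d, and host and sourcetype standing in for the "additional fields" (the post does not say which fields were actually used). In practice the time window may need to be shifted back a few minutes to allow for indexing lag.

```spl
index=foo earliest=-5m@m latest=@m
| eval bytes=len(_raw)
| bin _time span=5m
| stats sum(bytes) AS bytes BY _time, service, host, sourcetype
| collect index=summary_ingest_5m
```

The daily roll-up, scheduled once per day, would then aggregate the 5-minute rows rather than the raw events:

```spl
index=summary_ingest_5m earliest=-1d@d latest=@d
| bin _time span=1d
| stats sum(bytes) AS bytes BY _time, service
| collect index=summary_ingest_1d
```

Keeping the extra fields only in the 5-minute tier leaves the daily summary small while still allowing finer breakdowns later.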
