I have a very simple process for monitoring monthly ETL jobs, so I normally get only one log file each month. That is, until something goes wrong and I get more than one (reruns, bug fixes, etc.).
For my dashboard I need to keep only the last log file from each month and build my reporting logic on those source files only. My first instinct was to do something like:
index=my_process_name process="my_process" data_type="stats" table_name="my_table"
| dedup year_month_code, currency
| chart max(ammount) by year_month_code, currency
(year_month_code is YYYY-MM, extracted from _time.)
This works as long as all currency codes match between the different logs in a given month. For example, if the first log for June has a row with currency=EUR and the latest June log does not, dedup still returns the EUR row from the earlier log, which is something I want to avoid.
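To make the problem concrete, here is a small reproduction with made-up data. The file names, amounts, and the month 2023-06 are all hypothetical and only stand in for my real logs; I use makeresults so it can be run without my index:

```
| makeresults count=3
| streamstats count AS n
``` hypothetical data: run 1 has EUR and USD rows, run 2 (the rerun) has only USD ```
| eval source=case(n==1, "june_run1.log", n==2, "june_run1.log", n==3, "june_run2.log")
| eval currency=case(n==1, "EUR", n==2, "USD", n==3, "USD")
| eval ammount=case(n==1, 100, n==2, 200, n==3, 250)
| eval year_month_code="2023-06"
``` newest first, to mimic the default order of search results ```
| sort - n
| dedup year_month_code, currency
| table year_month_code, source, currency, ammount
```

Here dedup keeps the USD row from june_run2.log, but also the EUR row from june_run1.log, because no newer event with that currency exists. What I want instead is only the rows from june_run2.log.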
The above is just an example. I need access to all fields after the filtering so that I can run arbitrary analyses on the result.
Thanks for your help