The following search gives me a table that contains the number of lines of code on the first of each month and calculates the number of lines changed over the month:
index=foo date_mday="1"
| dedup lineCount sortby +_time
| sort -_time
| delta lineCount AS linesChanged
| fieldformat linesChanged=linesChanged*-1
| table _time date_month lineCount linesChanged
This works great as long as there is a log generated on the 1st. I'm looking to add a condition that will add the log from that last day that one was generated if there was none on the 1st of the month.
For example, there is no data for June because nothing was generated on the 1st (see attached picture). The last day that a a log was generated was May 30th, which I would like to include.
I know that date_mday=1 filter will have to be dropped but don't know where to start otherwise.
Any help would be much appreciated!
Hi @dfraseman,
Actually that search is comparing the max linesChanged values during every month with the previous month max values, which is not your need.
Please try below, you may add some sorting;
index=foo
| stats first(_time) as _time min(date_mday) as firstDay latest(date_mday) as lastDay earliest(lineCount) as firstlineCount latest(lineCount) as lastlineCount by date_month
| autoregress lastlineCount p=1
| eval lineCount=if(firstDay==1,firstlineCount,lastlineCount_p1)
| delta lineCount AS linesChanged
| convert timeformat="%b" ctime(_time)
| table _time lineCount linesChanged
Hi @dfraseman,
Can you try below?
index=foo
| timechart span=1mon@mon max(lineCount) as lineCount
| delta lineCount AS linesChanged
| convert timeformat="%b" ctime(_time)
| table _time lineCount linesChanged
Hi @scelikok
Thanks for the reply. I gave it a try and it looks like exactly what ill need format wise, but after some manual checking of the linesChanged it looks like they are off.
The logs are generated early in the morning for the previous days activity and therefore when calculating the difference it has to be done between the 1sts of the month. Correct me if I'm wrong but the span in your solution looks at at the timestamps from the earliest day to the last day of the same month and therefore wouldn't account for the lines changed on the last day of the prior month that gets logged the next day.
Hi @dfraseman,
Actually that search is comparing the max linesChanged values during every month with the previous month max values, which is not your need.
Please try below, you may add some sorting;
index=foo
| stats first(_time) as _time min(date_mday) as firstDay latest(date_mday) as lastDay earliest(lineCount) as firstlineCount latest(lineCount) as lastlineCount by date_month
| autoregress lastlineCount p=1
| eval lineCount=if(firstDay==1,firstlineCount,lastlineCount_p1)
| delta lineCount AS linesChanged
| convert timeformat="%b" ctime(_time)
| table _time lineCount linesChanged
Thanks @scelikok Works like a charm! Just needed to add ascending sort on _time and a new field to show the month prior as that is really the month that the counts of lines changed is for:
index=foo
| stats first(_time) as _time min(date_mday) as firstDay latest(date_mday) as lastDay earliest(lineCount) as firstlineCount latest(lineCount) as lastlineCount by date_month
| sort +_time
| autoregress lastlineCount p=1
| eval lineCount=if(firstDay==1,firstlineCount,lastlineCount_p1)
| eval lastmonth=strftime(relative_time(_time,"-1mon"),"%b")
| delta lineCount AS linesChanged
| convert timeformat="%b" ctime(_time)
| table lastmonth lineCount linesChanged
This might be of some help. The source for these logs are text files and change every month, for example:
source="/.../2020-12.log" (December)
source="/.../2021-01.log" (January)
So I fumbled my way to a search that produces the lineCount from the earliest and latest log that was generated from each source AKA each month:
index=foo
| stats earliest(lineCount) as lineCount earliest(_time) as eventTime by source date_month
| append [ search index=foo | stats latest(lineCount) as lineCount latest(_time) as eventTime by source date_month ]
| sort -eventTime
| fieldformat eventTime=strftime(eventTime,"%Y-%m-%d %H:%M:%S")
| table source date_month eventTime lineCount
So now I think it is just a matter of performing a conditional delta calculation based on the eventTime.
See screenshot for output of query above.