Our Application does a nightly re-index on node 1, once thats complete, the index build is copied to 6 other nodes, Each of the other nodes then restore the files. These entries are noted as "Index restore started." and "Index restore complete." in the application logs. I would like to have a dashboard panel that shows how long it takes from "started" to "complete" on each host to be able to see trends for this over time. How to go about this?
--- search
| eval starttime=if(match(_raw,"Index restore started"),_time,null)
| eval completetime=if(match(_raw,"Index restore complete"),_time,null)
| stats latest(starttime) as starttime latest(completetime) as completetime by host
thanks, can we calculate the number of minutes as a column? Rather than giving the result in hours?
Can you share anonymised sample log entries so we can see what you are dealing with?
Thanks for the reply.
Here is the Index Start Event:
5/4/21
2:43:26.502 AM
2021-05-04 02:43:26,502-0400 NodeReindexServiceThread:thread-0 INFO [c.a.j.index.ha.DefaultIndexCopyService] Index restore started. Total 3561438 issues on instance before loading Snapshot file: IndexSnapshot_39801.tar.sz
host = hostname
source = application.log
The Following is the Index Completed Event , inned the duration from the time it started to when its complete by the hostname please.
5/3/21
1:08:10.843 AM
2021-05-03 01:08:10,843-0400 NodeReindexServiceThread:thread-0 INFO [c.a.j.index.ha.DefaultIndexCopyService] Index restore complete. Total 3653227 issues on instance
host = hostname
source = application.log
--- search
| eval starttime=if(match(_raw,"Index restore started"),_time,null)
| eval completetime=if(match(_raw,"Index restore complete"),_time,null)
| stats latest(starttime) as starttime latest(completetime) as completetime by host
Thanks i tried the query and got the following, how to make it readable?
host starttime completetime
Hostname | 1620017696.169 | 1620018391.676 |
| fieldformat starttime=strftime(starttime,"%Y-%m-%d %H:%M:%S")
| fieldformat completetime=strftime(completetime,"%Y-%m-%d %H:%M:%S")
| eval duration=tostring(completetime-starttime,"duration")
| eval completeminutes=floor((completetime-starttime)/60)
Based on the range and the search, it looks like this takes the last 7 days and finds the latest start and complete times and uses those. What we really want to see is how long it takes each day so that we can compare day over day. As we are seeing that reindexing is taking a long time lately, we would probably then choose a range of 30 days or even longer to get a comparison.
--- search
| bin span=1d _time as day
| eval starttime=if(match(_raw,"Index restore started"),_time,null)
| eval completetime=if(match(_raw,"Index restore complete"),_time,null)
| stats latest(starttime) as starttime latest(completetime) as completetime by day host
| fieldformat day=strftime(day,"%Y-%m-%d")
| eval duration=tostring(completetime-starttime,"duration")
| eval completeminutes=floor((completetime-starttime)/60)
Thanks but the output is like the following...
day host starttime completetime
2021-05-05 hostname 1620228103.281
2021-05-05 hostname 1620228045.815
2021-05-05 hostname 1620228474.435
2021-05-05 hostname 1620228646.272
2021-05-05 hostname 1620228205.655
2021-05-05 hostname1620227386.055
2021-05-09 hostname1620611884.843
2021-05-10 hostname1620637757.090
Given that you only have one time (apart from the day), either starttime or completetime is not being evaluated correctly. It is not possible to tell from the example you pasted. If you can't tell, try changing the order of the fields e.g. day starttime host completetime to see which is missing and then work out why.
The search query that i use is the following:
host=hostname* "Index restore*" | eval starttime=if(match(_raw,"Index restore started"),_time,null)
| eval completetime=if(match(_raw,"Index restore complete"),_time,null)
| stats latest(starttime) as starttime latest(completetime) as completetime by host | fieldformat starttime=strftime(starttime,"%Y-%m-%d %H:%M:%S")
| fieldformat completetime=strftime(completetime,"%Y-%m-%d %H:%M:%S") | eval duration=tostring(completetime-starttime,"duration")
| eval completeminutes=floor((completetime-starttime)/60)
When i select last 30 days:
it gives the latest log entry and the entry from a month ago.
host starttime completetime completeminutes duration
hostname 2021-05-09 21:58:04 2021-05-10 05:09:17 431 07:11:12.247
hostname 2021-06-06 12:42:21 2021-06-06 12:50:10 7 00:07:49.396
hostname 2021-06-06 12:42:20 2021-06-06 12:50:05 7 00:07:45.268
hostname 2021-06-06 12:42:22 2021-06-06 12:50:25 8 00:08:03.316
hostname 2021-06-06 12:42:22 2021-06-06 12:51:01 8 00:08:38.956
hostname 2021-06-06 12:42:22 2021-06-06 12:49:58 7 00:07:35.255
hostname 2021-06-06 12:42:23 2021-06-06 12:49:50 7 00:07:27.548
The search query though gives the log entry from every week:
host=hostname* "Index restore*"
Have the entries from
May 9th
May 15th
May 23rd
May 30th
June 6th
Why am i not getting this in the statistics but get only the latest log entry?
You are not getting the other days because you didn't include the bin line to calculate the day and you didn't include day in the by clause of the stats command.
I modified the query included bin however i couldn't fig out how to include day in the by clause of the stats command.
host=hostname* "Index restore*" | eval starttime=if(match(_raw,"Index restore started"),_time,null)
| eval completetime=if(match(_raw,"Index restore complete"),_time,null)
| stats latest(starttime) as starttime latest(completetime) as completetime by host | fieldformat starttime=strftime(starttime,"%Y-%m-%d %H:%M:%S")
| fieldformat completetime=strftime(completetime,"%Y-%m-%d %H:%M:%S") | eval duration=tostring(completetime-starttime,"duration")
| eval completeminutes=floor((completetime-starttime)/60) | bin span=30d _time
As I showed earlier, the bin needs to be earlier on in the search and the day is just added to the stats command by clause
host=hostname* "Index restore*"
| bin span=1d as day
| eval starttime=if(match(_raw,"Index restore started"),_time,null)
| eval completetime=if(match(_raw,"Index restore complete"),_time,null)
| stats latest(starttime) as starttime latest(completetime) as completetime by day host
| fieldformat starttime=strftime(starttime,"%Y-%m-%d %H:%M:%S")
| fieldformat completetime=strftime(completetime,"%Y-%m-%d %H:%M:%S") | eval duration=tostring(completetime-starttime,"duration")
| eval completeminutes=floor((completetime-starttime)/60)
@ITWhisperer Thanks
When i run the query it says "Error in 'bin' command: Invalid argument: 'day'"
Got it changed the bin query to "bin _time span=1d as day"
The following result is what i am getting
day host starttime completetime completeminutes duration
1620532800 host 2021-05-09 21:58:04
1620619200 host 2021-05-10 05:09:17
You might want to format the day column
| fieldformat day=strftime(day,"%Y-%m-%d")
Sorry that was a typo - you are right it should have been bin _time span=1d as day