Splunk IT Service Intelligence

Failed ITSI restore from backup...

hascobot
New Member

Hi,

The very important services_kpi_lookup KV store lookup was overwritten by mistake when an operator ran "|outputlookup services_kpi_lookup" instead of "|inputlookup services_kpi_lookup". The consequences have been severe.
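
For readers unfamiliar with the two commands, here is a minimal sketch of the difference, using the lookup name from this post. `inputlookup` only reads the lookup and returns its rows as search results:

```
| inputlookup services_kpi_lookup
| head 5
```

`outputlookup` writes the current search results to the lookup and replaces its existing contents by default. Run on its own with no results in the pipeline, it writes an empty result set. `outputlookup` has a standard `override_if_empty` argument that prevents an empty result set from overwriting an existing lookup. Whether it protects a KV store-backed lookup the same way it protects a CSV lookup is an assumption here, so check it on a test instance first:

```
| outputlookup override_if_empty=false services_kpi_lookup
```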

The ITSI environment is not working right now. It appears to have no services, service templates, base searches, and so on. Fortunately, ITSI keeps backups going back a week by default. However, when we tried to restore from one, the restore failed. This is our only chance to salvage our environment. The failure seems to be caused by the size of our ITSI environment: we had over 1000 services and over 8000 KPIs. The logs show the following:

[Three screenshots of log errors, not reproduced: hascobot_2-1597517397641.png, hascobot_3-1597517436587.png, hascobot_4-1597517500320.png]

This last error is the one we are stuck on right now. The restore-from-backup functionality does not seem to work in our case and we do not know why. Any help would be appreciated.

Kind Regards,
A Very Concerned Person


eduncan
Splunk Employee

The restore jobs are timing out because of how many objects are in the KV store. Make sure you clean the KV store first; you can then change the default timeout of 12 hours. The instructions are listed here: https://docs.splunk.com/Documentation/ITSI/4.5.0/Configure/Restore
