Splunk IT Service Intelligence

ITSI 4.3.0 Backfill Exception during startup

dogmatic
New Member

Hi,

I have been using ITSI 4.3.0 for some time now. A few months ago I had a KV Store issue that seemed to be resolved by doing an ITSI restore; I'm not sure if that's related.

For a month now, ITSI has taken a while to start, and I am missing data in some dashboards, I believe because of the backfill exception error.

This is the exception:

2019-10-04 18:28:33,387 ERROR [itsi.backfill] [__init__] [exception] [30764] Backfill core job exception
Traceback (most recent call last):
  File "/opt/splunk/etc/apps/SA-ITOA/bin/itsi_backfill.py", line 83, in do_run
    backfill_core.start()
  File "/opt/splunk/etc/apps/SA-ITOA/lib/itsi/backfill/__init__.py", line 695, in start
    self._run_main_loop()
  File "/opt/splunk/etc/apps/SA-ITOA/lib/itsi/backfill/__init__.py", line 654, in _run_main_loop
    while (self._should_execute()
  File "/opt/splunk/etc/apps/SA-ITOA/lib/itsi/backfill/__init__.py", line 509, in _should_execute
    self._last_exe_check_val = self._modinput_is_target(self.session_key, logger=self.logger)
  File "/opt/splunk/etc/apps/SA-ITOA/lib/ITOA/itoa_common.py", line 241, in modular_input_should_run
    if info.is_captain_ready():
  File "/opt/splunk/etc/apps/SA-ITOA/lib/SA_ITOA_app_common/solnlib/utils.py", line 154, in wrapper
    return func(*args, **kwargs)
  File "/opt/splunk/etc/apps/SA-ITOA/lib/SA_ITOA_app_common/solnlib/server_info.py", line 198, in is_captain_ready
    cap_info = self.captain_info()
  File "/opt/splunk/etc/apps/SA-ITOA/lib/SA_ITOA_app_common/solnlib/utils.py", line 154, in wrapper
    return func(*args, **kwargs)
  File "/opt/splunk/etc/apps/SA-ITOA/lib/SA_ITOA_app_common/solnlib/server_info.py", line 224, in captain_info
    output_mode='json').body.read()
  File "/opt/splunk/etc/apps/SA-ITOA/lib/SA_ITOA_app_common/solnlib/packages/splunklib/binding.py", line 287, in wrapper
    return request_fun(self, *args, **kwargs)
  File "/opt/splunk/etc/apps/SA-ITOA/lib/SA_ITOA_app_common/solnlib/packages/splunklib/binding.py", line 69, in new_f
    val = f(*args, **kwargs)
  File "/opt/splunk/etc/apps/SA-ITOA/lib/SA_ITOA_app_common/solnlib/packages/splunklib/binding.py", line 665, in get
    response = self.http.get(path, self._auth_headers, **query)
  File "/opt/splunk/etc/apps/SA-ITOA/lib/SA_ITOA_app_common/solnlib/packages/splunklib/binding.py", line 1160, in get
    return self.request(url, { 'method': "GET", 'headers': headers })
  File "/opt/splunk/etc/apps/SA-ITOA/lib/SA_ITOA_app_common/solnlib/packages/splunklib/binding.py", line 1218, in request
    response = self.handler(url, message, **kwargs)
  File "/opt/splunk/etc/apps/SA-ITOA/lib/SA_ITOA_app_common/solnlib/splunk_rest_client.py", line 140, in request
    verify=verify, proxies=proxies, cert=cert, **kwargs)
  File "/opt/splunk/etc/apps/SA-ITOA/lib/SA_ITOA_app_common/solnlib/packages/requests/api.py", line 60, in request
    return session.request(method=method, url=url, **kwargs)
  File "/opt/splunk/etc/apps/SA-ITOA/lib/SA_ITOA_app_common/solnlib/packages/requests/sessions.py", line 533, in request
    resp = self.send(prep, **send_kwargs)
  File "/opt/splunk/etc/apps/SA-ITOA/lib/SA_ITOA_app_common/solnlib/packages/requests/sessions.py", line 646, in send
    r = adapter.send(request, **kwargs)
  File "/opt/splunk/etc/apps/SA-ITOA/lib/SA_ITOA_app_common/solnlib/packages/requests/adapters.py", line 498, in send
    raise ConnectionError(err, request=request)
ConnectionError: ('Connection aborted.', error(104, 'Connection reset by peer'))

The effect of this exception seems to show up in the dashboards, for example:

[screenshot]

I have the following settings:

[httpServer]
maxThreads = -1
maxSockets = -1

and

$ ulimit -a
core file size          (blocks, -c) 0
data seg size           (kbytes, -d) 1073741824
scheduling priority             (-e) 0
file size               (blocks, -f) unlimited
pending signals                 (-i) 257556
max locked memory       (kbytes, -l) 64
max memory size         (kbytes, -m) unlimited
open files                      (-n) 250000
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
real-time priority              (-r) 0
stack size              (kbytes, -s) 8192
cpu time               (seconds, -t) unlimited
max user processes              (-u) 64059
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited

These settings are applied on all Linux SHC members.

I'm wondering whether anyone has run into this or something similar, and whether there are possible fixes or workarounds. If you need any further details to assist, I'll try to answer.

0 Karma

kanwu_splunk
Splunk Employee

It looks to me like this is a Splunk environment-related issue, and it points to the Splunk REST endpoints. Do you see similar error or warning messages from other REST calls?
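
One rough way to check this is to scan the logs for the same connection-reset message and group the hits by logger name. The snippet below is only a sketch, not an official Splunk tool: it assumes log lines shaped like the one in the question (timestamp, level, then the logger name in square brackets), and the function names `count_connection_resets` and `scan_files` are made up for illustration.

```python
import re
from collections import Counter

# Header lines are assumed to look like the one in the question:
# 2019-10-04 18:28:33,387 ERROR [itsi.backfill] [__init__] [exception] ...
HEADER = re.compile(
    r"^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3} \w+ \[([^\]]+)\]"
)
# Substrings taken from the ConnectionError in the traceback above.
RESET_MARKERS = ("Connection reset by peer", "Connection aborted")


def count_connection_resets(lines):
    """Count connection-reset lines, attributed to the most recent logger name."""
    counts = Counter()
    current = "unknown"  # used when a hit appears before any header line
    for line in lines:
        match = HEADER.match(line)
        if match:
            current = match.group(1)
        # A line containing both markers is still counted only once.
        if any(marker in line for marker in RESET_MARKERS):
            counts[current] += 1
    return counts


def scan_files(paths):
    """Print per-logger counts for each log file path given (hypothetical helper)."""
    for path in paths:
        with open(path, errors="replace") as handle:
            for logger, hits in count_connection_resets(handle).most_common():
                print(f"{path}: {logger}: {hits}")
```

For example, `scan_files(glob.glob("/opt/splunk/var/log/splunk/*.log"))` (after `import glob`) would cover the standard log directory, with the `/opt/splunk` prefix taken from the traceback. If only `itsi.backfill` shows up, the problem is probably narrower than the environment as a whole; if many loggers report resets, that points at splunkd's REST port in general. Note that files using a different line format, such as splunkd.log, would have their hits grouped under `unknown`.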

0 Karma