Splunk IT Service Intelligence

ITSI 4.3.0 Backfill Exception during startup

dogmatic
New Member

Hi,

I have been using ITSI 4.3.0 for some time now. A few months ago I had a KV Store issue that seemed to be resolved by doing an ITSI restore; I'm not sure if that's related.

For the past month, ITSI has been taking a while to start, and I am missing data in some dashboards. I believe this is because of the backfill exception error.

This is the exception:

2019-10-04 18:28:33,387 ERROR [itsi.backfill] [__init__] [exception] [30764] Backfill core job exception
Traceback (most recent call last):
  File "/opt/splunk/etc/apps/SA-ITOA/bin/itsi_backfill.py", line 83, in do_run
    backfill_core.start()
  File "/opt/splunk/etc/apps/SA-ITOA/lib/itsi/backfill/__init__.py", line 695, in start
    self._run_main_loop()
  File "/opt/splunk/etc/apps/SA-ITOA/lib/itsi/backfill/__init__.py", line 654, in _run_main_loop
    while (self._should_execute()
  File "/opt/splunk/etc/apps/SA-ITOA/lib/itsi/backfill/__init__.py", line 509, in _should_execute
    self._last_exe_check_val = self._modinput_is_target(self.session_key, logger=self.logger)
  File "/opt/splunk/etc/apps/SA-ITOA/lib/ITOA/itoa_common.py", line 241, in modular_input_should_run
    if info.is_captain_ready():
  File "/opt/splunk/etc/apps/SA-ITOA/lib/SA_ITOA_app_common/solnlib/utils.py", line 154, in wrapper
    return func(*args, **kwargs)
  File "/opt/splunk/etc/apps/SA-ITOA/lib/SA_ITOA_app_common/solnlib/server_info.py", line 198, in is_captain_ready
    cap_info = self.captain_info()
  File "/opt/splunk/etc/apps/SA-ITOA/lib/SA_ITOA_app_common/solnlib/utils.py", line 154, in wrapper
    return func(*args, **kwargs)
  File "/opt/splunk/etc/apps/SA-ITOA/lib/SA_ITOA_app_common/solnlib/server_info.py", line 224, in captain_info
    output_mode='json').body.read()
  File "/opt/splunk/etc/apps/SA-ITOA/lib/SA_ITOA_app_common/solnlib/packages/splunklib/binding.py", line 287, in wrapper
    return request_fun(self, *args, **kwargs)
  File "/opt/splunk/etc/apps/SA-ITOA/lib/SA_ITOA_app_common/solnlib/packages/splunklib/binding.py", line 69, in new_f
    val = f(*args, **kwargs)
  File "/opt/splunk/etc/apps/SA-ITOA/lib/SA_ITOA_app_common/solnlib/packages/splunklib/binding.py", line 665, in get
    response = self.http.get(path, self._auth_headers, **query)
  File "/opt/splunk/etc/apps/SA-ITOA/lib/SA_ITOA_app_common/solnlib/packages/splunklib/binding.py", line 1160, in get
    return self.request(url, { 'method': "GET", 'headers': headers })
  File "/opt/splunk/etc/apps/SA-ITOA/lib/SA_ITOA_app_common/solnlib/packages/splunklib/binding.py", line 1218, in request
    response = self.handler(url, message, **kwargs)
  File "/opt/splunk/etc/apps/SA-ITOA/lib/SA_ITOA_app_common/solnlib/splunk_rest_client.py", line 140, in request
    verify=verify, proxies=proxies, cert=cert, **kwargs)
  File "/opt/splunk/etc/apps/SA-ITOA/lib/SA_ITOA_app_common/solnlib/packages/requests/api.py", line 60, in request
    return session.request(method=method, url=url, **kwargs)
  File "/opt/splunk/etc/apps/SA-ITOA/lib/SA_ITOA_app_common/solnlib/packages/requests/sessions.py", line 533, in request
    resp = self.send(prep, **send_kwargs)
  File "/opt/splunk/etc/apps/SA-ITOA/lib/SA_ITOA_app_common/solnlib/packages/requests/sessions.py", line 646, in send
    r = adapter.send(request, **kwargs)
  File "/opt/splunk/etc/apps/SA-ITOA/lib/SA_ITOA_app_common/solnlib/packages/requests/adapters.py", line 498, in send
    raise ConnectionError(err, request=request)
ConnectionError: ('Connection aborted.', error(104, 'Connection reset by peer'))

The effect of this exception seems to show up in the dashboards, as in this example (screenshot not preserved).

I have the following settings:

[httpServer]
maxThreads = -1
maxSockets = -1

and

$ ulimit -a
core file size          (blocks, -c) 0
data seg size           (kbytes, -d) 1073741824
scheduling priority             (-e) 0
file size               (blocks, -f) unlimited
pending signals                 (-i) 257556
max locked memory       (kbytes, -l) 64
max memory size         (kbytes, -m) unlimited
open files                      (-n) 250000
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
real-time priority              (-r) 0
stack size              (kbytes, -s) 8192
cpu time               (seconds, -t) unlimited
max user processes              (-u) 64059
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited

These settings are applied on all Linux SHC members.
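
One caveat: `ulimit -a` reports the limits of the current shell, which can differ from those of the running splunkd process (for example, when it is started by an init system). On Linux, the effective values can be read from `/proc/<pid>/limits`. A minimal sketch, assuming Linux and that the splunkd process ID is already known; the helper names are hypothetical:

```python
import re


def parse_limits(text):
    """Parse the contents of /proc/<pid>/limits into {name: (soft, hard)}."""
    limits = {}
    for line in text.splitlines()[1:]:  # skip the header row
        # Columns are separated by runs of two or more spaces.
        parts = re.split(r"\s{2,}", line.strip())
        if len(parts) >= 3:
            limits[parts[0]] = (parts[1], parts[2])
    return limits


def read_limits(pid):
    """Read the effective limits of a running process (Linux only)."""
    with open("/proc/%d/limits" % pid) as handle:
        return parse_limits(handle.read())
```

Calling `read_limits(pid)["Max open files"]` with the splunkd PID (found, for instance, with `pgrep -o splunkd`) should show whether the 250000 open-files limit above actually applies to the process.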

I'm wondering whether anyone has experienced this or something similar, and whether there are any possible fixes or workarounds. If you need any further details to assist, I'll try to provide them.


kanwu_splunk
Splunk Employee

It looks to me like this is a Splunk environment-related issue, and it points to the Splunk REST endpoints. Do you see similar error or warning messages from other REST calls?
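
One way to check this is to call a couple of management endpoints repeatedly and tally the outcomes per endpoint. The following is a minimal sketch under stated assumptions, not a Splunk or ITSI tool: `BASE_URL`, the comparison endpoint, and the session-key header are illustrative, and `/services/shcluster/captain/info` is assumed to be the path behind the `captain_info` frame in the traceback (the traceback itself does not show the URL).

```python
import ssl
import urllib.error
import urllib.request
from collections import Counter

# Assumed values for illustration; adjust to the deployment.
BASE_URL = "https://localhost:8089"
ENDPOINTS = [
    "/services/shcluster/captain/info",  # assumed target of captain_info()
    "/services/server/info",             # generic endpoint for comparison
]


def fetch(url, session_key, timeout=10):
    """GET one endpoint and return the HTTP status code."""
    # Certificate verification is disabled here on the assumption that
    # splunkd uses a self-signed certificate; do not do this blindly.
    ctx = ssl.create_default_context()
    ctx.check_hostname = False
    ctx.verify_mode = ssl.CERT_NONE
    req = urllib.request.Request(
        url + "?output_mode=json",
        headers={"Authorization": "Splunk " + session_key},
    )
    with urllib.request.urlopen(req, timeout=timeout, context=ctx) as resp:
        return resp.status


def probe(endpoints, attempts=5, fetch_fn=fetch, **kwargs):
    """Call each endpoint several times and tally outcomes per endpoint."""
    results = {}
    for path in endpoints:
        tally = Counter()
        for _ in range(attempts):
            try:
                tally["http_%d" % fetch_fn(BASE_URL + path, **kwargs)] += 1
            except urllib.error.HTTPError as exc:
                tally["http_%d" % exc.code] += 1
            except urllib.error.URLError as exc:
                reset = isinstance(exc.reason, ConnectionResetError)
                tally["reset" if reset else type(exc.reason).__name__] += 1
            except ConnectionResetError:
                tally["reset"] += 1
            except OSError as exc:  # timeouts, refused connections, etc.
                tally[type(exc).__name__] += 1
        results[path] = dict(tally)
    return results
```

For example, `probe(ENDPOINTS, session_key="...")` returns one tally per path. If only the captain endpoint shows `reset` entries, the problem is probably specific to search head clustering; if every endpoint does, splunkd's management port as a whole is the more likely suspect.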
