Solved: Terminated with signal 4 (core dumped) when upgrading

leahs · 07-27-2021

Hi,

I'm upgrading my cluster master from version 8.0.3 to 8.2.1. After installing the new version over the old deployment and starting splunk, I get "ERROR: pid xxx terminated with signal 4 (core dumped)", and the Splunk web server is not available. How can I fix this?

My Splunk environment runs on AWS EC2 instances with Amazon Linux. This is the information I have about the OS:

NAME="Amazon Linux AMI"
VERSION="2018.03"
ID_LIKE="rhel fedora"

jasen_m · 08-09-2021

Hello, we encountered this same issue when we also tried to upgrade and I actually saw your post when we were looking for a solution.

Ultimately, we were able to find out that there was a conflict with something else we had installed on the host.

Can you try to look in /var/log/messages after trying to start splunk?

In our case, we observed the following message:
Aug 7 00:22:54 <splunk_host> kernel: splunkd[53109] trap invalid opcode ip:XXXXXXX sp:XXXXXXXXXX error:0 in <non_splunk_agent>.so[XXXXXX+XXXXX]

After we saw that, we were able to uninstall the agent and splunk was able to start normally on the search head.
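
The log check described above can also be scripted. This is only a minimal sketch, not anything Splunk ships: `find_invalid_opcode_traps` and `scan_log` are hypothetical helper names, and the pattern assumes the kernel line has the same shape as the message quoted above.

```python
import re

# Matches kernel lines of the form:
#   ... splunkd[<pid>] trap invalid opcode ip:... sp:... error:0 in <module>.so[...]
TRAP_RE = re.compile(
    r"(?P<process>[\w.-]+)\[(?P<pid>\d+)\] trap invalid opcode"
    r".*? in (?P<module>[^\[\s]+)\["
)


def find_invalid_opcode_traps(lines, process="splunkd"):
    """Return (pid, module) pairs for invalid-opcode traps raised by `process`."""
    hits = []
    for line in lines:
        match = TRAP_RE.search(line)
        if match and match.group("process") == process:
            hits.append((int(match.group("pid")), match.group("module")))
    return hits


def scan_log(path="/var/log/messages"):
    """Scan a syslog file; the default path is the one mentioned in this thread."""
    with open(path, errors="replace") as log:
        return find_invalid_opcode_traps(log)
```

Calling `scan_log()` (reading /var/log/messages usually needs root) returns the splunkd PID together with the module that executed the invalid instruction. A module that does not belong to Splunk points at the conflicting software.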

pignardh · 10-07-2021

Hello,
I had the same problem and found this solution:

Disable the "usePreloadedPstacks" setting in
$SPLUNK_HOME/etc/system/local/server.conf:
[watchdog]
usePreloadedPstacks = false
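
To double-check that the setting landed in the right stanza of the file you edited, a small reader like the one below can help. It is a rough sketch under stated assumptions: `read_conf_setting` is a hypothetical helper, it looks at a single file only, and it does not reproduce how Splunk merges default, local, and app-level configuration.

```python
def read_conf_setting(conf_text, stanza, key):
    """Return the value of `key` inside `[stanza]`, or None if it is not set.

    Simplified reader for the text of one .conf file: blank lines and
    '#' comment lines are skipped, and no configuration layering is applied.
    """
    current = None
    value = None
    for raw in conf_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        if line.startswith("[") and line.endswith("]"):
            current = line[1:-1]  # entering a new stanza
        elif current == stanza and "=" in line:
            name, _, val = line.partition("=")
            if name.strip() == key:
                value = val.strip()  # a later assignment overrides an earlier one
    return value
```

Passing the file's contents with `"watchdog"` and `"usePreloadedPstacks"` should return `"false"` once the change is saved. For the merged value that Splunk actually applies, `splunk btool server list watchdog --debug` is the more reliable check.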

anirbandasdeb · 08-03-2021

Were you able to resolve this, @leahs?

I'm facing the exact same issue while doing a dry run in preparation for the actual upgrade.

Checking prerequisites...
        Checking http port [9443]: open
        Checking mgmt port [8089]: open
        Checking appserver port [127.0.0.1:8065]: open
        Checking kvstore port [8191]: open
        Checking configuration... Done.
        Checking critical directories...        Done
        Checking indexes...
                Validated: _audit _internal _introspection _metrics _metrics_rollup _telemetry _thefishbucket history main summary
        Done
        Checking filesystem compatibility...  Done
        Checking conf files for problems...
        Done
        Checking default conf files for edits...
        Validating installed files against hashes from '/opt/splunk/splunk-8.2.1-ddff1c41e5cf-linux-2.6-x86_64-manifest'
        All installed files intact.
        Done
        Checking replication_port port [23456]: open
All preliminary checks passed.

Starting splunk server daemon (splunkd)...
ERROR: pid 2283 terminated with signal 4 (core dumped)
Done
 [  OK  ]

Waiting for web server at http://127.0.0.1:9443 to be available..............................................................................................................................................................................................................^C
(dev2) splunk@splnkhfvm-xxx:~ $ ./bin/splunk version
Splunk 8.2.1 (build ddff1c41e5cf)

I am upgrading a search head cluster from 8.0.5 to 8.2.1.

(dev2) splunk@splnkhfvm-xxx:~ $ hostnamectl
   Static hostname: splnkhfvm-qvjj5.xxx
         Icon name: computer-vm
           Chassis: vm
        Machine ID: xxx
           Boot ID: xxx
    Virtualization: kvm
  Operating System: Red Hat Enterprise Linux
       CPE OS Name: cpe:/o:redhat:enterprise_linux:7.9:GA:server
            Kernel: Linux 3.10.0-1160.15.2.el7.x86_64
      Architecture: x86-64
(dev2) splunk@splnkhfvm-xxx:~ $

Entries in /var/log/messages:

Aug  3 18:04:45 splnkhfvm-xxx sudo:  anirban : TTY=pts/1 ; PWD=/home/dasd ; USER=splunk ; COMMAND=/opt/splunk/bin/splunk start
Aug  3 18:04:49 splnkhfvm-qvjj5 kernel: [7308988.389979] traps: splunkd[5860] trap invalid opcode ip:7f6a374da3bf sp:7ffda55435b0 error:0 in liboneagentproc.so[7f6a374c6000+84000]

vzabawski · 11-08-2021

For anyone who might find this useful:

I recently ran into this issue after installing the Dynatrace OneAgent chart in the same k8s cluster as Splunk. In my case, I couldn't simply delete the liboneagentproc.so file, so I had to uninstall the Dynatrace chart and then delete the /opt/oneagent directory in the pod where Splunk runs.

Link to the 8.2.1 known issues: https://docs.splunk.com/Documentation/Splunk/8.2.1/ReleaseNotes/Knownissues
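
After removing an agent, it can be worth confirming that nothing is still being injected into new processes. The sketch below assumes the library is preloaded system-wide through /etc/ld.so.preload, which is one common mechanism on Linux but is not confirmed anywhere in this thread (an LD_PRELOAD environment variable is another). `preloaded_libraries` and `foreign_preloads` are hypothetical helper names, and /opt/splunk is just the install path seen in the output earlier in the thread.

```python
import re


def preloaded_libraries(preload_text):
    """Return the library paths listed in the text of an ld.so.preload file."""
    libs = []
    for raw in preload_text.splitlines():
        line = raw.split("#", 1)[0]  # drop anything after a '#' comment marker
        # Entries may be separated by whitespace or colons.
        libs.extend(part for part in re.split(r"[\s:]+", line) if part)
    return libs


def foreign_preloads(preload_text, splunk_home="/opt/splunk"):
    """Return preloaded libraries that live outside the Splunk install directory."""
    prefix = splunk_home.rstrip("/") + "/"
    return [lib for lib in preloaded_libraries(preload_text)
            if not lib.startswith(prefix)]
```

Feeding it the contents of /etc/ld.so.preload (if the file exists at all) lists every library that would be loaded into splunkd before it starts. An empty result means the crash is not coming from this particular mechanism.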

jonoped · 08-01-2021

Hi @leahs ,

Did you manage to find a solution to this?
I am receiving the same error, though mine is a fresh build on an on-prem VM, not in the cloud.

Thanks,

Jono

leahs · 08-02-2021

Hi @jonoped,

I haven't solved the problem yet. The instances are running on this OS:

NAME="Amazon Linux AMI"
VERSION="2018.03"
ID_LIKE="rhel fedora"

What OS are you running?

Lea

leahs · 08-10-2021

Hi @jasen_m ,

Thank you so much!! This solved the issue.
