Splunk Search

What is a splunk search in "zombie" state? What does this mean?

jrodman
Splunk Employee

Sometimes when I review splunk logs or job inspector I see that I have searches in zombie state. What does this mean?

1 Solution

jrodman
Splunk Employee

Splunk calls a search a zombie when the search is no longer running but never explicitly declared that it had finished its work.

Typically this means that the search crashed, but other causes are possible, such as an unclean shutdown or some other class of bug, like an error writing out the completion state.

Note that this is not the same as a UNIX "zombie" process; the two are only tangentially similar.

zakkg3
Explorer

I am running a test server on AWS Lightsail; it's a Bitnami distro with only 512 MB of RAM. splunkd RAM usage goes above 90% during large searches.
Expanding the swap to 12GB solved the problem.
Just follow these steps:

Turn all swap off
sudo swapoff -a

Resize the swapfile (bs=1M count=1024 creates a 1 GiB file; for 12 GiB use count=12288)
sudo dd if=/dev/zero of=/swapfile bs=1M count=1024

Make the swapfile usable
sudo mkswap /swapfile

Turn swap back on
sudo swapon /swapfile

Note this:

Recommended hardware capacity

The following requirements are accurate for a single instance installation with light to moderate use. For significant enterprise and distributed deployments, see Capacity Planning.

Platform: Recommended hardware capacity/configuration
Non-Windows platforms: 2x six-core, 2+ GHz CPU, 12GB RAM, Redundant Array of Independent Disks (RAID) 0 or 1+0, with a 64-bit OS installed.
Windows platforms: 2x six-core, 2+ GHz CPU, 12GB RAM, RAID 0 or 1+0, with a 64-bit OS installed.

bhawkins1
Communicator

In my case the issue was prolonged high RAM usage due to a complex report.

Do note that Splunk recommends 12GB of RAM.

sherm77
Path Finder

I had this issue today with a real-time search (Splunk Enterprise v6.2.0). Changes to the search terms weren't being picked up by the real-time search/alert, and it kept alerting on terms that we were excluding. After searching Answers and Docs repeatedly, and even Googling the issue, I found nothing.

After much searching, changing search terms, testing, and gnashing of teeth, I turned on "List in Triggered Alerts" and then examined the next alert that came up in Job Inspector. I saw that the search terms that I put in the alert were there, but the search job properties did not have the changed search terms, so I was finally on a hot trail. I tried changing the search terms several times, but the changes never made it to the search job properties, which are what the search head sends to the indexer.

When I went to the Activity > Jobs menu and went to look at the particular user & Running jobs, I saw a number of zombie processes out there. When I looked at them in Job Inspector, I saw that they had the very search terms that I was trying to change. So, instead of restarting the Splunk instance, I tried finalizing the job / alert that was running (real time). The zombie processes evaporated (and stopped eating my cpu brains!) then the job started back up and it was using the correct, changed search terms. I tried changing the search terms a few times after that and the running job correctly reflected the changes.
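For anyone who would rather script the finalize step than click through Activity > Jobs, here is a minimal Python sketch. It assumes the REST job control endpoint (a POST to /services/search/jobs/<sid>/control with action=finalize) on the default management port 8089, and a session key for auth. The helper name build_finalize_request and its parameters are hypothetical, and the sketch only builds the request; it does not send it.

```python
import urllib.parse
import urllib.request


def build_finalize_request(base_url, sid, session_key):
    """Build (but do not send) a POST that asks splunkd to finalize a job.

    base_url    -- management URL, e.g. "https://localhost:8089" (assumed)
    sid         -- search ID of the job, as shown in Job Inspector
    session_key -- a valid splunkd session key (hypothetical parameter)
    """
    # URL-encode the sid so any unusual characters are safe in the path.
    quoted_sid = urllib.parse.quote(sid, safe="")
    url = f"{base_url.rstrip('/')}/services/search/jobs/{quoted_sid}/control"

    # The control endpoint takes a form-encoded "action" parameter.
    body = urllib.parse.urlencode({"action": "finalize"}).encode("ascii")

    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={"Authorization": f"Splunk {session_key}"},
    )
```

To actually send it you would hand the result to urllib.request.urlopen, with whatever TLS settings your splunkd certificate needs. Try it against a test instance first.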

I hope this will help someone, as I spent about 5 hours messing with this, but it is a good lesson learned. I wasn't aware that zombie processes could prevent changes, although it makes sense. I'll have to use the Job Inspector weapon more often to rid my installation of zombies.

jrodman
Splunk Employee

This answer isn't about an issue. It's an informational answer about splunk terminology.

sherm77
Path Finder

No @jrodman, my answer is not just informational, it is an operational way to correct an issue with zombied search jobs in 6.2.x, and it would be nice if zombie searches were part of the documentation (maybe in the troubleshooting section). I'm not sure why you would put that comment under my answer.

An interesting thing I ran across a few days after my answer above: you can find zombie jobs in Splunk through a job field called isZombie! I created this search, and I alert if I get any results.

In essence, it shows if a search job died for some reason, but the search continues. A real-time search or long running, intensive search that continues while another kicks off (maybe repeatedly) will certainly cause issues (as I've experienced).

| rest /services/search/jobs | search isZombie>0 | table author id isDone isFailed isFinalized isPaused isRealTimeSearch isSaved isZombie normalizedSearch request.search request.earliest_time request.latest_time sid title updated
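The same check can be done outside Splunk Web. Below is a rough Python sketch, assuming you have already fetched /services/search/jobs?output_mode=json yourself and that each job's properties sit under entry[].content, as in the usual REST JSON layout. find_zombie_jobs and _as_bool are made-up helper names, not part of any Splunk SDK.

```python
import json


def _as_bool(value):
    """Treat JSON booleans and the strings "1"/"true" as true."""
    if isinstance(value, str):
        return value.strip().lower() in {"1", "true"}
    return bool(value)


def find_zombie_jobs(response_text):
    """Return a small summary for every job whose isZombie flag is set.

    response_text -- body of GET /services/search/jobs?output_mode=json
                     (layout assumed: {"entry": [{"name": ..., "content": {...}}]})
    """
    payload = json.loads(response_text)
    zombies = []
    for entry in payload.get("entry", []):
        content = entry.get("content", {})
        if _as_bool(content.get("isZombie")):
            zombies.append(
                {
                    # Fall back to the entry name if content has no sid.
                    "sid": content.get("sid", entry.get("name")),
                    "author": content.get("author"),
                    "isRealTimeSearch": content.get("isRealTimeSearch"),
                }
            )
    return zombies
```

An empty list means no zombies were reported. Anything else is worth a look in Job Inspector.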

Found these Splunk doc links concerning zombied searches:

Link about isZombie: http://docs.splunk.com/DocumentationStatic/CshrpSDK/2.1.1/Splunk.Client/html/f5a017f2-1e02-d689-339a...

From doc, classic definition of zombie: Gets a value that indicates if the process running the current search job is dead, but with the search not finished.

This doc indicates much of the same: http://dev.splunk.com/view/python-sdk/SP-CAAAEE5

From doc: isZombie A Boolean that indicates whether the process running the search is dead, but with the search not finished

Lastly, search for isZombie in the JavaSDK doc about Job: http://docs.splunk.com/DocumentationStatic/JavaSDK/1.0/com/splunk/Job.html

JensT
Communicator

Hi,

We also had "zombie" searches. In our case they were searches still running on the indexer, but no longer on the search head.
This occurs in Splunk 4.3.x with KV_MODE=XML and logs containing invalid XML.
If you have such problems, ask Splunk for the patch release.

Regards,
Jens

yannK
Splunk Employee

So this is not when a search is eating the cpu brain 🙂

RicoSuave
Builder

So this has nothing to do with the zombie apocalypse?
