Splunk Dev

Why is the number of open files causing high CPU on the cluster?

robertlynch2020
Motivator

Hi

I have an issue where Splunk or the OS is starting to use more and more open files over time.

This is causing the CPU to go up and Splunk to get very slow.

The image below shows the CPU starting to climb on the cluster at about 1:30 yesterday.

The second graph shows the number of open files on the system, counted in two ways.

The first counts the open files for the user autoengine (the login used to launch Splunk):

lsof -u autoengine | awk 'BEGIN { total = 0; } $4 ~ /^[0-9]/ { total += 1 } END { print total }'

The second counts only the lines that mention splunk:

lsof -u autoengine | grep splunk | awk 'BEGIN { total = 0; } $4 ~ /^[0-9]/ { total += 1 } END { print total }'

It looks like one is linked to the other.

When I look at the list of what is holding all the open files, I am a bit lost, to be honest.

The restart marked in the image below is the restart of the Splunk SH and all the indexers.
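One way to see what is holding the descriptors is to group the same lsof output by process instead of printing a single total. The sketch below is only an illustration: the function name count_fds_by_command is made up here, and it assumes the default lsof column order (COMMAND, PID, USER, FD, ...), which matches the output pasted further down.

```shell
# Summarise lsof output read from stdin as "count command pid".
# Only rows whose FD column (field 4) starts with a digit are counted,
# the same filter as the awk one-liners in the post.
count_fds_by_command() {
    awk '$4 ~ /^[0-9]/ { n[$1 " " $2]++ }
         END { for (k in n) print n[k], k }' | sort -k1,1nr -k2,2
}

# Example usage (user name taken from the post):
#   lsof -u autoengine | count_fds_by_command | head
```

The top few lines should show which process and PID the growth belongs to, which is easier to follow over time than the raw listing.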

robertlynch2020_0-1648035104477.png

 

About two thirds of the entries seem to be related to Chrome, but I am just not sure.

chrome    56068 autoengine  mem       REG              253,0     15688     8390650 /usr/lib64/libkeyutils.so.1.5
chrome    56068 autoengine  mem       REG              253,0     58728     8515371 /usr/lib64/libkrb5support.so.0.1
chrome    56068 autoengine  mem       REG              253,0     11384     8390648 /usr/lib64/libXinerama.so.1.0.0
chrome    56068 autoengine  mem       REG              253,0   1027384     8406742 /usr/lib64/libepoxy.so.0.0.0
chrome    56068 autoengine  mem       REG              253,0     35720     9044060 /usr/lib64/libcairo-gobject.so.2.11400.8
chrome    56068 autoengine  mem       REG              253,0    481320     9044042 /usr/lib64/libGL.so.1.2.0
chrome    56068 autoengine  mem       REG              253,0     56472     8390526 /usr/lib64/libxcb-render.so.0.0.0
chrome    56068 autoengine  mem       REG              253,0     15336     8390534 /usr/lib64/libxcb-shm.so.0.0.0
chrome    56068 autoengine  mem       REG              253,0    179296     8390451 /usr/lib64/libpng15.so.15.13.0
chrome    56068 autoengine  mem       REG              253,0    189856     9044046 /usr/lib64/libEGL.so.1.0.0
chrome    56068 autoengine  mem       REG              253,0    698744     8406537 /usr/lib64/libpixman-1.so.0.34.0
chrome    56068 autoengine  mem       REG              253,0    691736     8390436 /usr/lib64/libfreetype.so.6.10.0
chrome    56068 autoengine  mem       REG              253,0    255968     8515671 /usr/lib64/libfontconfig.so.1.7.0
chrome    56068 autoengine  mem       REG              253,0    413936     8599610 /usr/lib64/libharfbuzz.so.0.10302.0
chrome    56068 autoengine  mem       REG              253,0      6944     8517202 /usr/lib64/libgthread-2.0.so.0.5000.3
chrome    56068 autoengine  mem       REG              253,0     52032     8599640 /usr/lib64/libthai.so.0.1.6
chrome    56068 autoengine  mem       REG              253,0     92272     9044056 /usr/lib64/libpangoft2-1.0.so.0.4000.4
chrome    56068 autoengine  mem       REG              253,0    269416     8517192 /usr/lib64/libmount.so.1.1.0
chrome    56068 autoengine  mem       REG              253,0    111080     8390378 /usr/lib64/libresolv-2.17.so
chrome    56068 autoengine  mem       REG              253,0    155752     8390430 /usr/lib64/libselinux.so.1
chrome    56068 autoengine  mem       REG              253,0     15632     8517198 /usr/lib64/libgmodule-2.0.so.0.5000.3
chrome    56068 autoengine  mem       REG              253,0     90632     8390432 /usr/lib64/libz.so.1.2.7
chrome    56068 autoengine  mem       REG              253,0     41080     8390354 /usr/lib64/libcrypt-2.17.so
chrome    56068 autoengine  mem       REG              253,0     69960     8390582 /usr/lib64/libavahi-client.so.3.2.9
chrome    56068 autoengine  mem       REG              253,0     53848     8390584 /usr/lib64/libavahi-common.so.3.5.3
chrome    56068 autoengine  mem       REG              253,0   2520768     8515374 /usr/lib64/libcrypto.so.1.0.2k
chrome    56068 autoengine  mem       REG              253,0    470376     8515376 /usr/lib64/libssl.so.1.0.2k
chrome    56068 autoengine  mem       REG              253,0     15848     8390438 /usr/lib64/libcom_err.so.2.1
chrome    56068 autoengine  mem       REG              253,0    210768     8515363 /usr/lib64/libk5crypto.so.3.1
chrome    56068 autoengine  mem       REG              253,0    963504     8515369 /usr/lib64/libkrb5.so.3.3
chrome    56068 autoengine  mem       REG              253,0    320776     8515359 /usr/lib64/libgssapi_krb5.so.2.2
chrome    56068 autoengine  mem       REG              253,0     15744     8390441 /usr/lib64/libplds4.so
chrome    56068 autoengine  mem       REG              253,0     20040     8390440 /usr/lib64/libplc4.so
chrome    56068 autoengine  mem       REG              253,0     32304     8391190 /usr/lib64/libffi.so.6.0.1
chrome    56068 autoengine  mem       REG              253,0    402384     8390420 /usr/lib64/libpcre.so.1.2.0
chrome    56068 autoengine  mem       REG              253,0     15512     8390474 /usr/lib64/libXau.so.6.0.0
chrome    56068 autoengine  mem       REG              253,0   2127336     8390350 /usr/lib64/libc-2.17.so
chrome    56068 autoengine  mem       REG              253,0     88720     8389122 /usr/lib64/libgcc_s-4.8.5-20150702.so.1
chrome    56068 autoengine  mem       REG              253,0    166328     9027824 /usr/lib64/libgdk_pixbuf-2.0.so.0.3600.5
chrome    56068 autoengine  mem       REG              253,0    761616     9555002 /usr/lib64/libgdk-3.so.0.2200.10
chrome    56068 autoengine  mem       REG              253,0   7466528     9555004 /usr/lib64/libgtk-3.so.0.2200.10
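Every row in the listing above has mem in the FD column, which lsof uses for memory-mapped files such as shared libraries. The awk filter in the two counting commands only matches FD values that start with a digit, so these rows are not included in those totals. A rough sketch for splitting the listing by kind follows; the function name fd_kind_summary and the label "descriptor" are invented for this example.

```shell
# Read lsof output from stdin and print "count command kind", where kind is
# "descriptor" for numeric FD values and otherwise the raw FD column
# (mem, cwd, txt, ...). The header row is skipped.
fd_kind_summary() {
    awk '$1 == "COMMAND" && $2 == "PID" { next }
         {
             kind = ($4 ~ /^[0-9]/) ? "descriptor" : $4
             n[$1 " " kind]++
         }
         END { for (k in n) print n[k], k }' | sort -k1,1nr -k2,3
}

# Example usage (user name taken from the post):
#   lsof -u autoengine | fd_kind_summary | head
```

If Chrome mostly shows up as mem rows, it is probably not what is driving the descriptor count in the graph.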

 

1 Solution

robertlynch2020
Motivator

In the end, it turned out that a team was running lots of curl commands against Splunk without setting a start and end time, so the searches were running over All Time. The SPL was taking a very long time to finish, and the longer this went on, the longer each following search took to finish.
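For anyone hitting the same thing, here is a minimal sketch of a curl call that always sends a time range. It posts to the search/jobs REST endpoint with the earliest_time and latest_time parameters. The wrapper name bounded_search, the URL, and the credentials are placeholders made up for this example, not what the team actually ran, and -k (skip certificate verification) is only there because it is common on test instances.

```shell
# Placeholder connection details; replace with your own.
SPLUNK_URL="${SPLUNK_URL:-https://localhost:8089}"
SPLUNK_AUTH="${SPLUNK_AUTH:-admin:changeme}"
CURL="${CURL:-curl}"   # can be pointed at a stub for a dry run

# Hypothetical wrapper: refuse to submit a search unless both time bounds
# are given, so a forgotten range cannot silently become All Time.
bounded_search() {
    search=$1
    earliest=$2
    latest=$3
    if [ -z "$earliest" ] || [ -z "$latest" ]; then
        echo "refusing to run: earliest and latest must both be set" >&2
        return 1
    fi
    "$CURL" -s -k -u "$SPLUNK_AUTH" "$SPLUNK_URL/services/search/jobs" \
        --data-urlencode "search=$search" \
        --data-urlencode "earliest_time=$earliest" \
        --data-urlencode "latest_time=$latest"
}

# Example usage:
#   bounded_search 'search index=_internal | head 5' '-15m' 'now'
```

The point of the wrapper is simply that the range is a required argument rather than something each caller has to remember.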
