Splunk Search

Why does this search cause Splunk to crash occasionally?

chadman
Path Finder

I have a search that works most of the time, but occasionally it causes Splunk to crash and requires a restart. I have a ticket open with Splunk, but they have not yet been able to figure out what's going on, so I thought I would post it here. This is the search I use in my dashboard.

<![CDATA[
| metadata type=hosts
| eval age = now()-lastTime
| where age > 300 and age < 86400
| convert ctime(lastTime)
| eval field_in_ddhhmmss=tostring((age) , "duration")
| rename field_in_ddhhmmss as "Time Offline" lastTime as "Last Update Time"
| join host [search sourcetype=systemInfo | rename serial as "Serial Number" isp as "ISP" state as "State" city as "City"]
| sort "Time Offline" a
| table "Serial Number","Time Offline","Last Update Time","ISP","City","State"
]]>

I use it to find computers that checked in within the last 24 hours but have not checked in during the last 5 minutes. I then use "join" to match each host against a sourcetype to get some specific data about those hosts. The search is fairly fast and runs in a couple of seconds. I had been using the same search for months, but the crashes started a couple of weeks ago. The only thing that has changed is that we have added more hosts. CPU and memory usage on the Splunk server are low when it crashes, and we're not seeing any spikes when this happens.

RishiMandal
Explorer

Did you ever figure it out? I am seeing the same behaviour. One of my indexers crashes for no apparent reason.

tskinnerivsec
Contributor

Are you running it as a non-admin user? Are you running into disk quota issues?

chadman
Path Finder

Other users and I do run this dashboard as non-admins. I'm not a Splunk admin, but I can have one check. How would one check for disk quota issues? The Splunk admins and I are all fairly new to Splunk.

tskinnerivsec
Contributor

Well, you could run something like this to look for quota issues:

index = _internal sourcetype=splunkd component=DispatchManager quota

(I'm not 100% sure whether this covers ad-hoc searches.)

You can use this search to see how much search-job disk space each user is currently using on the search head, which is what counts against their quota:

| rest splunk_server=local /services/search/jobs | eval diskUsageMB=diskUsage/1024/1024 | stats sum(diskUsageMB) by eai:acl.owner
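
To compare that usage against the limits themselves, something like the following may help. It is only a sketch: it assumes the quotas are set per role and that the authorization/roles REST endpoint exposes them as srchDiskQuota (in MB) and srchJobsQuota, and the account running it needs permission to read that endpoint.

```spl
| rest splunk_server=local /services/authorization/roles
| table title srchDiskQuota srchJobsQuota
```

If a user's summed disk usage from the previous search is close to the srchDiskQuota of their role, quota messages would be expected in the DispatchManager search above.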

tskinnerivsec
Contributor

Have you looked at your last Splunk crash log? Are there any errors in the log about too many open files? If this is on a Linux server, it could be a ulimit setting issue.
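
On Linux, splunkd normally records the limits it started with in splunkd.log, so a search along these lines may show the open-files limit without needing shell access. Treat it as a sketch rather than a confirmed diagnostic: it assumes the startup messages are logged under component=ulimit and contain the phrase "open files", which may vary by version.

```spl
index=_internal sourcetype=splunkd component=ulimit "open files"
| table _time host _raw
```

A limit of 1024 is a common OS default and is generally considered low for a Splunk server. The crash logs themselves are usually written to $SPLUNK_HOME/var/log/splunk with file names starting with crash-.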

chadman
Path Finder

Oh, and it is running on a Linux server.

chadman
Path Finder

Splunk has had me run a bunch of diag logs, and they don't see it crashing in them. I recently started Splunk in debug mode and captured another log for them. I'm still waiting to hear back about what they found. I'm just puzzled that it works almost all the time. When I try to crash it to get a log, it normally takes about 20-30 minutes of running the search almost continually before it crashes.
