Getting Data In

splunkd causes OutOfMemory when searching data with large event size

yuanliu
SplunkTrust

I am onboarding a JSON dataset whose events are very close to 1 MB each, so I have to increase TRUNCATE to 1000000 (from the default of 10000).

TRUNCATE = 1000000
KV_MODE = json

But when I perform a search on this sourcetype, splunkd's memory demand skyrockets, causing oom_killer to kill random processes and effectively bringing down the OS.

I see some default sourcetypes (etc/system/default/props.conf) already have TRUNCATE=1000000, but they use INDEXED_EXTRACTIONS=json.  Is index-time extraction the only/recommended way to handle these exceedingly large events?  Is there some formula to determine search-time memory need based on event size?
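For comparison, a stanza in the style of those large-event default sourcetypes might look like the sketch below. The sourcetype name my_large_json is a placeholder, and the exact settings are an untested illustration of the index-time-extraction approach, not a verified config:

```
[my_large_json]
INDEXED_EXTRACTIONS = json
TRUNCATE = 1000000
# With indexed extractions, search-time JSON extraction should be
# disabled so fields are not extracted a second time at search time.
KV_MODE = none
AUTO_KV_JSON = false
```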

1 Solution

yuanliu
SplunkTrust

1. 1MByte events are... huge. Whether it is kv-json or plain regex-based extractions, it's gonna be heavy.

I think search-time extraction (aka KV_MODE) has to allow for many contingencies, so it holds much more data in memory; that is what causes the memory pressure.  After removing KV_MODE, the search runs with no problem.  I then apply | spath inline, also with no problem at all.  It is actually very performant.

I am starting to learn some differences between implied actions and explicit evaluation actions. (See a Slack thread about tojson and eval; in this case, spath behaves like eval.)  I am guessing there is a good reason why certain implied actions consume so much more resource.  Maybe that is why those large-event default sourcetypes use index-time extraction instead.

In the end, index-time extraction and inline spath are the only options for such sourcetypes.
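The inline-spath approach described above can be sketched as follows. The sourcetype and field names are placeholders, and this assumes KV_MODE has been removed (or set to none) for the sourcetype so no automatic search-time JSON extraction happens:

```
index=main sourcetype=my_large_json
| spath
| stats count by status
```

If only a few fields are needed, extracting specific paths (e.g. | spath path=status) rather than the whole event should keep the memory footprint even smaller.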


PickleRick
SplunkTrust

1. 1MByte events are... huge. Whether it is kv-json or plain regex-based extractions, it's gonna be heavy.

2. As a side note - if splunkd brings down the whole OS, it might be time to tweak the VMM parameters (swappiness, zram, oom_killer priorities...).
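A possible starting point for that kind of tuning is sketched below; the file name and values are illustrative assumptions, not recommendations, and should be adjusted for the actual workload:

```
# /etc/sysctl.d/99-vm-tuning.conf  (hypothetical file name)
vm.swappiness = 10        # swap anonymous pages less aggressively
vm.overcommit_memory = 0  # heuristic overcommit (kernel default)
```

oom_killer victim selection can also be biased per process by writing to /proc/&lt;pid&gt;/oom_score_adj (higher values make a process a more likely victim, -1000 exempts it entirely), so that a runaway search does not take down unrelated critical processes.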

