Getting Data In

Dealing with OMG huge events

mikelanghorst
Motivator

I've got a few log4j application logs that can get extremely long when my developers decide to dump message payloads into the log. Similar to a large stack trace, it's a multi-line event and can run several hundred lines (at least that; I haven't counted exact numbers yet). I've had issues with these events being broken into multiple events when they shouldn't be. I've set BREAK_ONLY_BEFORE and MAX_EVENTS = 3000, but I'm still seeing the events broken up.

Realistically, at some point the extra data in this event is simply unnecessary and shouldn't even be included at INFO level. Is there a way to simply dump the data after a certain length?


landen99
Motivator

Try this at search time:

| rex mode=sed "s/([\r\n].{1,100}).*/\1/g"
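rex mode=sed edits _raw by default, so this only changes what the search displays; the indexed event is untouched. Since . doesn't cross newlines, each match starts at a line break, keeps up to 100 characters, and drops the rest of that line. A usage sketch (the base search is an assumption):

sourcetype=log4j | rex mode=sed "s/([\r\n].{1,100}).*/\1/g"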


jspears
Communicator

For some kinds of events, you can readily determine what you want to throw away and do that with SEDCMD. Here's mine for Windows events that include egregious amounts of information after the actual event data:

SEDCMD-windows = s/This event is generated.+$//g

TRUNCATE sounds like an excellent option if you know a specific length where you want to start throwing out irrelevant data.
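For the log4j case in the question, a rough props.conf sketch of how the two could combine; the sourcetype name, the payload marker, and the byte cap are all assumptions about that log format, not a confirmed config:

[log4j]
# Assumed marker: drop everything from the payload dump to the end of the event
SEDCMD-trim_payload = s/message payload:.+$//g
# Assumed cap: discard anything past 10,000 bytes per event
TRUNCATE = 10000

Both settings apply at index time, so whatever they remove is gone for good; worth testing against a sample before rolling out.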

kbains
Splunk Employee

If you really REALLY want it as one event, you can use TRUNCATE=0. Just be warned that pulling back lots (>100,000) of large events (>1MB) will cause your browser to use a lot of memory and may even cause it to crash.
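As a config sketch, that's just (sourcetype name assumed):

[log4j]
# 0 disables event truncation entirely
TRUNCATE = 0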

mikelanghorst
Motivator

Hmm, I'd been concerned with the performance hit on the server, not even thinking about the client.

There's no good break point in the data; anything I split on would just leave a fragment of the event orphaned from its context.

For this data, the default handling for sourcetype=log4j works great 99% of the time; it's just that, on error, they decided to log the payload to the standard server.log file, when it should really be dumped elsewhere or not logged at all.


mikelanghorst
Motivator

Talking with dwaddle in #splunk, he suggested using LINE_BREAKER and increasing TRUNCATE to a large number, rather than BREAK_ONLY_BEFORE and MAX_EVENTS. Gonna give that a shot instead, but I'm still wondering how other users are dealing with really large events like this.
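A sketch of that approach, assuming each log4j event begins with an ISO-style timestamp (the regex and the byte cap are assumptions about this log format):

[log4j]
# Turn off line merging and break only where a new timestamp starts
SHOULD_LINEMERGE = false
LINE_BREAKER = ([\r\n]+)\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}
# Raise the per-event byte cap well above the largest expected payload
TRUNCATE = 1000000

With SHOULD_LINEMERGE = false, event boundaries come entirely from LINE_BREAKER, so MAX_EVENTS no longer applies and TRUNCATE becomes the only cap on event size.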

mikelanghorst
Motivator

Works better, but I'm still getting messages broken where they shouldn't be.


mikelanghorst
Motivator

Quick check of the current offender: the event runs about 30-35k lines...

What kind of impact would setting MAX_EVENTS to something like 40000 have on the indexers?
