Getting Data In

Questions for large-install admins: What's your best practice for log length (TRUNCATE) maximums? How much do you force devs to limit their event size?

twinspop
Influencer

My current Splunk install handles logs for about 80 different development groups, each with its own idea of what logs should be. We're currently at 700+ sourcetypes, and growing every day.

Some of the devs get really carried away with what they're logging, IMHO. I used to have a limit of 10 KB. Who would ever need log events with more than 10 KB of data? "That's insane!" I used to say. Ah, the good old days. Then they wanted 50 KB, and I grudgingly complied. Then they wanted 100 KB, and again I was forced to comply. It's still not enough. I now have log events of 1 MB or larger: giant XML dumps. I don't know if I'd call them logs; they're more like entire databases. And of course they "need" them. Gotta have it. No matter the limit, they will find ways to push beyond it.

Do you just run with TRUNCATE=0 (and a well-defined LINE_BREAKER)? Do you try to educate devs to not be braindead and/or lazy? I've been trying the latter for years, and it feels like I'm just spinning my wheels.
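For reference, the kind of props.conf stanza I'm talking about looks roughly like this (a sketch only; the sourcetype name and the timestamp-based line breaker are made up for illustration):

    [my_xml_app]
    # Cap each event at ~100 KB instead of disabling truncation outright (TRUNCATE = 0 means no limit)
    TRUNCATE = 100000
    # Break events only at newlines followed by a date, so multi-line XML stays in one event
    LINE_BREAKER = ([\r\n]+)\d{4}-\d{2}-\d{2}
    SHOULD_LINEMERGE = false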

In the last 24 hours I see 1.1 million warnings about truncation in the internal logs, covering about 75 sourcetypes. Gross.
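If anyone wants to size the same problem in their own environment, this is roughly the search I'm using (a sketch; it assumes the usual splunkd truncation warnings and the field names my version logs, so adjust the component and message text to match yours):

    index=_internal sourcetype=splunkd log_level=WARN component=LineBreakingProcessor "Truncating line"
    | stats count by data_sourcetype
    | sort - count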


sloshburch
Ultra Champion

This one is super hard given it's very specific to your data owners' needs for that data. There might be other ways for them to solve the problem. It's certainly odd to expect a production system to output that much data in each event.

Perhaps getting them more involved, so they understand the implications of their needs, would make them more emotionally invested in the challenge you face.

Alternatively, what about breaking those large events into a bunch of sub-events (just separate events in Splunk) and giving them knowledge objects to correlate and reattach them at search time as needed?
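As a rough sketch of what I mean, assuming the devs can stamp each chunk with a shared ID (the txn_id field, index, and sourcetype here are all made up for illustration):

    index=app_big_events sourcetype=my_xml_chunks
    | stats list(_raw) as chunks, min(_time) as _time by txn_id
    | eval reassembled = mvjoin(chunks, "")

Note that list() keeps at most 100 values per group, so really huge payloads would need a different reassembly approach.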


woodcock
Esteemed Legend

I don't really understand the nature of your concern, but you are really looking for trouble if you specify TRUNCATE=0. It should never be done that way.


DalJeanis
Legend

1) Just to be sure I understand you... how, precisely, were you "forced to comply"?

2) Have you calculated what the storage and logging cost is, and made that known to your management?

3) What is the actual ability of your system to process and store log events? If you are getting a million truncation events per day, then you are losing data. What is the effect of losing that data?

4) Just for fun, consider the possibility of a policy rerouting oversize events for each sourcetype to a separate index. They can have 1 meg if they want, but it will go away ten times faster. Properly compact events get priority.
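Roughly, that routing could look like this on the indexers (a sketch only: the sourcetype, the oversize index name, its retention, and the ~100 KB threshold are all made up, and the doubled {50000} group is only there because a single regex quantifier can't exceed 65535):

    props.conf:

    [my_xml_app]
    TRANSFORMS-route_oversize = route_oversize

    transforms.conf:

    [route_oversize]
    # Match events of roughly 100 KB or more and send them to the short-retention index
    REGEX = (?s)^(?:.{50000}){2}
    DEST_KEY = _MetaData:Index
    FORMAT = oversize

    indexes.conf:

    [oversize]
    homePath   = $SPLUNK_DB/oversize/db
    coldPath   = $SPLUNK_DB/oversize/colddb
    thawedPath = $SPLUNK_DB/oversize/thaweddb
    # Age these events out much sooner than the main index (7 days here)
    frozenTimePeriodInSecs = 604800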
