Getting Data In

Questions for large install admins: What's your best practice for log length (TRUNCATE) maximums? How much do you force devs to limit their event size?

twinspop
Influencer

My current Splunk install handles logs for about 80 different development groups, each with its own idea of what logs should be. Currently at 700+ sourcetypes, and growing every day.

Some of the devs get really carried away with what they're logging, IMHO. I used to have a limit of 10 KB. Who would ever need log events with more than 10 KB of data? "That's insane!" I used to say. Ah, the good old days. Then they wanted 50 KB, and I grudgingly complied. Then they wanted 100 KB, and again I was forced to comply. It's still not enough. I now have log events of 1 MB or larger. Giant XML dumps. I dunno if I'd call them logs. More like entire databases. And of course they "need" them. Gotta have it. No matter the limit, they will find ways to push beyond it.

Do you just run with TRUNCATE=0 (and a well-defined LINE_BREAKER)? Do you try to educate devs to not be braindead and/or lazy? I've been trying the latter for years, and it feels like I'm just spinning my wheels.

In the last 24 hours I see 1.1 million truncation warnings in the internal logs, spread across about 75 sourcetypes. Gross.
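To see which sourcetypes account for most of those warnings, the counting can be sketched offline in Python. This is a minimal sketch, not a Splunk tool: the regular expression assumes the truncation warning in splunkd.log contains the phrase "Truncating line because limit of N bytes has been exceeded" followed by a data_sourcetype="..." field, and the sample lines (hosts, paths, sourcetype names, timestamps) are made up for illustration. Check the pattern against your own internal logs before relying on it.

```python
import re
from collections import Counter

# ASSUMPTION: shape of a splunkd.log truncation warning; verify against real logs.
TRUNC_RE = re.compile(
    r'Truncating line because limit of (?P<limit>\d+) bytes has been exceeded'
    r'.*?data_sourcetype="(?P<sourcetype>[^"]+)"'
)


def count_truncations(lines):
    """Return a Counter of truncation warnings keyed by sourcetype."""
    counts = Counter()
    for line in lines:
        match = TRUNC_RE.search(line)
        if match:
            counts[match.group("sourcetype")] += 1
    return counts


# Hypothetical sample lines, invented for this example.
sample = [
    '09-12-2017 10:00:00.000 -0400 WARN  LineBreakingProcessor - Truncating line '
    'because limit of 10000 bytes has been exceeded with a line length >= 10240 - '
    'data_source="/var/log/app_a.log", data_host="host1", data_sourcetype="app_a_xml"',
    '09-12-2017 10:00:01.000 -0400 WARN  LineBreakingProcessor - Truncating line '
    'because limit of 10000 bytes has been exceeded with a line length >= 52000 - '
    'data_source="/var/log/app_a.log", data_host="host1", data_sourcetype="app_a_xml"',
    '09-12-2017 10:00:02.000 -0400 WARN  LineBreakingProcessor - Truncating line '
    'because limit of 100000 bytes has been exceeded with a line length >= 1048576 - '
    'data_source="/var/log/app_b.log", data_host="host2", data_sourcetype="app_b_dump"',
    '09-12-2017 10:00:03.000 -0400 INFO  Metrics - group=pipeline, name=parsing',
]

truncation_counts = count_truncations(sample)
for sourcetype, count in truncation_counts.most_common():
    print(f"{sourcetype}: {count}")
```

Sorting by count makes it easier to start the conversation with the handful of groups responsible for most of the truncation, instead of all 75 sourcetypes at once.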


sloshburch
Splunk Employee

This one is super hard, given that it's very specific to what your data owners need from that data. There might be other ways for them to solve the problem. It's certainly odd to expect a production system to output that much data in each pop.

Perhaps getting them more involved, so that they understand the implications of their needs, would make them more emotionally invested in the challenge you face.

Alternatively, what about breaking those large events into a ton of sub events (just events in Splunk) and giving them knowledge objects to correlate and reattach them all at search time as needed?
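The sub-event idea can be sketched on the producer side as follows. This is a minimal sketch under stated assumptions: the field names event_id, part, and parts, the 10,000-character default, and both function names are hypothetical, and the split is by characters rather than bytes, so multi-byte text would need a byte-aware variant. Each piece carries a shared ID and a sequence number so the pieces can be grouped and reordered later.

```python
import uuid


def split_event(payload, max_chars=10_000, event_id=None):
    """Split one oversized payload into ordered sub-events sharing an ID.

    Field names (event_id, part, parts) are hypothetical, chosen for this sketch.
    """
    if max_chars <= 0:
        raise ValueError("max_chars must be positive")
    event_id = event_id or uuid.uuid4().hex
    # Slice the payload into fixed-size pieces; an empty payload yields one empty piece.
    chunks = [payload[i:i + max_chars] for i in range(0, len(payload), max_chars)] or [""]
    total = len(chunks)
    return [
        {"event_id": event_id, "part": number, "parts": total, "data": chunk}
        for number, chunk in enumerate(chunks, start=1)
    ]


def reassemble(sub_events):
    """Rebuild the original payload from sub-events that share one event_id."""
    ordered = sorted(sub_events, key=lambda event: event["part"])
    return "".join(event["data"] for event in ordered)


big_payload = "abcdefghij" * 2 + "12345"
pieces = split_event(big_payload, max_chars=10, event_id="demo")
restored = reassemble(list(reversed(pieces)))
```

Splitting at arbitrary offsets cuts XML elements in half, so field extraction on an individual piece may not work; splitting on element boundaries would be friendlier to search, at the cost of a more involved splitter.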


woodcock
Esteemed Legend

I don't really understand the nature of your concern but you are really looking for trouble if you specify TRUNCATE=0. It should never be done that way.


DalJeanis
Legend

1) Just to be sure I understand you... how, precisely, were you "forced to comply"?

2) Have you calculated what the storage and logging cost is, and made that known to your management?

3) What is the actual ability of your system to process and store log events? If you are getting a million truncation events per day, then you are losing data. What is the effect of losing that data?

4) Just for fun, consider the possibility of a policy rerouting oversize events for each sourcetype to a separate index. They can have 1 meg if they want, but it will go away ten times faster. Properly compact events get priority.
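The policy in point 4 can be sketched as plain decision logic. This is an illustration only, not how Splunk routing is configured: the "_oversize" index suffix, the 10,000-byte threshold, the 90-day baseline retention, and both function names are assumptions made up for the example. Only the "ten times faster" factor comes from the suggestion above.

```python
OVERSIZE_SUFFIX = "_oversize"  # hypothetical naming convention for the separate index


def choose_index(raw_event, base_index, limit_bytes=10_000):
    """Pick the destination index for an event based on its encoded size."""
    size = len(raw_event.encode("utf-8"))
    if size > limit_bytes:
        return base_index + OVERSIZE_SUFFIX
    return base_index


def retention_days(index_name, normal_days=90, oversize_factor=10):
    """Oversize indexes expire oversize_factor times faster than normal ones."""
    if index_name.endswith(OVERSIZE_SUFFIX):
        return max(1, normal_days // oversize_factor)
    return normal_days


small_target = choose_index("x" * 10, "app", limit_bytes=100)
large_target = choose_index("x" * 101, "app", limit_bytes=100)
print(small_target, retention_days(small_target))
print(large_target, retention_days(large_target))
```

Measuring the encoded byte length rather than the character count keeps the threshold comparable to a byte-based limit.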
