Writing Splunk Logs for Faster Queries, with Category Boolean true

MarkSmith47
New Member

We are writing log statements in Java and then reviewing the info and exception alerts.

Our team then runs a Splunk search that counts log statements by category.

Many of our log statements can share multiple categories. We are using this reference for key-value pairs: https://dev.splunk.com/enterprise/docs/developapps/addsupport/logging/loggingbestpractices/

So in our log statements, we are doing:

LOG.info("CategoryA=true, CategoryG=true");

Of course, we aren't going to write "Category=false" in any logger, since it's implied by the flag's absence from the statement.

Is this a good overall method to count values in Splunk by category, or do you recommend a better practice?

 

 


bowesmana
SplunkTrust

I would take issue with some of the statements presented as "best practice" for logging standards. We often find that developer-friendly formats such as JSON cause large ingestion volumes compared to the value of the data contained in the JSON. The ratio of field names to usable field values can typically be 50%, and developer logging frameworks will often just dump out JSON objects with empty field values, which is a real cost.

I often see clients hitting their ingestion licence limits and then having to push back on developers who have built dashboards on their data, asking them to shrink that data.

Anyway, as to your question: if you want to count how many events have CategoryA true and how many have it false, and false is never written, you can only derive the false count as total count minus true count, on the assumption that every event without the flag is implicitly false. You therefore need to know the data to be able to write those searches.
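To make that arithmetic concrete, here is a minimal Java sketch of the same reasoning outside Splunk. The class and method names (CategoryCounts, countTrue, inferFalse) are hypothetical, and the sample events are made up for illustration; it only shows that the false count has to be inferred, not read from the data.

```java
import java.util.List;

// Hypothetical helper, for illustration only: mirrors "false = total - true".
final class CategoryCounts {

    /** Counts events containing the token "key=true" (naive substring match). */
    static long countTrue(List<String> events, String key) {
        String token = key + "=true";
        return events.stream().filter(e -> e.contains(token)).count();
    }

    /** Infers the false count, assuming every event without the flag is implicitly false. */
    static long inferFalse(List<String> events, String key) {
        return events.size() - countTrue(events, key);
    }

    public static void main(String[] args) {
        // Made-up sample events in the key=value style from the question.
        List<String> events = List.of(
                "CategoryA=true, CategoryG=true",
                "CategoryG=true",
                "CategoryA=true");
        System.out.println("CategoryA true:  " + countTrue(events, "CategoryA"));
        System.out.println("CategoryA false: " + inferFalse(events, "CategoryA"));
    }
}
```

The inferred number is only right if every event in the total could have carried the flag, which is exactly why you need to know the data.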

It's fine to have things like cat_a=true or categorya=1. However, if you have 100 million events per day, then use =1, not =true, so you save 300MB/day in ingestion cost (3 bytes per flag) 😄 Also, mapping "true" to something you can count is more expensive than this simple wildcard logic:

| stats sum(cat_*) as cat_*

if you have predictable naming conventions.
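On the Java side, a minimal sketch of emitting flags in that shape might look like the following. CategoryFlags, format and bytesSavedPerDay are hypothetical names, and the cat_ prefix is just the naming convention assumed above, not anything Splunk requires. The second method reproduces the 300MB/day estimate for one flag per event.

```java
import java.util.List;
import java.util.Locale;
import java.util.stream.Collectors;

// Hypothetical helper, for illustration only.
final class CategoryFlags {

    /** Renders only the categories that apply, as "cat_<name>=1" pairs. */
    static String format(List<String> categories) {
        return categories.stream()
                .map(c -> "cat_" + c.toLowerCase(Locale.ROOT) + "=1")
                .collect(Collectors.joining(", "));
    }

    /** Bytes saved per day by writing "1" instead of "true" for each flag. */
    static long bytesSavedPerDay(long eventsPerDay, int flagsPerEvent) {
        int savedPerFlag = "true".length() - "1".length(); // 3 single-byte characters
        return eventsPerDay * flagsPerEvent * savedPerFlag;
    }

    public static void main(String[] args) {
        System.out.println(format(List.of("A", "G")));
        System.out.println(bytesSavedPerDay(100_000_000L, 1) + " bytes/day");
    }
}
```

You would pass the result of format(...) to your existing logger call, so categories that do not apply are simply never written.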

Please also do not write full Java class names in the logs, e.g. org.apache.catalina.bla.bla.bla, as this has no value and just costs licence ingest. Most logging frameworks can abbreviate package names to a single character, and there is rarely ambiguity in class names.
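In practice you would turn this on in the framework's pattern layout rather than write it yourself (in Log4j 2, for example, I believe the %c{1.} conversion pattern does it). Purely to show what the abbreviation produces, here is a hand-rolled sketch; LoggerNames and abbreviate are hypothetical names, not part of any logging framework.

```java
// Hypothetical helper, for illustration only: shortens each package
// segment to its first character and keeps the simple class name.
final class LoggerNames {

    static String abbreviate(String fullyQualifiedName) {
        String[] parts = fullyQualifiedName.split("\\.");
        StringBuilder sb = new StringBuilder();
        // Every segment except the last is a package name.
        for (int i = 0; i < parts.length - 1; i++) {
            if (!parts[i].isEmpty()) {
                sb.append(parts[i].charAt(0)).append('.');
            }
        }
        return sb.append(parts[parts.length - 1]).toString();
    }

    public static void main(String[] args) {
        String full = "org.apache.catalina.core.StandardContext";
        String shortName = abbreviate(full);
        System.out.println(shortName);
        System.out.println((full.length() - shortName.length()) + " characters saved per event");
    }
}
```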
