Splunk Search

Writing Splunk Logs for Faster Queries, with Category Boolean true

MarkSmith47
New Member

We write log statements in Java and then review the info and exception alerts.

Our team then runs a Splunk search to count log statements by category.

Many of our log statements can share multiple categories. We are following this reference for key-value pairs: https://dev.splunk.com/enterprise/docs/developapps/addsupport/logging/loggingbestpractices/

So in our log statements, we are doing:

LOG.info("CategoryA=true ,  CategoryG=true");

Of course, we aren't going to write "Category=false" in any logger, since it's inherent in the statement.

Is this a good overall method for counting values in Splunk by category, or would you recommend a better practice?


bowesmana
SplunkTrust

I would take issue with some of the statements in that reference being "best practice" for logging standards. We often find that developer-friendly formats such as JSON cause large ingestion volumes compared to the value of the data contained in the JSON. The ratio of field names to usable field values can typically be 50%, and developer logging frameworks will often just dump out JSON objects with empty field values, which is a real cost.

I often see clients hitting their ingestion licence limits and then having to push back on developers who have built dashboards on their data, asking them to shrink it.

Anyway, as to your question: if you want to count how many events have CategoryA true and how many false, and false is never written, you can only infer the false count as the total count minus the true count, on the assumption that every event without the field is implicitly false. You therefore need to know the data to be able to write those searches.
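
The arithmetic behind that inference can be sketched in a few lines of Java. This is only an illustration over made-up event strings, not a Splunk search: the `ImplicitFalseCount` class, its method names, and the sample events are all assumptions introduced here.

```java
import java.util.List;

class ImplicitFalseCount {
    // Counts events that explicitly carry "<category>=true".
    static long countTrue(List<String> events, String category) {
        String marker = category + "=true";
        return events.stream().filter(e -> e.contains(marker)).count();
    }

    // False is never logged, so it can only be inferred as total minus true.
    static long inferFalse(List<String> events, String category) {
        return events.size() - countTrue(events, category);
    }

    public static void main(String[] args) {
        // Hypothetical sample events, for illustration only.
        List<String> events = List.of(
            "CategoryA=true, CategoryG=true",
            "CategoryG=true",
            "CategoryA=true",
            "an event with no categories");
        System.out.println("CategoryA true=" + countTrue(events, "CategoryA")
            + " inferred false=" + inferFalse(events, "CategoryA"));
    }
}
```

The inferred number is only meaningful if every event in the total really is one that could have carried the category, which is why you need to know the data.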

It's fine to have things like cat_a=true or categorya=1. However, if you have 100 million events per day, use =1 rather than =true: that is 3 bytes less per field, so you save about 300MB/day of ingestion cost 😄 Also, mapping "true" to something you can count is more expensive than the simple wildcard logic of

| stats sum(cat_*) as cat_*

if you have predictable naming conventions.
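
On the Java side, one way to keep the naming predictable is to build the key-value string in a single place. The sketch below is an assumption about how that might look: the `CategoryFields` class, the `cat_` prefix handling, and the space separator are illustrative choices, not something from the original post. It also reproduces the 300MB/day figure as plain arithmetic, assuming one such field per event.

```java
import java.util.Locale;
import java.util.Set;
import java.util.TreeSet;
import java.util.stream.Collectors;

class CategoryFields {
    // Builds e.g. "cat_a=1 cat_g=1"; only categories that apply are written.
    static String format(Set<String> categories) {
        return new TreeSet<>(categories).stream()   // sorted for stable output
            .map(c -> "cat_" + c.toLowerCase(Locale.ROOT) + "=1")
            .collect(Collectors.joining(" "));
    }

    // Bytes saved per day by writing "=1" instead of "=true" (3 bytes per field).
    static long bytesSavedPerDay(long eventsPerDay, int fieldsPerEvent) {
        return eventsPerDay * fieldsPerEvent * ("true".length() - "1".length());
    }

    public static void main(String[] args) {
        System.out.println(format(Set.of("A", "G")));          // cat_a=1 cat_g=1
        System.out.println(bytesSavedPerDay(100_000_000L, 1)); // 300000000
    }
}
```

A call such as LOG.info(CategoryFields.format(Set.of("A", "G"))) would then emit numeric fields with a common prefix, which is the shape the wildcard stats search relies on.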

Please also do not write full Java class names in the logs, e.g. org.apache.catalina.bla.bla.bla, as this has no value and just costs licence ingest. Most logging frameworks can abbreviate package names to a single character, and there is rarely ambiguity in class names.
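
To show what that abbreviation looks like, here is a small hypothetical helper that mimics it. In practice you would not write this yourself but set it in the framework's pattern layout (in Log4j 2, for example, the `%c{1.}` conversion pattern shortens each package segment to one character). The `LoggerNames` class and the sample class name are illustrative assumptions.

```java
class LoggerNames {
    // Shortens every package segment to its first character and keeps the class name.
    static String abbreviate(String fullyQualifiedName) {
        String[] parts = fullyQualifiedName.split("\\.");
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < parts.length - 1; i++) {
            if (!parts[i].isEmpty()) {
                sb.append(parts[i].charAt(0)).append('.');
            }
        }
        return sb.append(parts[parts.length - 1]).toString();
    }

    public static void main(String[] args) {
        // Illustrative class name; prints o.a.c.c.StandardContext
        System.out.println(abbreviate("org.apache.catalina.core.StandardContext"));
    }
}
```

In this example the abbreviated form is 23 characters against 40 for the full name, a saving that repeats on every event that logs it.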
