Splunk Search

LOG4J CIM?

mitag
Contributor

A number of applications and services in our environment use LOG4J for logging. Is there a CIM (Common Information Model) for LOG4J log types - or perhaps just a set of accepted / standardized field names? (The idea is to set up field extraction correctly the first time so we don't have to redo it in the future.)

In other words, what I need first and foremost are standardized field names for this log type - plus anything else that needs to be done to get a clean and performant field extraction that will last a while without major revisions that might wreak havoc on existing dashboards, reports and searches.

Event examples:

2020-04-13 15:20:53,379 ERROR [com.somejavaapp.exec.Server] (pool-1-thread-1) - Caught exception producing output
java.net.SocketException: Connection reset by peer: socket write error
    at java.net.SocketOutputStream.socketWrite0(Native Method)
    at java.net.SocketOutputStream.socketWrite(Unknown Source)
    at java.net.SocketOutputStream.write(Unknown Source)

...

2020-04-13 15:20:53,379 ERROR [com.somejavaapp.exec.Server] (Thread-149821) - Exception while sending progress data
java.net.SocketException: Connection reset by peer: socket write error
    at java.net.SocketOutputStream.socketWrite0(Native Method)
    at java.net.SocketOutputStream.socketWrite(Unknown Source)
    at java.net.SocketOutputStream.write(Unknown Source)

... etc... So perhaps the field names should be as follows?

_time                           2020-04-13 15:20:53,379
severity? log_level?            ERROR
java_class?                     [com.somejavaapp.exec.Server]
java_class_package?             com.somejavaapp.exec
java_class_package_namespace?   com.somejavaapp

thread?                         (pool-1-thread-1)
                                (Thread-149821)

message?      Caught exception producing output
exception?    java.net.SocketException: Connection reset by peer: socket write error

java_traces?
            at java.net.SocketOutputStream.socketWrite0(Native Method)
            at java.net.SocketOutputStream.socketWrite(Unknown Source)
            at java.net.SocketOutputStream.write(Unknown Source)

Thanks!

PavelP
Motivator

Hello @mitag,

I think you mean a CIM parser for log4j logs, not a CIM, because the CIM is organized by domain of interest (like Changes, Authentication, etc.) and not by logging method (syslog, log4j, SQL, etc.).

A general parser configuration for log4j logs could look like this:

props.conf:

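[your_log_sourcetype]
# Break events with LINE_BREAKER only, without a separate line-merging pass
SHOULD_LINEMERGE        = false
# The timestamp sits at the very start of each event, e.g. 2020-04-13 15:20:53,379
MAX_TIMESTAMP_LOOKAHEAD = 30
TIME_FORMAT             = %Y-%m-%d %H:%M:%S,%3N
TIME_PREFIX             = ^
# A new event starts only where a line begins with a timestamp, so stack trace lines stay with their event
LINE_BREAKER            = ([\r\n]+)\d\d\d\d-\d\d-\d\d \d\d:\d\d:\d\d,\d\d\d
# Same boundary for universal forwarders (these two settings are read on the forwarder)
EVENT_BREAKER_ENABLE    = true
EVENT_BREAKER           = ([\r\n]+)\d\d\d\d-\d\d-\d\d \d\d:\d\d:\d\d,\d\d\d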
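
For the field names themselves, search-time extractions could go into the same stanza. The following is only a minimal sketch: the names log_level, java_class, thread, message and exception are taken from the candidates in the question and are assumptions, not CIM fields, and the regexes were written against the two sample events only.

```ini
[your_log_sourcetype]
# Header line: timestamp, level, [class], (thread), " - ", message (field names are assumptions)
EXTRACT-log4j_header = ^\d{4}-\d\d-\d\d \d\d:\d\d:\d\d,\d{3}\s+(?<log_level>[A-Z]+)\s+\[(?<java_class>[^\]]+)\]\s+\((?<thread>[^)]+)\)\s+-\s+(?<message>[^\r\n]*)
# Second line of the event, only when it starts with a class name ending in Exception or Error
EXTRACT-log4j_exception = ^[^\r\n]*[\r\n]+(?<exception>[\w.$]+(?:Exception|Error)\b[^\r\n]*)
```

Because these are search-time extractions, changing a regex or a field name later does not require reindexing the data.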

Then you have to identify a particular CIM data model to map to. In your case the events contain some network exceptions - there is no matching CIM data model for this. Check these links for more information:

https://docs.splunk.com/Documentation/CIM/4.15.0/User/Overview

https://docs.splunk.com/Documentation/CIM/4.15.0/User/Howtousethesereferencetables

Good luck!

mitag
Contributor

Thanks Pavel!

This part of CIM is what I am looking for:

The CIM helps you to normalize your data to match a common standard, using the same field names and event tags for equivalent events from different sources or vendors.

I.e. field names that are in line with common standards - or at least with what others do with log4j events.

So yes, it is a CIM that I am looking for and not a CIM parser - even if field names are just a small part of what a CIM is.

(I probably didn't write my question very clearly - revised it - hope it's clearer now.)

Thanks for looking into it!

PavelP
Motivator

Hi @mitag,

Then maybe this link is what you need - a list of standardized field names. It is a recent (just a week ago!) addition to the CIM documentation: an overview of all field names per associated data model:

https://docs.splunk.com/Documentation/CIM/4.15.0/User/CIMfields

I hope it is what you need!

Let me know how it went.

mitag
Contributor

Many thanks - but I'm not finding much there... E.g. I'm not seeing anything relevant to threads, Java classes, namespaces, Java traces...

Appreciate your looking into it.

PavelP
Motivator

That is what I meant from the beginning - there is currently no matching CIM data model for your log examples.
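
Even without a matching data model, the package-level fields proposed in the question could be derived at search time as calculated fields. This is a minimal sketch, assuming a java_class field (a name from the question, not a CIM field) is already extracted for this sourcetype, and assuming the "namespace" is simply the first two dot-separated segments of the class name:

```ini
[your_log_sourcetype]
# com.somejavaapp.exec.Server -> com.somejavaapp.exec (drop the last segment)
EVAL-java_class_package = replace(java_class, "\.[^.]+$", "")
# com.somejavaapp.exec.Server -> com.somejavaapp (keep the first two segments)
EVAL-java_class_package_namespace = replace(java_class, "^([^.]+\.[^.]+)\..*$", "\1")
```

Both expressions read java_class directly, because one calculated field cannot reference another. A class name with fewer than three segments does not match the second pattern and is returned unchanged.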

mitag
Contributor

Maybe not in the Splunk docs or the CIM repo - but there must be other people logging Java applications using log4j with a similar log structure - and extracting fields. The question wasn't, "does Splunk offer a CIM for log4j?", it was "LOG4J CIM?" with the implication of, "how do I zero in on the best possible approximation of such a CIM?". If you search the interwebs for bits and pieces of the logs I posted, you'll see a bunch of people using similar log structures.

It's those people using Splunk that I am hoping to engage with this question, not those who would say "Splunk can't help you" or "I can't help you".

Thank you for the understanding 🙂

to4kawa
SplunkTrust

mitag
Contributor

Thanks - this isn't it. The logs are already in Splunk - forwarded via SUFs - so I don't need a TCP input to get them into Splunk.

What I need is to extract fields and name them in line with common standards - i.e. using a CIM for log4j if one exists - or using field names that others are using. I probably didn't write my question very clearly - will see if I can revise it.

Thanks for looking into it!
