Getting Data In

Implement selective parsing of events in splunk based on timestamp

Juhi28
New Member

Hi,

We are trying to use selective parsing in Splunk so that only events containing a timestamp are parsed. The context: when we parse the logs from one of our Java applications, the JVM also prints garbage values into the logs, and we don't want those parsed into Splunk. So we want to parse events selectively based on timestamps, i.e. keep only the events that have a timestamp.

markusspitzli
Communicator

When we ingest log files that contain Java output, we configure the line breaker manually. That way the whole Java dump ends up in one event instead of being split up by Splunk's automatic parsing.

In those cases our props.conf looks like this:
[yoursourcetype]
LINE_BREAKER = ([\r\n]+)\d{4}-\d{2}-\d{2}\s\d{2}:\d{2}:\d{2}
SHOULD_LINEMERGE = false
TRUNCATE = 100000
MAX_TIMESTAMP_LOOKAHEAD = 30
NO_BINARY_CHECK = true
TIME_FORMAT = %F %T
TIME_PREFIX = ^

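LINE_BREAKER should contain the timestamp pattern after the ([\r\n]+) capture group, so that a new event starts only where a line break is followed by a timestamp. You probably also want to increase TRUNCATE to at least 100000.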
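
As a rough sketch of how the same idea could be adapted: if the timestamps use a two-digit-year layout such as the `19/01/21 05:32:15` in the sample data posted later in this thread, the stanza might look like the following. Reading that date as year/month/day, and therefore the `%y/%m/%d %H:%M:%S` format string, is an assumption, since the layout is ambiguous without more samples. `[yoursourcetype]` is the same placeholder as above.

```ini
[yoursourcetype]
# Break only where a line break is followed by a "dd/dd/dd hh:mm:ss" timestamp.
# The captured ([\r\n]+) is discarded; the timestamp starts the next event.
LINE_BREAKER = ([\r\n]+)\d{2}/\d{2}/\d{2}\s\d{2}:\d{2}:\d{2}
SHOULD_LINEMERGE = false
TRUNCATE = 100000
TIME_PREFIX = ^
# Assumed to be year/month/day; adjust if the application writes day/month/year.
TIME_FORMAT = %y/%m/%d %H:%M:%S
MAX_TIMESTAMP_LOOKAHEAD = 30
NO_BINARY_CHECK = true
```

With this kind of line breaking, lines that have no timestamp stay attached to the preceding timestamped event rather than becoming events of their own.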

gcusello
SplunkTrust

Hi Juhi28,
could you share an example of your logs?

Anyway, the method for filtering events is described at https://docs.splunk.com/Documentation/Splunk/7.2.3/Forwarding/Routeandfilterdatad

Bye.
Giuseppe

Juhi28
New Member

Here is the sample data. I only want Splunk to parse the data that has a timestamp in it and to exclude the other garbage data.

19/01/21 05:32:15 WARN YarnAllocator: Expected to find pending requests, but found none.
^@^Fstdout^@^A0^@^A^@^@^E^@
^@^GVERSION/-^@+container_e217_1537606163373_5158_01_000001^E^Dnone^A^PÑ,Ñ,^C^Qdata:BCFile.index^DnoneÑ}^K^K^Pdata:TFile.index^DnoneÑB;;^Odata:TFile.meta^DnoneÑ<^F^F^@^@^@^@^@^@^E~H^@^A^@^@Ñ^QÓh~Qµ×¶9ßA@~RºáP
Ñ^QÓh~Qµ×¶9ßA@~RºáP ^@^GVERSION^D^@^@^@^A^Q^@^OAPPLICATION_ACL,^@
MODIFY_APP^@ s_ptheon ^@^HVIEW_APP^@ s_ptheon ^S^@^QAPPLICATION_OWNER
^@^Hs_ptheon-^@+container_e217_1537606163373_5158_01_000004Î^C^@^Fstderr^@^C491SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/opt/mapr/lib/slf4j-log4j12-1.7.12.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/alluxio/alluxio-1.4.0/core/client/target/alluxio-core-client-1.4.0-jar-with-dependencies.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
^@^Fstdout^@^A0^@^A^@^@^D^@
^@^GVERSION/-^@+container_e217_1537606163373_5158_01_000004^D^Dnone^A^PÎ| Î| ^C^Qdata:BCFile.index^DnoneÎñ^K^K^Pdata:TFile.index^Dnoneζ;;^Odata:TFile.meta^Dnoneΰ^F^F^@^@^@^@^@^@^Bü^@^A^@^@Ñ^QÓh~Qµ×¶9ßA@~RºáP

lakshman239
SplunkTrust

If the above is a single event, you could send it to the nullQueue, following the steps in the link Giuseppe posted above. However, if you only want to remove the content that follows 'found none.', you would need to strip it before indexing, with something like the SEDCMD below:
https://docs.splunk.com/Documentation/Splunk/7.2.3/Admin/Propsconf

[yoursourcetype]
SEDCMD-removejvmlogs = s/^@.*//

This requires a restart of the indexer; after that, the matching content in any future messages will be masked/removed.
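
For the other option mentioned above, dropping whole events that have no timestamp, the pattern in the route-and-filter documentation is to send everything to the nullQueue and then route the matching events back to the indexQueue. A minimal sketch, assuming each line is its own event (hence `SHOULD_LINEMERGE = false`) and that the wanted events start with a timestamp like `19/01/21 05:32:15`. The stanza names `setnull` and `setparsing` and the class name `TRANSFORMS-set` are only labels chosen here for illustration:

```ini
# props.conf (on the indexer or heavy forwarder)
[yoursourcetype]
# Assumption: treat every line as a separate event so that garbage
# lines are not merged into the preceding timestamped event.
SHOULD_LINEMERGE = false
# Transforms run in the order listed: discard everything, then rescue matches.
TRANSFORMS-set = setnull, setparsing

# transforms.conf
[setnull]
# Matches every event and routes it to the nullQueue (discarded).
REGEX = .
DEST_KEY = queue
FORMAT = nullQueue

[setparsing]
# Events that begin with a dd/dd/dd hh:mm:ss timestamp go back to the indexQueue.
REGEX = ^\d{2}/\d{2}/\d{2}\s\d{2}:\d{2}:\d{2}
DEST_KEY = queue
FORMAT = indexQueue
```

One caveat on the SEDCMD above: in a regular expression, `^@` means a literal `@` at the start of the event. If the `^@` in the sample is an editor's caret notation for a NUL byte, that pattern would need adjusting before it matches.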

Juhi28
New Member

Can this be done at the forwarder level instead of at indexing time? I don't want data without a timestamp to reach the Splunk system at all.

lakshman239
SplunkTrust

Yes, if it's a heavy forwarder. If you only have a UF and an indexer, you can add the rule on the indexer and it will filter the data out before indexing. The filtered data will therefore not consume your license or storage, and it will not be available in searches.

Juhi28
New Member

Thanks, lakshman. What rule will I have to add in inputs.conf?

lakshman239
SplunkTrust

As this data is already coming to the indexer, adding the SEDCMD-* setting to props.conf on the indexer or heavy forwarder will do. There is no need to change inputs.conf.

lakshman239
SplunkTrust

It may be that the JVM is throwing exceptions and printing unwanted lines. You may want to double-check the line breaking and parse the events correctly: some lines could be part of a multi-line event, which would give you the impression that there are events without a timestamp. Each event has to have a timestamp.
If you share examples, that would help.

Juhi28
New Member

Yes, the scenario is similar to the JVM throwing exceptions. I shared sample data with Giuseppe in the thread above.
