Implement selective parsing of events in splunk based on timestamp

Juhi28 · ‎01-13-2019

Hi,

We are trying to set up selective parsing in Splunk so that only events containing a timestamp are parsed. The context: one of our Java applications writes logs in which the JVM also prints garbage values, and we don't want those to be parsed by Splunk. So we want to parse events selectively based on timestamps, i.e. only events that have a timestamp.

markusspitzli · ‎02-18-2019

When we ingest log files that contain Java output, we configure the line breaker manually. This way the whole Java dump ends up in one event instead of being split up by Splunk's automatic parsing.

In those cases our props.conf looks like this:
[yoursourcetype]
LINE_BREAKER = ([\r\n]+)\d{4}-\d{2}-\d{2}\s\d{2}:\d{2}:\d{2}
SHOULD_LINEMERGE = false
TRUNCATE = 100000
MAX_TIMESTAMP_LOOKAHEAD = 30
NO_BINARY_CHECK = true
TIME_FORMAT = %F %T
TIME_PREFIX = ^

LINE_BREAKER should contain the timestamp pattern after the ([\r\n]+), and you probably want to increase TRUNCATE to at least 100000.
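For the sample posted in this thread, whose lines start with a two-digit-year timestamp such as 19/01/21 05:32:15, the same approach might look like the sketch below. It assumes the format is yy/mm/dd HH:MM:SS and reuses the placeholder stanza name yoursourcetype; neither is confirmed by the original poster, so check both against the real data.

```conf
# props.conf (sketch; assumes timestamps like 19/01/21 05:32:15 = yy/mm/dd HH:MM:SS)
[yoursourcetype]
# Break only before a line that starts with the assumed timestamp pattern
LINE_BREAKER = ([\r\n]+)\d{2}/\d{2}/\d{2}\s\d{2}:\d{2}:\d{2}
SHOULD_LINEMERGE = false
TRUNCATE = 100000
TIME_PREFIX = ^
TIME_FORMAT = %y/%m/%d %H:%M:%S
MAX_TIMESTAMP_LOOKAHEAD = 30
NO_BINARY_CHECK = true
```

With this line breaker, lines without a timestamp are appended to the preceding timestamped event rather than dropped, so it keeps multi-line output together but does not filter anything by itself.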

gcusello · ‎01-14-2019

Hi Juhi28,
could you share an example of your logs?

Anyway, the method to filter events is described at https://docs.splunk.com/Documentation/Splunk/7.2.3/Forwarding/Routeandfilterdatad
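A minimal sketch of the "keep specific events and discard the rest" pattern from that page, adapted to this question: every event is first routed to nullQueue, then events that start with a timestamp are routed back to indexQueue. The stanza names setnull and setparsing, the sourcetype name yoursourcetype, and the timestamp regex (yy/mm/dd HH:MM:SS, as in 19/01/21 05:32:15) are assumptions for illustration. It only behaves as intended if each line is its own event.

```conf
# props.conf (sketch; sourcetype name is a placeholder)
[yoursourcetype]
# One event per line, so lines without a timestamp can be dropped individually
SHOULD_LINEMERGE = false
# Order matters: discard everything first, then keep the matches
TRANSFORMS-set = setnull, setparsing

# transforms.conf
[setnull]
# Matches every event and sends it to the null queue
REGEX = .
DEST_KEY = queue
FORMAT = nullQueue

[setparsing]
# Assumed timestamp pattern at the start of the event; matching events are indexed
REGEX = ^\d{2}/\d{2}/\d{2}\s\d{2}:\d{2}:\d{2}
DEST_KEY = queue
FORMAT = indexQueue
```

These settings take effect where the data is parsed, that is on a heavy forwarder or an indexer, not on a universal forwarder.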

Bye.
Giuseppe

Juhi28 · ‎01-21-2019

Here is the sample data. I only want Splunk to parse the lines that contain a timestamp and to exclude the other garbage data.

19/01/21 05:32:15 WARN YarnAllocator: Expected to find pending requests, but found none.
^@^Fstdout^@^A0^@^A^@^@^E^@
^@^GVERSION/-^@+container_e217_1537606163373_5158_01_000001^E^Dnone^A^PÑ,Ñ,^C^Qdata:BCFile.index^DnoneÑ}^K^K^Pdata:TFile.index^DnoneÑB;;^Odata:TFile.meta^DnoneÑ<^F^F^@^@^@^@^@^@^E~H^@^A^@^@Ñ^QÓh~Qµ×¶9ßA@~RºáP
Ñ^QÓh~Qµ×¶9ßA@~RºáP ^@^GVERSION^D^@^@^@^A^Q^@^OAPPLICATION_ACL,^@
MODIFY_APP^@ s_ptheon ^@^HVIEW_APP^@ s_ptheon ^S^@^QAPPLICATION_OWNER
^@^Hs_ptheon-^@+container_e217_1537606163373_5158_01_000004Î^C^@^Fstderr^@^C491SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/opt/mapr/lib/slf4j-log4j12-1.7.12.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/alluxio/alluxio-1.4.0/core/client/target/alluxio-core-client-1.4.0-jar-with-dependencies.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
^@^Fstdout^@^A0^@^A^@^@^D^@
^@^GVERSION/-^@+container_e217_1537606163373_5158_01_000004^D^Dnone^A^PÎ| Î| ^C^Qdata:BCFile.index^DnoneÎñ^K^K^Pdata:TFile.index^DnoneÎ¶;;^Odata:TFile.meta^DnoneÎ°^F^F^@^@^@^@^@^@^Bü^@^A^@^@Ñ^QÓh~Qµ×¶9ßA@~RºáP

lakshman239 · ‎01-22-2019

If the above is a single event, you could send it to nullQueue by following the steps in the link Giuseppe posted above. However, if you only want to remove the content after 'found none.', you need to strip it before indexing, with something like the following:
https://docs.splunk.com/Documentation/Splunk/7.2.3/Admin/Propsconf

[yoursourcetype]
SEDCMD-removejvmlogs = s/^@.*//

This will need a restart of the indexer and any future messages will be masked/removed.
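One caveat with the pattern above: in a sed-style regex an unescaped ^ anchors the start of the event, so ^@ matches a literal @ at that position rather than the ^@ sequences shown in the sample. If those sequences are caret notation for NUL bytes (an assumption; the thread does not confirm the raw bytes), a variant along these lines may be closer to the intent. The class name is reused from the post, and (?s) lets . span newlines so that everything from the first NUL to the end of the event is removed.

```conf
# props.conf (sketch; assumes "^@" in the sample stands for a NUL byte, 0x00)
[yoursourcetype]
# Delete from the first NUL byte to the end of the event
SEDCMD-removejvmlogs = s/(?s)\x00.*//
```

If the file instead contains a literal caret followed by @, escaping the caret (\^@) would be the equivalent change. Either way, test the expression against a copy of the real data before relying on it.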

Juhi28 · ‎01-22-2019

Can this be done at the forwarder level instead of at indexing time? I don't want data without a timestamp to reach the Splunk system at all.

lakshman239 · ‎01-22-2019

Yes, if it's a heavy forwarder. If you only have a universal forwarder (UF) and an indexer, you can add the rule on the indexer, and it will filter the data out before indexing. The removed content will then not consume your license or storage and will not be available in searches.

Juhi28 · ‎01-22-2019

Thanks lakshman. What rule will I have to add in inputs.conf?

lakshman239 · ‎01-22-2019

Since the data is already reaching the indexer, adding the SEDCMD-* setting to props.conf on the indexer or heavy forwarder is enough. There is no need to change inputs.conf.

lakshman239 · ‎01-15-2019

The JVM may be throwing exceptions and printing unwanted lines. Double-check the line breaking: lines that belong to a multi-line event can give the impression that some events have no timestamp. Once the events are parsed correctly, each one should have a timestamp.
If you share examples, that would help.

Juhi28 · ‎01-21-2019

Yes, the scenario is similar to the JVM throwing exceptions. I shared sample data with Giuseppe in the thread above.
