Implement selective parsing of events in Splunk based on timestamp

Hi,
We are trying to use selective parsing in Splunk to parse only those events that contain a timestamp. The context: the JVM of one of our Java applications is printing garbage values into the logs, and we don't want that data parsed into the Splunk system. So we want to parse events selectively based on timestamps, i.e., only events that have timestamps.

When we ingest log files that contain Java stack traces, we configure the line breaker manually. This way the whole Java dump ends up in one event instead of being split up by Splunk's automatic line breaking.
In those cases our props.conf looks like this:
[yoursourcetype]
LINE_BREAKER = ([\r\n]+)\d{4}-\d{2}-\d{2}\s\d{2}:\d{2}:\d{2}
SHOULD_LINEMERGE = false
TRUNCATE = 100000
MAX_TIMESTAMP_LOOKAHEAD = 30
NO_BINARY_CHECK = true
TIME_FORMAT=%F %T
TIME_PREFIX=^
LINE_BREAKER should contain the timestamp pattern after the ([\r\n]+), and you probably want to increase TRUNCATE to at least 100000.

Hi Juhi28,
could you share an example of your logs?
In any case, the method for filtering events is described at https://docs.splunk.com/Documentation/Splunk/7.2.3/Forwarding/Routeandfilterdatad
Bye.
Giuseppe
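The route-and-filter approach described on that documentation page can be sketched like this (the sourcetype and transform names below are placeholders, not taken from the thread, and the REGEX must be adapted to whatever identifies the unwanted events):

```
# props.conf (sketch -- sourcetype name assumed)
[yoursourcetype]
TRANSFORMS-setnull = setnull

# transforms.conf
[setnull]
# any event matching this regex is routed to nullQueue and never indexed;
# here the assumption is that garbage lines start with the characters "^@"
REGEX = ^\^@
DEST_KEY = queue
FORMAT = nullQueue
```

These settings only take effect where parsing happens (a heavy forwarder or an indexer) and require a restart of that instance.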

Here is the sample data. I only want Splunk to parse data with a timestamp in it and exclude the other garbage data.
19/01/21 05:32:15 WARN YarnAllocator: Expected to find pending requests, but found none.
^@^Fstdout^@^A0^@^A^@^@^E^@
^@^GVERSION/-^@+container_e217_1537606163373_5158_01_000001^E^Dnone^A^PÑ,Ñ,^C^Qdata:BCFile.index^DnoneÑ}^K^K^Pdata:TFile.index^DnoneÑB;;^Odata:TFile.meta^DnoneÑ<^F^F^@^@^@^@^@^@^E~H^@^A^@^@Ñ^QÓh~Qµ×¶9ßA@~RºáP
Ñ^QÓh~Qµ×¶9ßA@~RºáP ^@^GVERSION^D^@^@^@^A^Q^@^OAPPLICATION_ACL,^@
MODIFY_APP^@ s_ptheon ^@^HVIEW_APP^@ s_ptheon ^S^@^QAPPLICATION_OWNER
^@^Hs_ptheon-^@+container_e217_1537606163373_5158_01_000004Î^C^@^Fstderr^@^C491SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/opt/mapr/lib/slf4j-log4j12-1.7.12.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/alluxio/alluxio-1.4.0/core/client/target/alluxio-core-client-1.4.0-jar-with-dependencies.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
^@^Fstdout^@^A0^@^A^@^@^D^@
^@^GVERSION/-^@+container_e217_1537606163373_5158_01_000004^D^Dnone^A^PÎ| Î| ^C^Qdata:BCFile.index^DnoneÎñ^K^K^Pdata:TFile.index^Dnoneζ;;^Odata:TFile.meta^Dnoneΰ^F^F^@^@^@^@^@^@^Bü^@^A^@^@Ñ^QÓh~Qµ×¶9ßA@~RºáP

If the above is a single event, you could send it to nullQueue per the steps in the link Giuseppe posted above. However, if you only want to remove the contents after 'found none', then you would need to filter them before indexing, something like the below:
https://docs.splunk.com/Documentation/Splunk/7.2.3/Admin/Propsconf
[yoursourcetype]
SEDCMD-removejvmlogs = s/^@.*//
This will require a restart of the indexer, and any future events will be masked/removed.
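One caveat: the `^@` sequences in the sample may be caret notation for literal control bytes (such as NUL) rather than the two printable characters `^` and `@`. If that is the case, the SEDCMD would need to match the bytes themselves. A sketch under that assumption:

```
# props.conf (sketch -- assumes the garbage runs begin with literal
# control bytes, which terminals display as ^@, ^A, etc.)
[yoursourcetype]
SEDCMD-removejvmlogs = s/[\x00-\x08\x0B\x0C\x0E-\x1F].*//g
```

Checking a raw event (e.g. with a hex dump of the source file) would confirm which form the characters actually take.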

Can this be done at the forwarder level instead of at indexing? I want data without a timestamp to never reach the Splunk system.

Yes, if it's a heavy forwarder. If you only have a universal forwarder and an indexer, you can add the rule on the indexer and it will filter the events out before indexing. That way they will not consume your license or storage, and they will not be available in searches.
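Dropping everything that does not start with a timestamp could be sketched as below. The sourcetype and transform names are placeholders, and the regex is written for the `19/01/21 05:32:15` format seen in the sample above:

```
# props.conf on the heavy forwarder or indexer (sourcetype name assumed)
[yoursourcetype]
TRANSFORMS-drop_no_ts = drop_no_timestamp

# transforms.conf
[drop_no_timestamp]
# negative lookahead: events NOT beginning with yy/mm/dd HH:MM:SS
# are routed to nullQueue and discarded before indexing
REGEX = ^(?!\d{2}/\d{2}/\d{2}\s\d{2}:\d{2}:\d{2})
DEST_KEY = queue
FORMAT = nullQueue
```

Note that a universal forwarder does not apply TRANSFORMS; this only works where parsing happens.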

Thanks lakshman. What rule will I have to add in inputs.conf?

As the data is already coming to the indexer, adding the SEDCMD-* setting to props.conf on the indexer or heavy forwarder will do; there is no need to change inputs.conf.

It may be that the JVM is throwing exceptions and printing unwanted lines. You may want to double-check the line breaking (some lines could be part of a multi-line event, giving the impression that events are without a timestamp) and make sure they are parsed correctly. Every event has to have a timestamp.
If you share examples, that would help.

Yes, the scenario is similar to the JVM throwing exceptions; I shared sample data with Giuseppe earlier in the thread.
