Splunk Search

Describe the pattern matching syntax used for 'punct'?

stefanlasiewski
Contributor

I am trying to determine how I can use 'punct' to match certain patterns and set eventtypes for my data.

I see punct described in the documentation at UseDefaultAndInternalFields and ClassifyAndGroupSimilarEvents, but I don't see any description on how to read the syntax.

What does something like punct="<>__::_..._[]:_=_=___=\"=,=,=,=,=,=\"" mean? It's obviously some sort of pattern matching behavior like regular expressions or globbing, but I don't see this defined anywhere.

Does punct support wildcards?

Is there an easy way to experiment with different punct patterns and see if they correctly match my data? I need a way to quickly compare one pattern vs. another, so I can determine if a particular punct is too narrow or too broad.

On any given search, Splunk will suggest over 50 different puncts, which are very difficult to compare. My logs are all sent via syslog, and follow the standard formats defined in RFC 5424 (RFC 3164).

1 Solution

stefanlasiewski
Contributor

Deep in the Knowledge Manager Manual, I found more details about punct. The section "Use the punct field to search on similar events" says:

The punct field stores the first 30 punctuation characters in the first line of the event. This field is useful for finding similar events quickly.

When you use punct, keep in mind:

  • Quotes and backslashes are escaped.
  • Spaces are replaced with an underscore (_).
  • Tabs are replaced with a "t".
  • Dashes that follow alphanumeric characters are ignored.
  • Interesting punctuation characters are:

    ",;-#$%&+./:=?@\'|*\n\r\"(){}<>[]^!"
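
To make the rules above concrete, here is a rough Python sketch that approximates how a punct value could be derived from an event. It is not Splunk's implementation. The function name `approximate_punct` is made up for illustration, and two details are assumptions on my part: that the 30-character limit counts only punctuation (not the `_` and `t` placeholders), and that only double quotes and backslashes get escaped.

```python
# Characters the manual lists as "interesting" punctuation.
INTERESTING = set(",;-#$%&+./:=?@\\'|*\n\r\"(){}<>[]^!")


def approximate_punct(event: str, limit: int = 30) -> str:
    """Approximate a punct value from the first line of an event."""
    lines = event.splitlines()
    first_line = lines[0] if lines else ""
    out = []
    kept = 0   # punctuation characters kept so far (assumed to be what the limit counts)
    prev = ""  # previous character, used for the dash rule
    for ch in first_line:
        if kept >= limit:
            break
        if ch == " ":
            out.append("_")   # spaces become underscores
        elif ch == "\t":
            out.append("t")   # tabs become a literal "t"
        elif ch in INTERESTING and not (ch == "-" and prev.isalnum()):
            # Assumption: only double quotes and backslashes are escaped.
            out.append("\\" + ch if ch in '"\\' else ch)
            kept += 1
        prev = ch
    return "".join(out)


if __name__ == "__main__":
    print(approximate_punct('Oct 11 22:14:15 web-01 sshd[4321]: user="alice"'))
```

Running two syslog lines that differ only in host name, PID, or user through this function should give the same string, which is the property that makes punct useful for grouping.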

In addition, wildcards are supported, according to "Identify similar events with punct". However, this description is vague.

You may want to consider wildcarding the punctuation to match insignificant variations (for example, "punct=::[]/").
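
One way to experiment with how broad or narrow a wildcarded punct is, outside of Splunk, is to test it against a list of punct values you have already collected. The sketch below is only an approximation: it assumes `*` matches any run of characters (including none) and that every other character is literal. The helper names `wildcard_to_regex` and `count_matches` and the sample values are hypothetical.

```python
import re


def wildcard_to_regex(pattern: str) -> re.Pattern:
    """Turn a '*' wildcard pattern into a compiled regex (all else literal)."""
    parts = [re.escape(part) for part in pattern.split("*")]
    return re.compile(".*".join(parts), re.DOTALL)


def count_matches(pattern: str, puncts: list) -> int:
    """Count how many punct values the wildcard pattern matches in full."""
    rx = wildcard_to_regex(pattern)
    return sum(1 for p in puncts if rx.fullmatch(p))


if __name__ == "__main__":
    # Hypothetical punct values, e.g. copied from a search's field summary.
    samples = ["__::__[]:_____", "__::__[]:___", "__::__:___,__"]
    for candidate in ["__::__[]:_____", "__::__[]:*", "__::__*"]:
        print(candidate, count_matches(candidate, samples))
```

Comparing the counts for a few candidate patterns gives a quick feel for which one is too narrow or too broad before turning it into an eventtype.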

stefanlasiewski
Contributor

I cannot edit this answer anymore, but I thought I would provide an update. The 6.2.0 manual has improved this documentation somewhat. See http://docs.splunk.com/Documentation/Splunk/latest/Knowledge/Abouteventtypes#Use_the_punct_field_to_...

sam
Explorer

punct is a field just like any other. The content of punct is the event with all letters and numbers stripped and whitespace replaced with underscores, leaving just the PUNCTuation.

punct is useful for finding similar messages that have some varying text in them. Process IDs, host names, and times could differ across several events while the actual message that you care about stays the same. In many cases like this, the punct will be the same across those events.
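
As a small illustration of that idea, the Python sketch below applies the simplified description (strip letters and digits, turn whitespace into underscores) and groups events by the result. It ignores the finer rules such as the 30-character limit and escaping, and the function names and sample events are invented for the example.

```python
import re
from collections import defaultdict


def simple_punct(event: str) -> str:
    """Strip letters and digits, then replace each whitespace char with '_'."""
    no_alnum = re.sub(r"[A-Za-z0-9]", "", event)
    return re.sub(r"\s", "_", no_alnum)


def group_by_punct(events):
    """Map each simplified punct value to the events that produce it."""
    groups = defaultdict(list)
    for event in events:
        groups[simple_punct(event)].append(event)
    return dict(groups)


if __name__ == "__main__":
    # Invented sample events: the first two differ only in time, host, PID, user.
    events = [
        "Oct 11 22:14:15 hostA sshd[4321]: session opened for user alice",
        "Oct 12 03:01:59 hostB sshd[977]: session opened for user bob",
        "Oct 12 03:02:10 hostB kernel: eth0 link up, 1000 Mbps",
    ]
    for punct, members in group_by_punct(events).items():
        print(punct, len(members))
```

The two sshd events land in the same group even though their timestamps, hosts, PIDs, and user names differ, while the kernel message gets its own.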
