Getting Data In

What is the best way to match more than 3000 patterns used to classify events into multiple sourcetypes?

Builder

Hi All,

We have more than 3000 patterns which are used to classify events into multiple sourcetypes. What is the best way to implement this use case?

Thanks,
Vishal

1 Solution

Splunk Employee

Try reading this blog; it explains how to rewrite sourcetypes based on pattern matching using transforms.conf and props.conf in Splunk:

http://blogs.splunk.com/2010/02/11/sourcetypes-gone-wild/


Communicator

Adding more details:

We have 4 sourcetypes: .log, .out, .debug, and .err.

Phase 1: Apply 1000 patterns on the heavy forwarder to filter out unwanted data using nullQueue and indexQueue:

[log]
TRANSFORMS-set = setnull, setraw

setnull = .
setraw = 1000 patterns

Phase 2: Apply the Escalate and Non-Escalate patterns to the remaining data, using EXTRACT in props.conf to pull out the respective fields:

EXTRACT-esc = 1000 patterns
EXTRACT-nonesc = 2000 patterns

In this case, will there be any performance issue on the heavy forwarder and indexer? Can Splunk handle 3000 patterns at search time and 1000 at parsing time?
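For reference, the Phase-1 routing described above is typically written as a pair of ordered transforms. A minimal sketch, where the stanza names and the sample keep-pattern are assumptions standing in for the real 1000 patterns:

```ini
# props.conf (on the heavy forwarder)
[log]
TRANSFORMS-set = setnull, setraw

# transforms.conf
# First send every event to the null queue...
[setnull]
REGEX = .
DEST_KEY = queue
FORMAT = nullQueue

# ...then route events matching the keep-patterns back to the index queue.
# The real configuration would carry the ~1000 patterns here.
[setraw]
REGEX = ERROR|Exception
DEST_KEY = queue
FORMAT = indexQueue
```

Order matters: setraw runs after setnull, so matching events are rescued from the null queue while everything else is discarded.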


Builder

We are evaluating regex pattern matching and sourcetype configuration (props.conf + transforms.conf), but we are not sure whether 3000+ patterns will be supported and, if so, what the impact on the forwarder/indexer will be. Will there be any performance impact from matching 3000+ patterns?


Builder

We want to set the metadata sourcetype at index time.


Splunk Employee

Do you want to set the metadata sourcetype at index time, or are you OK with using an eval statement to set a field called escalated_exceptions=true, for example? Sorry I am not quicker to respond; I am in all-day off-site meetings this week.
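The eval alternative mentioned here could be sketched as a search-time command (the sourcetype name and match pattern are assumptions for illustration):

```spl
sourcetype=app_log
| eval escalated_exceptions=if(match(_raw, "NullPointerException"), "true", "false")
```

This avoids index-time rewriting entirely; the field exists only at search time and the pattern can be changed without reindexing.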



Motivator

I second the idea of using eventtypes.


Splunk Employee

If I were you, I would just put everything into a few sourcetypes and use eventtypes rather than spending all those resources rewriting the sourcetype.
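A sketch of that eventtype approach, assuming the data lands in a single sourcetype (the sourcetype name, eventtype names, and exception classes below are illustrative assumptions):

```ini
# eventtypes.conf -- classification happens at search time, no index-time rewriting
[escalate_exception]
search = sourcetype=app_log ("NullPointerException" OR "OutOfMemoryError")

[nonescalate_exception]
search = sourcetype=app_log "IllegalArgumentException"

[ignore_exception]
search = sourcetype=app_log "DEBUG"
```

Searches can then filter on eventtype=escalate_exception, and the pattern lists can be edited at any time without touching the forwarders or reindexing data.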


Splunk Employee

Correct. Follow the example and blog above: at index time you can dynamically rewrite the sourcetype based on the patterns you define as the REGEX in the transforms.conf file.


Splunk Employee

The blog I referenced above gives a great example on how to rewrite sourcetypes based on regex patterns:

In props.conf:

[source::/path/to/sample.log]
TRANSFORMS-yummy = setCPSourcetype, setSyslogSourcetype

In transforms.conf:

[setCPSourcetype]
DEST_KEY = MetaData:Sourcetype
REGEX = %PIX-
FORMAT = sourcetype::cisco-pix

[setSyslogSourcetype]
DEST_KEY = MetaData:Sourcetype
REGEX = \w+ \d+ \d+:\d+:\d+ \S+ \w+\[\d+\]:
FORMAT = sourcetype::syslog


Builder

We have an application which generates different logs. The logs contain different exceptions from a Java application, and these exceptions are classified into four categories:
1. Escalate exceptions
2. Non-escalate exceptions
3. Ignore exceptions
4. Unmatched exceptions
e.g. anything starting with NullPointerException should go to Escalate exceptions (and the escalated exceptions sourcetype)
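Following the transforms approach from the accepted answer, that NullPointerException example might be expressed like this (the source path, stanza name, and target sourcetype are illustrative assumptions):

```ini
# props.conf
[source::/path/to/app/*.log]
TRANSFORMS-classify = setEscalateSourcetype

# transforms.conf
# Rewrite the sourcetype for any event containing an escalate-pattern.
[setEscalateSourcetype]
DEST_KEY = MetaData:Sourcetype
REGEX = NullPointerException
FORMAT = sourcetype::escalated_exceptions
```

Each of the four categories would get its own transform stanza, with that category's patterns combined into its REGEX.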


Splunk Employee

Can you elaborate and provide some examples? I am unclear about what the issue is and what has been tried to date.
