Can Splunk Auto Change Sourcetype If Input Format Changes

rpettymb
New Member

Hello,

I have added a new input that looks like this:

> ...
>     Start calculating postfix queue depth on server1.domain.com at Wed Mar 19 22:45:01 UTC 2014
>     instance=1 deferred=1 active=0 incoming=0
>     instance=8 deferred=0 active=0 incoming=0
>     instance=9 deferred=27 active=0 incoming=0
>     Stop calculating postfix queue depth on server1.domain.com at Wed Mar 19 22:45:01 UTC 2014
>     Start calculating postfix queue depth on server1.domain.com at Wed Mar 19 23:45:01 UTC 2014
>     instance=1 deferred=1 active=0 incoming=0
>     instance=8 deferred=0 active=0 incoming=0
>     instance=9 deferred=27 active=0 incoming=0
>     Stop calculating postfix queue depth on server1.domain.com at Wed Mar 19 23:45:01 UTC 2014
> ...

Splunk has applied a sourcetype of *-too_small. This is causing some grief, as I can't simply search for "deferred>25". I assume the "Start..." and "Stop..." lines are preventing Splunk from auto-parsing this as I would hope. If I remove the "Start..." and "Stop..." lines going forward, will Splunk change the sourcetype to something that can be parsed as simple key=value pairs?

Thank you!


martin_mueller
SplunkTrust

The *-too_small suffix indicates that the sample was too small for Splunk to guess the sourcetype; it may or may not recognize the format from a larger sample. That's why I always specify the sourcetype wherever possible - even when it's an automatically recognized one.


dmaislin_splunk
Splunk Employee

Why are you not specifying a sourcetype in your inputs.conf?

Just add this to your inputs.conf and restart the forwarder:

sourcetype=postfixdata

Or something like that, so that Splunk doesn't try to guess at a sourcetype and auto-learn one.

From this point on, all newly indexed data will be in one sourcetype.
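
As a rough sketch of where that line would go, assuming the data is read from a file by a monitor input: the path below is a made-up example, and `postfixdata` is just the example name from above.

```ini
# inputs.conf on the forwarder (sketch; the monitored path is hypothetical)
[monitor:///var/log/postfix_queue_depth.log]
# Assign the sourcetype explicitly so Splunk does not guess one
sourcetype = postfixdata
```

The setting only affects data indexed after the restart. Events that were already indexed keep the sourcetype they were given at the time.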


dmaislin_splunk
Splunk Employee

Once that is done, we can help you with event line breaking, field extractions, etc.


rpettymb
New Member

Martin,

Maybe I am misunderstanding, but in simple (or well-established) cases, doesn't it detect a pattern and sometimes just get it right with zero configuration? In my case, I have now switched the data to be just the key-value pairs "k1=v1 k2=v2" (4 keys per line), and now I can do basic searching, so I assume it must be detecting something. It did not, however, change the sourcetype.

I agree about being explicit in all but the simple cases, but I would hope ones like this would magically work 🙂

Regards.

Ron


martin_mueller
SplunkTrust

How should that data be parsed? Where are the event breaks? Which timestamp should be used?

It's usually best to define a custom sourcetype for custom data. That tells Splunk how to index the data, specifically how to do timestamping and event breaking.
After that, you can define search-time field extractions as you need them.
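
A minimal sketch of such a sourcetype, under assumptions the thread has not confirmed: each Start...Stop block is treated as one event, and the timestamp on the "Start" line (after " at ", on the same line) is the event time. The stanza name `postfixdata` is the example name suggested earlier in the thread.

```ini
# props.conf (sketch; adjust to how the data should really be broken up)
[postfixdata]
# Do not re-merge lines after breaking; rely on LINE_BREAKER alone
SHOULD_LINEMERGE = false
# Begin a new event at each line that opens a "Start calculating" block
LINE_BREAKER = ([\r\n]+)\s*Start calculating
# The timestamp follows " at ", e.g. "Wed Mar 19 22:45:01 UTC 2014"
TIME_PREFIX = \sat\s
TIME_FORMAT = %a %b %d %H:%M:%S %Z %Y
MAX_TIMESTAMP_LOOKAHEAD = 30
# Search-time key=value extraction; auto is already the default
KV_MODE = auto
```

If each `instance=...` line should be its own event instead, `SHOULD_LINEMERGE = false` with the default `LINE_BREAKER` would break on every line, but those lines carry no timestamp of their own, so the timestamp question above would still need an answer.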
