Source or Sourcetype override

nags
Engager

I have a sourcetype-based definition in which I set INDEXED_EXTRACTION=JSON. Ten sources are configured under this sourcetype. Suppose one of the ten is not in JSON format. How can I keep the same sourcetype but avoid applying INDEXED_EXTRACTION=JSON to that one source? I thought of using a source::-based stanza in props with the other attributes and leaving out the INDEXED_EXTRACTION attribute. In that case, will the setting still be picked up from the sourcetype declaration?

isoutamo
SplunkTrust

Hi

A sourcetype defines the schema of your log file. For that reason, it must be different for different log files or log event types. Here is an excellent description of it by Mark McCullough (Splunk Slack #bestpractices):

--8<--

I think I've finally figured out how to explain to "I know Splunk!" types what the sourcetype field means in a way that doesn't cause them to want to pick _json for everything that uses  JSON syntax:  "It's like a reference to a XSD file for XML.  It specifies what fields are required, what fields are permitted, and the overall structure of the event."

--8<--

There are also naming standards for knowledge objects (KOs) in Splunk, which help you manage all of them. In most cases I use the naming schema "owner:system/vendor:app:subsystem:log file:#". There is no need to keep all of those parts, but a name usually has at least three of them plus a number as a suffix. When the log format changes later, I just increment the last digit by one.
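
As an illustration of that schema, here is how two versions of a sourcetype might look in props.conf. The name components (acme, billing, api, access) are made up for this example; only the pattern comes from the post.

```ini
# props.conf (all name components are hypothetical)
# pattern used here: owner:app:subsystem:log file:#

[acme:billing:api:access:1]
# settings for the original log format

[acme:billing:api:access:2]
# settings after the log format changed; only the numeric suffix was incremented
```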

In most cases, when you have a Splunk system that holds even a couple of different business or tech systems, you should use this kind of naming standard for all your KOs, such as apps, indexes, saved searches, alerts, etc. This will help you, and at the very least it helps your Splunk admins.

r. Ismo

PickleRick
SplunkTrust

I'm not 100% sure what you want. As you can see in the docs (https://docs.splunk.com/Documentation/Splunk/Latest/Admin/Propsconf), you can define settings based on sourcetype, source or host so that some of the settings can be selectively applied to your sources.

But the main question here is: why do you want INDEXED_EXTRACTIONS (there is an "S" at the end, it's important!)? INDEXED_EXTRACTIONS are sometimes unavoidable (if the log file has a variable column order and the field order is determined by the header row, it's the only reasonable way to ingest such a file), but often search-time parsing is enough, and in Splunk search-time operations are generally the preferred method. So why not KV_MODE=json instead of INDEXED_EXTRACTIONS?
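
A minimal sketch of that search-time alternative, assuming a hypothetical sourcetype name (my_sourcetype) and a hypothetical path for the one non-JSON file. Because source:: stanzas take precedence over sourcetype stanzas, the source:: stanza only needs the settings that differ. Treat it as a starting point to test, not a verified configuration for your data.

```ini
# props.conf (sourcetype name and file path are hypothetical)

[my_sourcetype]
# Extract JSON fields at search time instead of at index time
KV_MODE = json

[source::/var/log/myapp/plain.log]
# The one non-JSON file: turn off automatic key-value extraction for it.
# Settings not repeated here still come from [my_sourcetype].
KV_MODE = none
```

KV_MODE is a search-time setting, so these stanzas would belong on the search head rather than on the forwarder.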

 

gcusello
SplunkTrust

Hi @nags,

It isn't possible: a sourcetype defines the specification of a data source (INDEXED_EXTRACTIONS is one part of it), so you cannot use the same data definition for data sources with different formats.

As a workaround, you could use similarly named sourcetypes (e.g. my_sourcetype and my_sourcetype_json), so that in searches you can use:

sourcetype=my_sourcetype*

and match both of them.
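
A sketch of how the two sourcetypes could be wired up, assuming file monitor inputs. The monitor paths are hypothetical, and the sourcetype names are the ones from the example above.

```ini
# inputs.conf (monitor paths are hypothetical)

[monitor:///var/log/myapp/events.json]
sourcetype = my_sourcetype_json

[monitor:///var/log/myapp/plain.log]
sourcetype = my_sourcetype

# props.conf

[my_sourcetype_json]
# Only the JSON sources get index-time structured extraction
INDEXED_EXTRACTIONS = JSON

[my_sourcetype]
# Settings for the non-JSON source go here; no INDEXED_EXTRACTIONS
```

With this split, the nine JSON sources and the one non-JSON source each get their own parsing rules, and the wildcard search still returns both.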

Ciao.

Giuseppe
