Source or Sourcetype override

nags
Engager

I have a sourcetype-based definition in which I set INDEXED_EXTRACTION=JSON. Ten sources are configured under this sourcetype and, let us say, one of them is not in JSON format. How can I keep the same sourcetype but avoid applying INDEXED_EXTRACTION=JSON to that one source? I thought of adding a source::-based stanza in props with the other attributes and leaving out the INDEXED_EXTRACTION attribute. In that case, will the attribute still be picked up from the sourcetype stanza?


isoutamo
SplunkTrust

Hi

A sourcetype defines the schema of your log file. For that reason, it must be different for different log files or log event types. Here is an excellent description of it by Mark McCullough (Splunk Slack #bestpractices):

--8<--

I think I've finally figured out how to explain to "I know Splunk!" types what the sourcetype field means in a way that doesn't cause them to want to pick _json for everything that uses JSON syntax: "It's like a reference to an XSD file for XML. It specifies what fields are required, what fields are permitted, and the overall structure of the event."

--8<--

There are also naming standards for knowledge objects (KOs) in Splunk that help you manage all of them. In most cases I use the naming scheme "owner:system/vendor:app:subsystem:log file:#". You don't need to keep all of those parts, but a name usually has at least three of them plus a number as a suffix. When the log format changes later, I just increment the last digit by one.
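As a rough illustration of that scheme, two sourcetype stanzas for the same log might look like the sketch below. Every name here (acme, crm, api, access) is made up for the example, and the KV_MODE values are only placeholders to show that the two versions can carry different settings.

```ini
# props.conf (all names below are hypothetical)

# Original format of the access log.
[acme:crm:api:access:1]
KV_MODE = auto

# Same log after its format changed to JSON: only the numeric suffix is bumped.
[acme:crm:api:access:2]
KV_MODE = json
```

Searches that should cover both versions can then use a wildcard such as sourcetype=acme:crm:api:access:*.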

In most cases, when you have a Splunk environment that serves even a couple of different business or technical systems, you should use this kind of naming standard for all your KOs, such as apps, indexes, saved searches, and alerts. It will help you, and at the very least it helps your Splunk admins.

r. Ismo


PickleRick
SplunkTrust

I'm not 100% sure what you want. As you can see in the docs (https://docs.splunk.com/Documentation/Splunk/Latest/Admin/Propsconf), you can define settings based on sourcetype, source or host so that some of the settings can be selectively applied to your sources.
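A minimal sketch of what that can look like, assuming a sourcetype called my_sourcetype and a made-up file path (both hypothetical, as are the timestamp formats). As I read the props.conf spec, a [source::...] stanza takes precedence over a [<sourcetype>] stanza for any setting defined in both, while settings that appear only in the sourcetype stanza still apply to that source.

```ini
# props.conf (hypothetical names, paths, and formats)

# Applies to every event with this sourcetype.
[my_sourcetype]
TIME_FORMAT = %Y-%m-%dT%H:%M:%S
MAX_TIMESTAMP_LOOKAHEAD = 30

# Applies only to this one source. TIME_FORMAT is overridden here;
# MAX_TIMESTAMP_LOOKAHEAD is not set, so the sourcetype value is still used.
[source::/var/log/myapp/legacy.log]
TIME_FORMAT = %d/%b/%Y:%H:%M:%S
```

By the same logic, simply leaving a setting out of the source:: stanza does not switch it off for that source; it would have to be overridden explicitly, and it is worth confirming in a test environment that this behaves as expected for the setting in question.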

But the main question here is: why do you want INDEXED_EXTRACTIONS (there is an "S" at the end, and it's important!)? INDEXED_EXTRACTIONS is sometimes unavoidable (if the log file has a variable column order and the field order is determined by a header row, it's the only reasonable way to ingest such a file), but search-time parsing is often enough, and with Splunk, search-time operations are generally the preferred method. So why not KV_MODE=json instead of INDEXED_EXTRACTIONS?
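A sketch of the search-time alternative, again assuming a sourcetype named my_sourcetype and a hypothetical path for the one non-JSON file. KV_MODE is a search-time setting, so this would go in props.conf on the search head rather than on the forwarder.

```ini
# props.conf on the search head (hypothetical names and paths)

[my_sourcetype]
# Extract JSON fields at search time instead of at index time.
KV_MODE = json

[source::/var/log/myapp/legacy.log]
# The one non-JSON source: fall back to automatic key=value extraction.
KV_MODE = auto
```

With this approach there is no INDEXED_EXTRACTIONS line at all, so nothing has to be undone for the non-JSON source at index time.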

 


gcusello
SplunkTrust

Hi @nags,

It isn't possible: a sourcetype defines the specification of a data source (INDEXED_EXTRACTIONS is part of that specification), so you cannot use the same data definition for different data sources.

As a workaround, you could use similar sourcetype names (e.g. my_sourcetype and my_sourcetype_json), so that in your searches you can use:

sourcetype=my_sourcetype*

and match both of them.
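A rough sketch of that workaround, with made-up monitor paths (the sourcetype names are the ones from the example above; everything else is an assumption):

```ini
# inputs.conf (hypothetical paths)

# The JSON sources get the JSON sourcetype.
[monitor:///var/log/myapp/json]
sourcetype = my_sourcetype_json

# The one non-JSON source gets the plain sourcetype.
[monitor:///var/log/myapp/legacy.log]
sourcetype = my_sourcetype

# props.conf

[my_sourcetype_json]
INDEXED_EXTRACTIONS = JSON

[my_sourcetype]
# No INDEXED_EXTRACTIONS here; put only the settings the non-JSON log needs.
SHOULD_LINEMERGE = false
```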

Ciao.

Giuseppe
