Deployment Architecture

How do I configure the HF to produce raw data and forward it to the indexer?

danielbb
Motivator

It's not clear to me how indexAndForward works. The documentation says: "Set to 'true' to index all data locally, in addition to forwarding it." Does that mean the data is indexed in two places? If so, what should we do to produce cooked data AND forward it to the indexer?

PickleRick
SplunkTrust

OK. Let me add my three cents to what the guys already covered to some extent.

There are two separate things here.

One is "index and forward" setting.

By default, Splunk receives and processes data from its inputs, indexes it, and sends it to its outputs (if any are defined). If you disable "index and forward", it still processes and sends the data, but it doesn't save the events to local indexes. So you disable this setting on any Splunk component which is not supposed to store data locally (in a well-engineered environment, only an all-in-one server or an indexer stores indexes; all other components should forward their data to the indexer tier).
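
To make this concrete, here's a minimal outputs.conf sketch for a forwarding-only instance (the output group name and indexer address are placeholders, not from this thread):

# outputs.conf on the forwarding instance
[tcpout]
defaultGroup = my_indexers
# set to true to also keep a local copy in this instance's own indexes
indexAndForward = false

[tcpout:my_indexers]
server = 10.0.0.1:9997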

A Heavy Forwarder is just a fancy name for a Splunk Enterprise (not Universal Forwarder!) instance which does no local indexing and doesn't have any other roles (actually, if you were to nitpick, any other component like a SH or DS could technically be called a HF as well, since it processes at least its own logs and forwards them).

Another thing is the type of data.

With Splunk there are three distinct "stages" of data.

Firstly you have the raw data. That's the data you're receiving on simple TCP/UDP inputs, reading from files, pulling with modular inputs and so on. This is completely unprocessed data, exactly as it is returned by the source.

If raw data is processed at the UF, it gets "cooked" - the data stream is split into chunks (not single events yet!), each chunk is assigned some metadata (the default four - host, source, sourcetype, index), and that's it. This is the cooked data.

If raw or cooked data is processed at the HF or indexer, it gets parsed - Splunk applies all props and transforms applicable at index time (splits the stream into separate events, parses the timestamp out of each event, does all the fancy index-time mangling...). After this stage your data is "cooked and parsed" (often called just "parsed" for short).
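
As an illustration (a sketch, not from this thread; the sourcetype name and patterns are made up), this is the kind of index-time configuration in props.conf that gets applied at the parsing stage:

# props.conf on the HF/indexer - index-time parsing rules
[my:custom:log]
# split the stream into separate events
SHOULD_LINEMERGE = false
LINE_BREAKER = ([\r\n]+)
# pull the timestamp out of each event
TIME_PREFIX = ^\[
TIME_FORMAT = %Y-%m-%d %H:%M:%S
TZ = UTC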

If the UF receives cooked or parsed data, it just forwards it.

If a HF/indexer receives already parsed data, it doesn't process it again; it just forwards/indexes it. So the data is cooked only once and parsed only once on its path to the destination index.

There is one additional case - if you're using indexed extractions on a UF, it produces already cooked and parsed data.
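
For example (a sketch; the sourcetype name is made up), enabling indexed extractions in props.conf on the UF looks like this:

# props.conf on the UF
[my:csv:data]
# the UF itself parses this structured data, so it forwards cooked and parsed events
INDEXED_EXTRACTIONS = csv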

Sending uncooked data is a very special case, used when you're sending data to an external non-Splunk receiver. In this case you're actually "de-cooking" your data. But this is a fairly uncommon setup.

So there you have it - a HF normally cooks and parses the data it receives (unless it's already parsed) and sends it to its outputs. You don't need to do anything extra to have your data cooked and parsed by default.

isoutamo
SplunkTrust
This was an excellent explanation of raw, cooked, and parsed data!
One thing to add here: if a HF has applied ingest actions to the data, the next HF/indexer can still work on that data again.

PickleRick
SplunkTrust

I don't have much experience with ingest actions, but my understanding is that they can indeed be invoked later in the event's path - on already parsed data. Remember, though, that they have limited functionality.

gcusello
SplunkTrust

Hi @danielbb ,

if you need to run local searches on the local data on the HF, you can use the indexAndForward option; otherwise you don't need it.

Obviously, if you use this option, you index your data twice and you pay double license.

About cooked data: by default all HFs send cooked data; in fact, if you need to apply transformations to your data, you have to put the conf files on the HFs.

Anyway, HFs send cooked data whether indexAndForward = true or indexAndForward = false; to send uncooked data you have to apply a different configuration in your outputs.conf, but in that case you give more work to your indexers.

Ciao.

Giuseppe

danielbb
Motivator

That's great, but what in the configuration defines an HF as an HF?

gcusello
SplunkTrust

Hi @danielbb ,

an HF is a full Splunk instance that forwards logs to other Splunk instances and isn't used for other roles (e.g. Search Head, Cluster Manager, etc.).

It's usually used to receive logs from external sources such as service providers, or to concentrate logs from other forwarders (Heavy or Universal).

It's also frequently used as a syslog server, though a UF can be used for the same purpose.

So it's a conceptual definition, not a configuration; the only relevant configuration for an HF is log forwarding, as in the sketch below.
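
For illustration (a minimal sketch; the group name and indexer address are placeholders), this is the forwarding configuration that in practice turns a full Splunk Enterprise instance into a HF:

# outputs.conf
[tcpout]
defaultGroup = indexers

[tcpout:indexers]
server = idx1.example.com:9997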

Ciao.

Giuseppe

danielbb
Motivator

I got it. However, I'm setting up these three machines and I would like the HF to send cooked data while the SH sends uncooked data to the indexer. Based on what you're saying, it appears that whenever we forward data, it's already cooked; is that right?

gcusello
SplunkTrust

Hi @danielbb ,

usually the SH also sends cooked data; only UFs, by default, send uncooked data.

Ciao.

Giuseppe

danielbb
Motivator

Great, so how do I configure the SH to send uncooked data?

gcusello
SplunkTrust

Hi @danielbb ,

why do you want to send uncooked data from the SH? There's no reason for this!

Anyway, if you want to do this strange thing, see https://docs.splunk.com/Documentation/Splunk/8.0.2/Forwarding/Forwarddatatothird-partysystemsd#Forwa...

In a few words, put this in outputs.conf:

[tcpout]

[tcpout:fastlane]
server = 10.1.1.35:6996
sendCookedData = false

Ciao.

Giuseppe

isoutamo
SplunkTrust
Why do you want to do this? Splunk is designed to use cooked data between its components.
If you really want to break your installation, you'll find instructions in the outputs.conf and inputs.conf files and in some articles and answers.