Getting Data In

How does a Universal Forwarder send raw and cooked data to an indexer?

Path Finder

So I know that during the input phase, a universal forwarder will take the raw data, add some metadata tags to it, and send it over to the indexer as "cooked" data, which is really just event data. I know that an indexer stores both event data and raw data. However, how does the universal forwarder get the raw data over to the indexer — as in, does the UF send one stream of "cooked" data and one stream of raw data to the indexer?

Also, are both raw and cooked data sent to the indexers by default, and is either of those configurable, in terms of sending or not sending the data, or even sending the cooked data to one destination and the raw data to a different one?

Any help would be appreciated!

Splunk Employee

Actually, only a heavy forwarder will send parsed data. A universal forwarder sends unparsed data, but some metadata will be added.

You could forward _raw from your Splunk instance, is that an option?

The forwarders, heavy and universal, can forward to other destinations, but at the end of the day they were built for Splunk. I personally would go that route. More details can be found here:

http://docs.splunk.com/Documentation/SplunkCloud/6.6.1/Forwarding/Forwarddatatothird-partysystemsd

New Member

R_B fwijnholds_splunk · May 12 at 04:02 PM

Right, I know the UF will send cooked (unparsed) and raw data to multiple indexers and even third party systems too. I was just curious if you could send the raw data to one server, say a third party server, and the "cooked" data to the indexers. In addition, I was wondering if you could have the option to send "cooked" data without sending the correlating raw data. I'm not trying to produce this situation in my environment, I'm just trying to understand the nitty-gritty of how the UF is working and the possibilities that are capable with it.

I regret pulling this thread from the bone pile but you are asking my question. I need to send my data to two locations as mentioned. Normal processing will go to the indexer, as I understand it cooked and raw, but I need a copy of the untouched data (raw) sent to a 3rd party system/application.

Did anyone find a way to do this yet? We want to use the TAs provided by Splunk but need the above setup to work.

Motivator

Your forwarder should send the data to two destinations: a heavy forwarder with the TA, for processing and forwarding to the indexer, and another machine that indexes separately, without processing, for distribution to the app.
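
As a rough sketch of that setup: listing two target groups in defaultGroup in the forwarder's outputs.conf clones the data to both. The group names, host names, and port below are placeholders, not values from this thread.

```ini
# outputs.conf on the universal forwarder
# (group names, hosts, and port are hypothetical)
[tcpout]
# Listing more than one target group here clones the data to each group
defaultGroup = heavy_forwarders, second_copy

[tcpout:heavy_forwarders]
server = hf1.example.com:9997

[tcpout:second_copy]
server = other1.example.com:9997
```

In this sketch both copies leave the forwarder as cooked data, so it only covers the "two destinations" part of the question.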

New Member

The problem with that answer, as I understand things, is that the forwarder will tag the data, so even with separate event streams it's still not the "raw" data but the "cooked" or "parsed" data. I need the untouched version to be sent to this 3rd party app.

Motivator

Among the simpler options that sidestep Splunk entirely are syslog or a network directory share to the app. Unless you are looking to leverage Splunk's data compression during forwarding to limit bandwidth, or the file tracking and tailing on the client, a data stream separate from Splunk forwarding should work well enough. So, if you don't want Splunk to touch the data, is there any reason why another option cannot be considered?

And for the record, Splunk can be configured to send the data uncooked from the universal forwarder to the app, but as you noted, there will still be a limited degree of interaction with the data.
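
For reference, the setting involved is sendCookedData in outputs.conf. A minimal sketch, assuming a hypothetical third-party listener; the target group name, host, and port are made up for illustration:

```ini
# outputs.conf on the forwarder
# (target group name, host, and port are assumptions)
[tcpout:third_party_raw]
server = thirdparty.example.com:5140
# Send the plain data stream without Splunk metadata attached (default is true)
sendCookedData = false
```

The group still has to be referenced from defaultGroup or from a per-input _TCP_ROUTING setting before anything is sent to it. As far as I know the receiver then gets a plain TCP stream, so it has to work out event boundaries, host, and source on its own.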

New Member

I had that thought also, then ran into a snag. We have UFs on Windows devices, and in my understanding that won't work for them, at least not natively, without a third party syslog/syslog-ng client on the Windows node.

Again, we are trying to send Splunk-cooked data to the index tier and raw data to a Cloudera HDFS tier.

Or am I missing something?

Path Finder

Thank you for the links, however I already read through those docs and they didn't really explain what I'm looking for. They explained how the raw data from the data source gets cooked at the UF then processed at the indexer. However, they don't explain how the raw data goes from the UF to the indexer and whether you can turn on or off sending cooked or raw data, or even send cooked to one location and raw to a different location.

My main question really is: do the raw data and the cooked data get sent over in separate streams from the UF to the indexer? From my understanding, the UF will tag the raw data it is receiving and send that event data to the indexer, but the moment that the UF tags the data it is no longer raw data.

Splunk Employee

I believe this answer matches your question:

https://answers.splunk.com/answers/13196/universal-forwarder-sending-cooked-data-to-indexer.html

The universal forwarder does send unparsed data. In this context, "cooked" merely means that blocks of data have been tagged with default fields, such as source, sourcetype and host. Both parsed and unparsed data are considered "cooked":

"Raw" data is totally unprocessed -- no tagging at all.

http://www.splunk.com/base/Documentation/latest/Deploy/Aboutforwardingandreceivingdata#Types_of_data

Path Finder

Thank you for linking me to that question. The link in that thread led me to http://docs.splunk.com/Documentation/Splunk/6.6.0/Forwarding/Typesofforwarders. In that doc, it does appear to explain that raw data gets sent over in a TCP stream untouched and "cooked" data gets sent over separately after it gets tagged. So that answers my main question for the most part.

Does anyone know though if you could send cooked data without sending raw data? Or if you could send cooked data to one source and the raw data to a different source?

Thanks!

Splunk Employee

The Splunk Universal Forwarder will send unparsed data to the indexer, or to multiple destinations for that matter. You can configure this in outputs.conf per sourcetype.
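
One caveat, hedged: as far as I know, selective routing on a universal forwarder is normally done per input stanza with _TCP_ROUTING in inputs.conf, because routing by sourcetype through props.conf and transforms.conf relies on the parsing that a heavy forwarder does. A sketch, assuming a hypothetical monitored file and two target groups that are already defined in outputs.conf:

```ini
# inputs.conf on the universal forwarder
# (file path, sourcetype, and group names are hypothetical)
[monitor:///var/log/app/app.log]
sourcetype = app_log
# Send this input to both target groups defined in outputs.conf
_TCP_ROUTING = indexers, third_party_raw
```

Inputs without a _TCP_ROUTING setting keep going to whatever defaultGroup names in outputs.conf.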

With a heavy forwarder you could achieve this by using a SED script in transforms.conf. At the very bottom of that page there's an example with
[hide-ip-address]

What is the use case you are trying to achieve?

Path Finder

Right, I know the UF will send cooked (unparsed) and raw data to multiple indexers and even third party systems too. I was just curious if you could send the raw data to one server, say a third party server, and the "cooked" data to the indexers. In addition, I was wondering if you could have the option to send "cooked" data without sending the correlating raw data. I'm not trying to produce this situation in my environment, I'm just trying to understand the nitty-gritty of how the UF is working and the possibilities that are capable with it.
