Getting Data In

Pulling data from a Fluentd plugin into Splunk, how do we transform the data to split it into multiple sourcetypes?

vya9836
New Member

We are pulling data such as Red Hat logs, Apigee, and Ansible from AWS through a Fluentd plugin, which forwards it to our heavy forwarder (HF) in AWS. From there the data goes to a second HF in a DMZ, and then to a third HF outside the DMZ.

The data is passing through and getting indexed, so the firewall rules and ports are set up properly. However, when we try to transform the data to split it into multiple sourcetypes, it does not work: the original sourcetype set by the Fluentd plugin is still applied.

In the Fluentd plugin we define the index name and sourcetype, and the default format is JSON. We are trying to override this index and sourcetype at the destination, to differentiate the types of data with different sourcetypes, by defining inputs.conf, props.conf, and transforms.conf. But the values we define at the destination are not applied; only the values set in the Fluentd plugin config file at the source take effect.

So the question is: can we add props and transforms configuration to the Fluentd plugin in AWS to differentiate the logs by sourcetype? Can anyone suggest a possible solution for this kind of problem?

The Fluentd plugin is k24d/fluent-plugin-splunkapi.
We are using Splunk 6.2.2 on all indexers, forwarders, etc.

Here are the configs that we defined at the destination.
Please help us.

inputs.conf

[splunktcp://1600]
connection_host = ip
sourcetype = journald
index = aws_fluentd_index

props.conf

[source::poc.aws.system.journald]
KV_MODE = json
TIME_PREFIX = ^
TIME_FORMAT = %Y-%m-%d %T %z
SHOULD_LINEMERGE = false
MAX_TIMESTAMP_LOOKAHEAD = 30
NO_BINARY_CHECK = 1
pulldown_type = 1
TRANSFORMS-override = override_ST_journald, override_IDX_journald

transforms.conf

[override_ST_journald]
SOURCE_KEY = _raw
REGEX = .*
FORMAT = sourcetype::journald
DEST_KEY = MetaData:Sourcetype

[override_IDX_journald]
SOURCE_KEY = _raw
REGEX = .*
FORMAT = aws_fluentd_index
DEST_KEY = _MetaData:Index
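
For clarity, what we ultimately want is something like the following, where different events get different sourcetypes based on their content. This is only a sketch of the goal; the regexes and sourcetype names below are made up.

transforms.conf (goal; hypothetical patterns)

[override_ST_apigee]
SOURCE_KEY = _raw
REGEX = apigee
FORMAT = sourcetype::aws:apigee
DEST_KEY = MetaData:Sourcetype

[override_ST_ansible]
SOURCE_KEY = _raw
REGEX = ansible
FORMAT = sourcetype::aws:ansible
DEST_KEY = MetaData:Sourcetype

with the props.conf stanza referencing both:

TRANSFORMS-split = override_ST_apigee, override_ST_ansible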

agup006
Explorer

Hi vya9836,

Thanks for using Fluentd!

The first bit is that the Splunk API plugin you referenced is deprecated; you should switch to sending messages over TCP or through the Splunk HTTP Event Collector. Additionally, I see that your configuration for translating and parsing data is being done on the Splunk indexer side. I would recommend moving those configurations over to Fluentd to distribute that compute to the endpoints so Splunk can focus on search. Fluentd can do most of the common parsing on the node side, including nginx, apache2, syslog (RFC 3164 and RFC 5424), etc.
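
As a rough illustration of parsing on the Fluentd side (the tag pattern and field name here are placeholders, and the exact syntax depends on your Fluentd version):

<filter aws.system.**>
  @type parser       # filter_parser plugin: parse in Fluentd, not in Splunk
  key_name message   # field holding the raw log line (placeholder name)
  <parse>
    @type json       # or apache2, nginx, syslog, etc.
  </parse>
</filter>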

Additionally, if you are interested in the Fluentd Enterprise Splunk TCP and HTTP Event Collector plugin, and in help optimizing parsing and transformation logic, you can email me at A at TreasureData dot com. More info at https://fluentd.treasuredata.com

Thanks,
Anurag


somesoni2
SplunkTrust

Apart from the location of the props and transforms (which should be on the HF in your case), is the source of the data really poc.aws.system.journald?


vya9836
New Member

Yes, the source is the same as mentioned.


somesoni2
SplunkTrust

Can you confirm the location of props.conf and transforms.conf: are they on the heavy forwarder or on the indexers?


vya9836
New Member

On the destination heavy forwarder and also on the indexers.


acharlieh
Influencer

You have 3 HFs before your indexer? Which one do you have props and transforms on? If you want to apply props and transforms on the middle or last one, I think you need to override the route attribute in your splunktcp stanza in inputs.conf to send the data that was already parsed by the previous HF back through the typingQueue and pipeline instead of skipping straight through to the indexQueue. I haven't done this personally, so YMMV. I'm also assuming, of course, that your source matches the source set by the plugin. For some reference links: http://docs.splunk.com/Documentation/Splunk/6.4.0/admin/Inputsconf and https://wiki.splunk.com/Community:HowIndexingWorks
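
An untested sketch of what that might look like on the re-parsing HF. The route string below is the documented default from inputs.conf.spec, with the has_key:_linebreaker target changed from indexQueue to typingQueue so already-cooked events go back through the typing pipeline (where TRANSFORMS are applied):

[splunktcp://1600]
route = has_key:_replicationBucketUUID:replicationQueue;has_key:_dstrx:typingQueue;has_key:_linebreaker:typingQueue;absent_key:_linebreaker:parsingQueue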


vya9836
New Member

Which HF are you suggesting to override the route attribute in? The middle one?


acharlieh
Influencer

Whichever one you want to do the re-parsing on. It's tricky, as you may impact other data being forwarded over the same port and re-apply props & transforms to data you didn't expect to; but if the middle one is dedicated to gathering this data, then definitely. (Of course, the easier way would be if the origin HF could just set the index and sourcetype appropriately at input time, but I'm not familiar with the plugin you're using.)
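
For example, since you mentioned the plugin config already sets index and sourcetype, separate <match> blocks per log type might work on the Fluentd side. The parameter name below is a guess based on your description of the splunkapi plugin, so check its docs:

<match aws.apigee.**>
  @type splunkapi
  sourcetype aws:apigee   # guessed parameter name; verify against the plugin
  # host, port, index, etc. as in your existing config
</match>

<match aws.ansible.**>
  @type splunkapi
  sourcetype aws:ansible
  # host, port, index, etc. as in your existing config
</match>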
