Getting Data In

Pulling data from a Fluentd plugin into Splunk, how do we transform the data to split it into multiple sourcetypes?

vya9836
New Member

We are pulling data such as Red Hat logs, Apigee, and Ansible from AWS through a Fluentd plugin, which forwards it to our heavy forwarder (HF) in AWS. From there the data goes to a second HF in a DMZ, and then to a third HF outside the DMZ.

The data is passing through and getting indexed, so the firewall rules and ports are set up properly. However, when we try to transform the data to split it into multiple sourcetypes, it does not work: the original sourcetype set by the Fluentd plugin is still applied.

In the Fluentd plugin we define the index name and sourcetype, and the default format is JSON. We are trying to override this index and sourcetype at the destination, to differentiate the types of data with different sourcetypes, by defining inputs.conf, props.conf, and transforms.conf. But the values we define at the destination are not applied; only the values set in the Fluentd plugin config file at the source take effect.

So the question is: can we add props and transforms configuration to the Fluentd plugin in AWS to differentiate the logs by sourcetype? Can anyone suggest a possible solution for this kind of problem?

The Fluentd plugin is k24d/fluent-plugin-splunkapi.
We are using Splunk 6.2.2 on all indexers, forwarders, etc.

Here are the configs that we defined at the destination.
Please help us.

inputs.conf

[splunktcp://1600]
connection_host = ip
sourcetype = journald
index = aws_fluentd_index

props.conf

[source::poc.aws.system.journald]
KV_MODE = json
TIME_PREFIX = ^
TIME_FORMAT = %Y-%m-%d %T %z
SHOULD_LINEMERGE = false
MAX_TIMESTAMP_LOOKAHEAD = 30
NO_BINARY_CHECK = 1
pulldown_type = 1
TRANSFORMS-override = override_ST_journald, override_IDX_journald

transforms.conf

[override_ST_journald]
SOURCE_KEY = _raw
REGEX = .*
FORMAT = sourcetype::journald
DEST_KEY = MetaData:Sourcetype

[override_IDX_journald]
SOURCE_KEY = _raw
REGEX = .*
FORMAT = aws_fluentd_index
DEST_KEY = _MetaData:Index
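
For clarity, what we ultimately want is something like the following, where different events get different sourcetypes based on their content. This is only a sketch of the goal; the regexes and sourcetype names below are made up.

transforms.conf (goal; hypothetical patterns)

[override_ST_apigee]
SOURCE_KEY = _raw
REGEX = apigee
FORMAT = sourcetype::aws:apigee
DEST_KEY = MetaData:Sourcetype

[override_ST_ansible]
SOURCE_KEY = _raw
REGEX = ansible
FORMAT = sourcetype::aws:ansible
DEST_KEY = MetaData:Sourcetype

with the props.conf stanza referencing both:

TRANSFORMS-split = override_ST_apigee, override_ST_ansible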

agup006
Explorer

Hi vya9836,

Thanks for using Fluentd!

The first bit is that the Splunk API plugin you referenced is deprecated; you should switch to sending messages over TCP or through the Splunk HTTP Event Collector. Additionally, I see that your configuration for translating and parsing data is being done on the Splunk indexer side. I would recommend moving those configurations over to Fluentd to distribute that compute to the endpoints so Splunk can focus on search. Fluentd can do most of the common parsing on the node side, including nginx, apache2, syslog (RFC 3164 and RFC 5424), etc.
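
As a rough illustration of parsing on the Fluentd side (the tag pattern and field name here are placeholders, and the exact syntax depends on your Fluentd version):

<filter aws.system.**>
  @type parser       # filter_parser plugin: parse in Fluentd, not in Splunk
  key_name message   # field holding the raw log line (placeholder name)
  <parse>
    @type json       # or apache2, nginx, syslog, etc.
  </parse>
</filter>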

Additionally, if you are interested in the Fluentd Enterprise Splunk TCP and HTTP Event Collector plugin, and in help optimizing parsing and transformation logic, you can email me at A at TreasureData dot com. More info at https://fluentd.treasuredata.com

Thanks,
Anurag


somesoni2
SplunkTrust

Apart from the location of the props and transforms (which should be on the HF in your case), is the source of the data really poc.aws.system.journald?


vya9836
New Member

Yes, the source is the same as mentioned.


somesoni2
SplunkTrust

Can you confirm the location of props.conf and transforms.conf: are they on the heavy forwarder or on the indexers?


vya9836
New Member

On the destination heavy forwarder and also on the indexers.


acharlieh
Influencer

You have 3 HFs before your indexer? Which one do you have props and transforms on? If you want to apply props and transforms on the middle or last one, I think you need to override the route attribute in your splunktcp stanza in inputs.conf to send the data that was already parsed by the previous HF back through the typingQueue and pipeline instead of skipping straight through to the indexQueue. I haven't done this personally, so YMMV. I'm also assuming, of course, that your source matches the source set by the plugin. For some reference links: http://docs.splunk.com/Documentation/Splunk/6.4.0/admin/Inputsconf and https://wiki.splunk.com/Community:HowIndexingWorks
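
An untested sketch of what that might look like on the re-parsing HF. The route string below is the documented default from inputs.conf.spec, with the has_key:_linebreaker target changed from indexQueue to typingQueue so already-cooked events go back through the typing pipeline (where TRANSFORMS are applied):

[splunktcp://1600]
route = has_key:_replicationBucketUUID:replicationQueue;has_key:_dstrx:typingQueue;has_key:_linebreaker:typingQueue;absent_key:_linebreaker:parsingQueue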


vya9836
New Member

Which HF are you suggesting to override the route attribute in? The middle one?


acharlieh
Influencer

Whichever one you want to do the re-parsing on. It's tricky, as you may impact other data being forwarded over the same port and re-apply props & transforms to data you didn't expect to; but if the middle one is dedicated to gathering this data, then definitely. (Of course, the easier way would be if the origin HF could just set the index and sourcetype appropriately at input time, but I'm not familiar with the plugin you're using.)
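
For example, since you mentioned the plugin config already sets index and sourcetype, separate <match> blocks per log type might work on the Fluentd side. The parameter name below is a guess based on your description of the splunkapi plugin, so check its docs:

<match aws.apigee.**>
  @type splunkapi
  sourcetype aws:apigee   # guessed parameter name; verify against the plugin
  # host, port, index, etc. as in your existing config
</match>

<match aws.ansible.**>
  @type splunkapi
  sourcetype aws:ansible
  # host, port, index, etc. as in your existing config
</match>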
