
How to configure Heavy Forwarder to pull data from AWS SQS and then send it to Indexer cluster

New Member

I am looking for a solution for my current environment:

- Our data resides on AWS S3. It comes from various sources, and we collect it in AWS S3 buckets.

- We are planning to install a Heavy Forwarder (HF) in the same AWS account where the S3 data lives. The data should be ingested from S3 into the HF, and then forwarded from the HF to the indexer cluster.

- Since the data comes from many different sources (Cylance, FireEye, etc.), do we need to install the individual Splunk apps or add-ons for these data types on the HF? Since a couple of these apps require ingesting data directly from the source device, it seems we cannot use them for our purpose.

My question is: should we ingest data directly from S3 into the HF and then forward it from the HF to the indexer cluster?

Here is a flow to show end to end picture:

AWS S3 (Data from sources) ->> AWS SQS ->> HF (with Splunk App for AWS to pull data from AWS SQS) ->> Indexer cluster

 

Thanks.


Re: How to configure Heavy Forwarder to pull data from AWS SQS and then send it to Indexer cluster

SplunkTrust
Parsing of the events will be done by the HF so the add-ons should be installed there. The add-ons that have inputs should have those inputs disabled since you already have the data in hand. All you really need are the props and transforms.
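
As a rough sketch of what that looks like in practice, an add-on's shipped input can be switched off with a local override while its props.conf and transforms.conf stay active. The app name `TA-example` and the script path are hypothetical; use the stanza headers that the add-on's own default/inputs.conf actually defines.

```ini
# $SPLUNK_HOME/etc/apps/TA-example/local/inputs.conf
# Hypothetical stanza: copy the real stanza header from the add-on's default/inputs.conf.
[script://$SPLUNK_HOME/etc/apps/TA-example/bin/collect.py]
disabled = 1
```

Putting the override in `local/` rather than editing `default/` means an add-on upgrade should not re-enable the input.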
---
If this reply helps you, an upvote would be appreciated.

Re: How to configure Heavy Forwarder to pull data from AWS SQS and then send it to Indexer cluster

New Member

Thank you for your response. The reason for placing an HF between AWS S3 and the indexer cluster is that we don't want the indexer cluster to pull data at all, so that none of its CPU or memory is spent on collection. The HF is there only to pull data from S3 and forward it to the indexer cluster. The indexer cluster is in a different AWS account, and we have access to its endpoints. We want indexing and everything else to happen on the indexer cluster. For this reason we don't want to install any log-type-specific (e.g. Cylance, AMP) add-ons/apps on the HF.

My other question: since log-specific apps usually require direct ingestion from the source device, we cannot use them. So can we rely on default indexing, which produces the Selected fields and Interesting fields from the data ingested into the indexer cluster?


Re: How to configure Heavy Forwarder to pull data from AWS SQS and then send it to Indexer cluster

SplunkTrust

You may want "indexing and all" to happen on the cluster, but it doesn't work that way.  Heavy forwarders are indexers that don't store data.  They will do all the work of indexing (parsing, typing, etc.) and then send the results to the cluster to be stored.  That cannot be changed and is why any apps that assist with parsing must be installed on the HF.
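
For reference, forwarding from the HF to the cluster is configured in outputs.conf. This is a minimal sketch, assuming two indexers at the hypothetical hostnames shown and listening on port 9997 (the conventional receiving port, not something stated in this thread):

```ini
# $SPLUNK_HOME/etc/system/local/outputs.conf on the HF
[tcpout]
defaultGroup = cluster_indexers

[tcpout:cluster_indexers]
# Hypothetical indexer endpoints; the forwarder load-balances across the list.
server = idx1.example.com:9997, idx2.example.com:9997
# Optional: wait for indexer acknowledgement to reduce the chance of data loss.
useACK = true
```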

Another option is to replace the HF with a Universal Forwarder (UF).  UFs do virtually no parsing themselves so all of the work will be done by the indexer cluster.

Since you already have your data in S3, the only parts of the apps you need are the ones that don't input data (no inputs.conf, nothing in /bin).

You can use default indexing if you wish, but you may be disappointed in the results.  Apps often are created to enhance default indexing.


Re: How to configure Heavy Forwarder to pull data from AWS SQS and then send it to Indexer cluster

Thanks. Looks like we have to go for UFs as per your suggestion.

Could you please elaborate on the following comment?

"Since you already have your data in S3, the only parts of the apps you need are the ones that don't input data (no inputs.conf, nothing in /bin)."

Could you also point us to documentation that covers this in more detail? As new Splunk users, we don't fully understand it. If I understand correctly, you are saying that we don't need to configure these log-specific apps to pull data directly from the devices, which is what most of the apps expect.

 

Thanks again.


Re: How to configure Heavy Forwarder to pull data from AWS SQS and then send it to Indexer cluster

SplunkTrust

Apps and add-ons have three main functions - bringing data in, processing it, and displaying it.  Not all apps do all three.  It's not necessary to use an app/add-on for all of its capabilities, either.

In fact, it's standard (and necessary) practice to disable parts of an app depending on the instance type on which it is installed.  See https://docs.splunk.com/Documentation/AddOns/released/Overview/Distributedinstall

An add-on that uses API calls to ingest data likely will also have props.conf settings that process the data.  Your data is already in S3 so you don't need the API calls.  But you do need the props.conf file for field extractions and other needs.
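
To give a feel for the kind of props.conf content that matters here, the sketch below shows index-time parsing settings for one sourcetype. The sourcetype name `vendor:example` and the timestamp format are assumptions for illustration only; a real add-on ships its own stanzas, which should be used instead.

```ini
# props.conf (sketch) on the instance that parses the data, i.e. the HF
# "vendor:example" is a hypothetical sourcetype name.
[vendor:example]
# Treat each line as one event instead of merging lines.
SHOULD_LINEMERGE = false
LINE_BREAKER = ([\r\n]+)
# Assumed timestamp layout at the start of each event, e.g. 2020-01-31T12:00:00+0000
TIME_PREFIX = ^
TIME_FORMAT = %Y-%m-%dT%H:%M:%S%z
MAX_TIMESTAMP_LOOKAHEAD = 32
# Maximum event length in bytes before truncation.
TRUNCATE = 10000
```

Settings like these take effect where parsing happens, while search-time extractions (for example `EXTRACT-` settings) are applied on the search head.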


Re: How to configure Heavy Forwarder to pull data from AWS SQS and then send it to Indexer cluster

Thanks.

One more question: since we can't use Heavy Forwarders for our use case, as discussed above, it seems we can't use Universal Forwarders either, because we have to pick up the data/logs from AWS S3 buckets and ingest them into our indexer cluster. Could you please suggest another way to satisfy our requirements?

In a nutshell, we want to fetch data from S3, send it to an intermediate forwarder (Heavy or Universal), and have that forwarder send the same data on to the indexer cluster. In other words, is there a way to disable indexing at the Heavy Forwarder level? If the answer is no, what options do we have for this use case?

 

Thanks again.


Re: How to configure Heavy Forwarder to pull data from AWS SQS and then send it to Indexer cluster

SplunkTrust

How is the data being read from S3?  The answer will determine which forwarder can be used.

There is no way to prevent a HF from parsing data.


Re: How to configure Heavy Forwarder to pull data from AWS SQS and then send it to Indexer cluster

We were planning to use the Splunk AWS app to read data from S3. It now seems we can't install this app on a Universal Forwarder and must use a Heavy Forwarder (HF). But because the HF performs parsing and indexing, using it may cause unnecessary processing before the data is forwarded to the indexer cluster. We just want to use the HF as a forwarder and do all of the indexing, parsing, etc. at the indexer cluster level.
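
For context, the input we have in mind would be an SQS-based S3 input defined in inputs.conf on the HF. The sketch below is an assumption based on our reading of the Splunk Add-on for AWS: the stanza name, account name, queue URL, region, sourcetype, and index are all placeholders, and the setting names should be verified against the add-on's documentation before use.

```ini
# Splunk_TA_aws/local/inputs.conf on the HF (sketch; verify setting names against the add-on docs)
# All values below are placeholders.
[aws_sqs_based_s3://example_s3_via_sqs]
aws_account = example_account
sqs_queue_region = us-east-1
sqs_queue_url = https://sqs.us-east-1.amazonaws.com/123456789012/example-queue
# Decoder for the S3 objects; CustomLogs is assumed here for generic log files.
s3_file_decoder = CustomLogs
sourcetype = vendor:example
index = main
interval = 300
```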
