Splunk Enterprise

How to filter logs of applications hosted in the cloud in Splunk Enterprise

Jayanthan
Loves-to-Learn

Hi Everyone,

We are using Splunk Enterprise in our company. We want to ingest logs from applications hosted in the cloud, but when we try to connect we get a lot of logs that are unrelated to our application, which in turn causes high license utilization.

Is there any method to filter for only the logs we want (such as logs from a specific application or log source) before ingesting them into Splunk, so as to reduce license utilization while still getting the required security logs for the application?

 

0 Karma

Prewin27
Contributor

@Jayanthan 

You have multiple options depending on your architecture.
The best approach is always to filter at the source itself, but if that's not possible:

Use props.conf and transforms.conf on Splunk Enterprise to drop events before indexing.
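For example, a minimal sketch (the sourcetype and the regex are placeholders to adjust to your own data), placed on the first full Splunk Enterprise instance the data passes through (heavy forwarder or indexers, not a universal forwarder):

# props.conf
[aws:cloudtrail]
TRANSFORMS-dropunwanted = drop_unrelated_events

# transforms.conf
# any event whose raw text matches REGEX is routed to nullQueue and never indexed
[drop_unrelated_events]
REGEX = name-of-a-service-you-do-not-need
DEST_KEY = queue
FORMAT = nullQueue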

If you're on Splunk 9+, you can use Ingest Actions.
Ref: https://docs.splunk.com/Documentation/Splunk/9.4.2/Data/DataIngest
https://lantern.splunk.com/Splunk_Platform/Product_Tips/Data_Management/Using_ingest_actions_to_filt...

You can also consider using Splunk Edge Processor.
https://help.splunk.com/en/splunk-cloud-platform/process-data-at-the-edge/use-edge-processors/9.3.24...


Regards,
Prewin
Splunk Enthusiast | Always happy to help! If this answer helped you, please consider marking it as the solution or giving a Karma. Thanks!

0 Karma

Jayanthan
Loves-to-Learn

Hi @Prewin27 ,

Thanks for the reply,

The solutions you suggested may help with filtering logs, but they appear to work only on the Splunk Cloud Platform.

I wanted a solution where we can filter the logs from applications hosted on AWS or Azure and ingest them into Splunk Enterprise.


0 Karma

livehybrid
Super Champion

Hi @Jayanthan 

The Ingest Actions and props/transforms options are both suitable for Splunk Enterprise as well as Splunk Cloud. 

The article @PickleRick posted gives a good overview of how to filter data using props/transforms. Ingest Actions gives a more UI-friendly approach to very similar concepts if you are less familiar with props/transforms. Check out https://lantern.splunk.com/Splunk_Platform/Product_Tips/Data_Management/Sampling_data_with_ingest_ac... which is quite a good overview.
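Under the hood, an Ingest Actions ruleset also compiles down to props/transforms, just using RULESET-/INGEST_EVAL stanzas. A hand-written sketch of the same idea (placeholder sourcetype and match string, not UI-generated output):

# props.conf
[aws:cloudtrail]
RULESET-keep_my_app_only = keep_my_app_only

# transforms.conf
# keep events mentioning my-app, drop everything else before indexing
[keep_my_app_only]
INGEST_EVAL = queue=if(match(_raw, "my-app"), "indexQueue", "nullQueue")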

How are you currently getting your data? Is this sent from cloud apps to Splunk via HEC/UF/HF or are you pulling the data in with a specific app like AWS TA?

🌟 Did this answer help you? If so, please consider:

  • Adding karma to show it was useful
  • Marking it as the solution if it resolved your issue
  • Commenting if you need any clarification

Your feedback encourages the volunteers in this community to continue contributing

0 Karma

Jayanthan
Loves-to-Learn

Hi @livehybrid 

I am currently using the "Splunk Add-on for Microsoft Cloud Services" and the "Splunk Add-on for AWS" with HEC to collect logs from the cloud.

0 Karma

PickleRick
SplunkTrust

I'm not familiar with these add-ons, so I'm not sure how your process works. If you're indeed receiving data on the HEC input, it's up to you on the source side to export only a subset of your events. That's usually the most effective approach, because it's better not to send the data at all than to send it, receive it, and then filter it out, wasting resources on events you don't need and don't want.

If you cannot do that, the document I pointed you to describes how you can filter your events.

0 Karma

isoutamo
SplunkTrust

There seem to be some nasty restrictions on this add-on depending on which inputs you are using. Sometimes this means that filtering certain events out of the streams is not as simple as the docs say. Those docs are also not clear enough for this use case (at least for a non-native English speaker like me).

So could you tell us more about your case, so we can better understand your issue? The minimum we need to know is:

  • your environment
    • single node
    • distributed environment, and if so, what kind
  • versions
  • whether your Splunk is in Azure, AWS, or some other cloud
  • one or more tenants
  • which inputs you have configured and how
  • probably something else will be needed later
0 Karma

Jayanthan
Loves-to-Learn

Hi @isoutamo ,

We are using on-premises Splunk Enterprise version 9.4.2 in a distributed environment with a multi-site indexer cluster and a search head cluster.

We are currently ingesting OS, security, and application logs from Windows and Linux servers using universal forwarders.

Some of the company's applications are hosted in AWS and Microsoft Azure, and we wanted to ingest the security logs of those applications to monitor them for cybersecurity purposes.

But when we connected to the cloud using the add-ons, we were getting a lot of unwanted logs, which led to license over-utilization.

When we tried filtering, the large volume of logs and the continuous filtering caused high utilization on our Splunk servers, which slowed down the whole Splunk service.

Hence, I wanted a method to filter out the unwanted logs, or select only the required logs, before they enter the Splunk servers.

Even if the solution is not from Splunk but from AWS or Azure, that would be fine as long as we can send the logs to Splunk.

0 Karma

isoutamo
SplunkTrust

If you are ingesting with a UF, then props and transforms should work just as they do on-prem. You just have to install them on the first full Splunk Enterprise node in the data path.

What is "the add-on" you mention? And is it running on a HF or a UF?

If you have a lot of logs to filter, you probably want to use intermediate heavy forwarders (IHFs) between the UFs and your indexer cluster.

 

0 Karma

Jayanthan
Loves-to-Learn

Hi @isoutamo 

We want to ingest data from applications hosted in the cloud; we can't use a UF or HF since we can't install agents on them.

The add-ons I mentioned are applications available in the Splunk Enterprise UI, similar to "Search and Reporting". They pull data from AWS S3 logging accounts (AWS add-on) and Azure Event Hubs (Azure add-on).

My problem is that these add-ons pull everything available in the logging accounts, which includes logs unrelated to my application.

What I want is to filter the logs so that I ingest, or at least keep, only the logs pertaining to my application.

0 Karma

Prewin27
Contributor

@Jayanthan 

In this case, as mentioned before, you can use Ingest Actions. They allow you to filter, mask, or route events before they are indexed.

For reference, check this out: https://lantern.splunk.com/Splunk_Platform/Product_Tips/Data_Management/Using_ingest_actions_in_Splu...

Go to Settings > Data > Ingest Actions and create a ruleset.

Alternatively, you can use the traditional method on your heavy forwarder by configuring props.conf and transforms.conf.
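For example, a rough sketch of the "keep only my application's events, drop the rest" pattern (the sourcetype and match string are placeholders you would adjust to your data):

# props.conf (on the heavy forwarder that parses the data)
[mscs:azure:eventhub]
TRANSFORMS-keeponly = drop_everything, keep_my_app

# transforms.conf
# first send every event to nullQueue...
[drop_everything]
REGEX = .
DEST_KEY = queue
FORMAT = nullQueue

# ...then route events that mention your application back to the index queue
[keep_my_app]
REGEX = my-application-name
DEST_KEY = queue
FORMAT = indexQueue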

Regards,
Prewin
Splunk Enthusiast | Always happy to help! If this answer helped you, please consider marking it as the solution or giving a Karma. Thanks!

0 Karma

PickleRick
SplunkTrust

There is a whole chapter in the docs on that.

https://help.splunk.com/en/splunk-enterprise/forward-and-process-data/forwarding-and-receiving-data/...

But that's only if you can't configure your inputs (with cloud services I suppose you'd be using pull-mode API inputs or something similar) to give you only a subset of your logs.

0 Karma

Jayanthan
Loves-to-Learn

My problem here is that when I connect applications hosted in the cloud to Splunk Enterprise using add-ons, I get a lot of unwanted logs. Ingesting them shoots up the license utilization, and continuously filtering them increases server overhead and usage, which induces lag in the Splunk system as a whole.

I wanted to know if there is any method to filter the logs coming from the cloud before they are ingested into Splunk, so that Splunk's processing load and license usage will be lower.

0 Karma

livehybrid
Super Champion

Hi @Jayanthan 

There are a number of approaches you could take to do this, such as Edge Processor, Ingest Actions, props/transforms, or segregating at the source.

What tools/apps/processes are you currently using to bring the data into Splunk? The best way to reduce the amount of data ingested into Splunk is to omit it at the source (i.e. not send or pull it in the first place)!
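For example, if you are pulling from S3 with the Splunk Add-on for AWS, you may be able to narrow the pull to just your application's key prefix instead of the whole logging bucket. A rough sketch from memory (the stanza and parameter names here are assumptions, please verify them against the add-on's documentation):

# inputs.conf for the Splunk Add-on for AWS (Generic S3 input)
[aws_s3://my_app_logs]
aws_account = my_aws_account
bucket_name = my-logging-bucket
# only pull objects under this key prefix
key_name = logs/my-app/
sourcetype = aws:s3
index = cloud_security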

Please let us know and we can hopefully drill further into options for you.

🌟 Did this answer help you? If so, please consider:

  • Adding karma to show it was useful
  • Marking it as the solution if it resolved your issue
  • Commenting if you need any clarification

Your feedback encourages the volunteers in this community to continue contributing

0 Karma