Hi Everyone,
We are using Splunk Enterprise in our company. We want to ingest logs from applications hosted in the cloud, but when we connect we get a lot of logs that are unrelated to our application, which in turn causes high license utilization.
Is there any method to filter for only the logs we want (such as logs from a specific application or log source) before ingesting into Splunk, so as to reduce license utilization while still getting the required security logs for the application?
You have multiple options depending on your architecture.
The best approach is always to filter at the source itself, but if that's not possible:
Use props.conf and transforms.conf on Splunk Enterprise to drop events before indexing.
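As an illustration, here is a minimal props.conf/transforms.conf sketch that routes unwanted events to the nullQueue, so they are discarded before indexing and don't count against the license. The sourcetype and regex below are placeholders; adjust them to your data. These files go on the first full Splunk Enterprise instance in the path (indexer or heavy forwarder), not on a universal forwarder.

```
# props.conf -- apply the transform to the noisy sourcetype
# (aws:cloudtrail is just an example)
[aws:cloudtrail]
TRANSFORMS-drop_noise = drop_readonly_events

# transforms.conf -- send matching events to the nullQueue (discard)
# the REGEX here is illustrative; match whatever you consider noise
[drop_readonly_events]
REGEX = "eventName"\s*:\s*"(Describe|List|Get)
DEST_KEY = queue
FORMAT = nullQueue
```

Restart (or reload) the instance after deploying the change for it to take effect.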
If you're on Splunk 9+, you can use Ingest Actions
Ref: https://docs.splunk.com/Documentation/Splunk/9.4.2/Data/DataIngest
https://lantern.splunk.com/Splunk_Platform/Product_Tips/Data_Management/Using_ingest_actions_to_filt...
You can also consider using Splunk Edge Processor
https://help.splunk.com/en/splunk-cloud-platform/process-data-at-the-edge/use-edge-processors/9.3.24...
Regards,
Prewin
Splunk Enthusiast | Always happy to help! If this answer helped you, please consider marking it as the solution or giving a Karma. Thanks!
Hi @Prewin27 ,
Thanks for the reply,
The solutions you suggested may help in filtering logs, but they work only on the Splunk Cloud Platform.
I want a solution where we can filter logs from applications hosted on AWS or Azure and ingest them into Splunk Enterprise.
Hi @Jayanthan
The Ingest Actions and props/transforms options are both suitable for Splunk Enterprise as well as Splunk Cloud.
The article @PickleRick posted gives a good overview of how to filter data using props/transforms. Ingest Actions offers a more UI-friendly approach to very similar concepts if you are less familiar with props/transforms. Check out https://lantern.splunk.com/Splunk_Platform/Product_Tips/Data_Management/Sampling_data_with_ingest_ac... which is quite a good overview.
How are you currently getting your data? Is this sent from cloud apps to Splunk via HEC/UF/HF or are you pulling the data in with a specific app like AWS TA?
🌟 Did this answer help you? If so, please consider:
Your feedback encourages the volunteers in this community to continue contributing
Hi @livehybrid
I am currently using the "Splunk Add-on for Microsoft Cloud Services" and the "Splunk Add-on for AWS" with HEC to collect logs from the cloud.
I'm not familiar with these add-ons so I'm not sure how your process works. If you're indeed receiving data on the HEC input, it's up to you on the source side to export only a subset of your events. That's usually the most effective way, because it's better not to send the data at all than to send it, receive it, and then filter it out, wasting resources on stuff you don't need and don't want.
If you cannot do that, the document I pointed you to describes how you can filter your events.
There seem to be some nasty restrictions on these add-ons depending on which inputs you are using. Sometimes this means that filtering some events out of the streams is not as simple as the docs suggest. Also, those docs are not clear enough for this use case (at least for a non-native English speaker like me).
So could you tell us more about your case, so we can better understand your issue? The minimum we need to know is
Hi @isoutamo ,
we are using an on-premise Splunk Enterprise version 9.4.2 in a distributed environment with a multi-site indexer cluster and a search head cluster.
We are currently ingesting OS logs, security logs, and application logs from Windows and Linux servers using universal forwarders.
Some of the company's applications are hosted in AWS and Microsoft Azure, and we want to ingest the security logs of those applications to monitor them for cybersecurity purposes.
But when we connected to the cloud using the add-ons, we were getting a lot of unwanted logs, which led to license over-utilization.
When we tried filtering, the large volume of logs and the continuous filtering put high utilization on our Splunk servers, which slowed down the whole Splunk service.
Hence, I want a method to filter out the unwanted logs, or select only the required logs, before they enter the Splunk servers.
Even if the solution comes not from Splunk but from AWS or Azure, that would be fine as long as we can send the logs to Splunk.
If you are ingesting with a UF, then props and transforms should work just as on-prem. You just need to install them on the first full Splunk Enterprise node in the data path.
What is "the add-on" here? And is it running on a HF or a UF?
If you have a lot of logs to filter, you probably want to use intermediate heavy forwarders (IHFs) between the UFs and your indexer cluster.
Hi @isoutamo
We want to ingest data from applications hosted in the cloud; we can't use a UF or HF since we can't install agents on them.
The "add-ons" mentioned are applications available in the Splunk Enterprise UI, similar to "Search and Reporting". They help pull data from AWS S3 logging accounts (AWS Add-on) and Azure Event Hubs (Azure Add-on).
My problem is that these add-ons pull everything available in the logging accounts, including logs unrelated to my application.
What I want is to ingest only the logs pertaining to my application, or at least filter out the rest.
In this case, as mentioned before, you can use Ingest Actions. It allows you to filter, mask, or route events before they are indexed.
For reference, check this out: https://lantern.splunk.com/Splunk_Platform/Product_Tips/Data_Management/Using_ingest_actions_in_Splu...
Go to Settings > Data > Ingest Actions and create a ruleset.
Alternatively, you can use the traditional method on your Heavy Forwarder by configuring props.conf and transforms.conf.
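When only a small fraction of the incoming data is wanted, a common props/transforms pattern on the Heavy Forwarder is the inverse of dropping: send everything to the nullQueue by default, then route back only the events that match your application. A minimal sketch, assuming an Azure Event Hub sourcetype; the sourcetype, transform names, and regexes are placeholders for your own data:

```
# props.conf -- transforms are applied in the order listed,
# so the "keep" rule can override the "drop everything" rule
[mscs:azure:eventhub]
TRANSFORMS-filter_myapp = drop_all, keep_myapp

# transforms.conf
# drop every event by default
[drop_all]
REGEX = .
DEST_KEY = queue
FORMAT = nullQueue

# route events mentioning our resource group back to the index queue
# (the resourceId pattern is hypothetical; match your own application)
[keep_myapp]
REGEX = resourceId[^,]*MY-APP-RESOURCE-GROUP
DEST_KEY = queue
FORMAT = indexQueue
```

This keeps the whitelist logic in one place and avoids having to enumerate every kind of noise you want to exclude.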
Regards,
Prewin
There is a whole chapter in docs on that.
But that's only if you can't configure your inputs (and with cloud services I suppose you'd be using a pull-mode API inputs or something similar) to only give you a subset of your logs.
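For example, with the AWS Add-on's generic S3 input you can narrow what gets pulled by pointing the input at a key prefix, so only objects under your application's path are fetched in the first place. A sketch of inputs.conf, where the account, bucket, prefix, and index names are all illustrative placeholders:

```
# inputs.conf (Splunk Add-on for AWS, generic S3 input)
# stanza name and values are illustrative placeholders
[aws_s3://myapp_cloudtrail]
aws_account = my_aws_account
bucket_name = my-logging-bucket
# key_name acts as a prefix filter: only objects under this path are collected
key_name = AWSLogs/123456789012/CloudTrail/
sourcetype = aws:cloudtrail
index = aws_security
```

Filtering at the prefix level like this avoids pulling (and then discarding) data you never wanted.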
My problem here is that when I connect applications hosted in the cloud to Splunk Enterprise using add-ons, I get a lot of unwanted logs. Ingesting them shoots up the license utilization, and continuously filtering them increases server overhead and usage, which induces lag in the Splunk system as a whole.
I want to know if there is any method to filter the logs coming from the cloud before they are ingested into Splunk, so that Splunk's processing load and license usage will be lower.
Hi @Jayanthan
There are a number of approaches you could take to do this such as Edge Processor, Ingest Actions, props/transforms or segregating at source.
What tools/apps/processes are you currently using to bring the data into Splunk? The optimal way to reduce the amount of data ingested into Splunk is to omit it at the source (i.e. not send/pull it)!
Please let us know and we can hopefully drill further into options for you.