Hi,
We need to export login events from Windows and Linux servers from Splunk Cloud Platform to another system for further analysis. In on-prem deployments, we were able to use a forwarder to export syslog data over a TLS connection. Is this an option also in Splunk Cloud?
We see there's an option to use a REST API to get data from Splunk Cloud, but is it practical for a large, continuous stream of data? We need to get the data within a few seconds, and we are talking about a large number of servers, so we're not sure that polling with the REST API is the way to go.
Alternatively, are there other ways? Maybe cloud native ways like exporting to AWS CloudWatch or Kinesis streams?
Thanks, Gabriel
Hey @gabrielsz ,
If the tools where you want to analyze the data can be integrated directly with Splunk (e.g. PowerBI, Tableau, Exabeam), then I'd suggest exploring those integrations, as they are easy to set up and use. They pull the data through searches that you define, and the tools can run those searches at whatever frequency you need.
If the tool that you're trying to use would require the data being sent over to its database, then you can explore the following options:
1. If you're using syslog to onboard some of the data, then you can add the IP of the tool to have the syslog server forward the data to it as well.
2. For API-based data onboardings, you can write your own scripts to pull the data into your tool, as Splunk add-ons basically work the same way.
3. Since you're using Splunk Cloud, you are presumably using heavy forwarders (HFs) or universal forwarders (UFs) to send data to the indexers, so that individual devices don't all have to connect to the cloud. You can set up outputs.conf on the UFs to route the data to the HFs and store it there. From the HFs, get it to your tool using the tool's agent, an additional output, or a custom script, and set up another script to rotate the stored logs so they don't fill the disk.
4. Raise a support case with Splunk describing your requirements and ask them to forward the logs to your tool. This is a last resort: it has the least probability of working and will add latency.
What tool will you be using for analytics? That should shed a lot of light on the best path for log forwarding.
Hope this helps,
****If you found the answer helpful, kindly consider upvoting/accepting it as the answer as it helps other Splunkers find the solutions to similar issues****
You definitely don't want to pull this data after it has already been indexed.
The way to go is to separate the data before ingestion and send a copy to your external solution.
Just use on-prem forwarder(s) with additional outputs.
Thanks for your answer!
Can you elaborate a bit more about the reasons why pulling the data from Splunk cloud is a bad idea?
Thanks a lot for your detailed answer!
It's actually a tool that we are going to develop, and we are considering our options for collecting the data. Using forwarders is definitely an option, but it requires changes in the on-prem environment, so we wonder if it's possible to integrate directly with Splunk Cloud. I believe this leaves options 2 and 4. How practical is it to be constantly pulling data using the API? It's around 4K events per second.
It's pointless to use up resources to index events only to search for them a few seconds later. Especially if you realize how "heavy" the process of searching is compared to simply relaying events to another output.
You have to send a REST call to the search head. It has to parse your request, order the proper execution steps, sometimes optimize your search a bit, spawn searches to search peers (indexers), make sure they have the current knowledge bundle, maybe synchronize it, poll the search peers for the results, pull back the results, consolidate them, possibly do some more work... The indexers (each of them involved in the search!) have to retrieve the search request, parse it, execute it, verify which buckets can contain the requested events, read from the buckets, apply some additional filtering and transformations, return the results to the requesting search head... A hell of a lot of work to do just to get a simple list of events.
And if you want to spawn such searches every few seconds as you said... Maybe you won't DOS your instance but you'll cause significant load.
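To put rough numbers on that, here is a back-of-the-envelope sketch in Python. The 4K events per second comes from the question above; the 5-second poll interval is only an assumed reading of "every few seconds", and `polling_load` is a hypothetical helper, not part of any Splunk tooling.

```python
# Rough polling-load arithmetic; the poll interval is an assumption.
EVENTS_PER_SECOND = 4_000     # rate quoted earlier in the thread
POLL_INTERVAL_SECONDS = 5     # assumption: "every few seconds" taken as 5 s
SECONDS_PER_DAY = 24 * 60 * 60


def polling_load(events_per_second: int, poll_interval_seconds: int) -> dict:
    """Return how many searches run per day and how many events each must return."""
    return {
        "searches_per_day": SECONDS_PER_DAY // poll_interval_seconds,
        "events_per_search": events_per_second * poll_interval_seconds,
    }


load = polling_load(EVENTS_PER_SECOND, POLL_INTERVAL_SECONDS)
print(load)
```

Under these assumptions every one of those searches goes through the whole search-head and indexer pipeline described above, just to hand back events that a forwarder could have relayed directly.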
Aren't the resources handled by Splunk in the case of Splunk Cloud Platform? Not that we want to spend all of their resources, these are valid points, but isn't it up to them to assign enough resources?
I don't know which license you are on (ingest-based or workload-based) - this might change your point of view "slightly". I'm not sure (as I probably already said - I'm not a cloud user myself) if there are some provisions in the service agreement against deliberately stressing the infrastructure like this (after all, the service costs are calculated according to some standardized usage), but that's not even the point.
The point is that it's simply bad design and bad programming to waste resources when it's completely unnecessary. It's like doing a search over "All time" in Splunk, calculating some stats, and in the end keeping only the last day of the results. Sure, you can do it like that, but any sane user would simply put an "earliest=-1d" at the beginning of the search and have it done in 10s instead of half an hour.
EDIT: Oh, and one more thing, in your "index and search" approach you really don't have a reliable method for making sure that you don't have gaps in your events or duplicated ones because of time range overlaps/gaps.
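The time-range problem can be illustrated with a small Python sketch. It assumes a poller that is meant to run every 5 seconds but drifts; the timestamps and all function names are hypothetical. It compares a naive "last 5 seconds" search window with a window that resumes from a stored checkpoint.

```python
from typing import List, Tuple

# (earliest, latest) in epoch seconds, half-open: earliest <= t < latest
Window = Tuple[int, int]


def relative_windows(poll_times: List[int], lookback: int) -> List[Window]:
    """Naive approach: each poll searches the last `lookback` seconds."""
    return [(t - lookback, t) for t in poll_times]


def checkpointed_windows(poll_times: List[int], start: int) -> List[Window]:
    """Each poll resumes exactly where the previous one stopped."""
    windows = []
    checkpoint = start
    for t in poll_times:
        if t > checkpoint:
            windows.append((checkpoint, t))
            checkpoint = t
    return windows


def coverage_problems(windows: List[Window]) -> Tuple[int, int]:
    """Return (seconds never searched, seconds searched twice) between consecutive windows."""
    gaps = overlaps = 0
    for (_, prev_latest), (next_earliest, _) in zip(windows, windows[1:]):
        if next_earliest > prev_latest:
            gaps += next_earliest - prev_latest
        else:
            overlaps += prev_latest - next_earliest
    return gaps, overlaps


# Polls intended every 5 s, with scheduler drift (hypothetical timestamps).
polls = [5, 10, 17, 20, 25]
naive = coverage_problems(relative_windows(polls, lookback=5))
safe = coverage_problems(checkpointed_windows(polls, start=0))
print("naive:", naive, "checkpointed:", safe)
```

Checkpointing only fixes the window boundaries. Events that reach the index after their window has already been searched would still be missed, which this sketch does not attempt to model.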
@PickleRick, what approach should be taken to export data that lands in Splunk Cloud through modular inputs rather than through a UF/HF?
OK, I understand your points. We'll go with the forwarders approach. Thanks!
Sorry for the delayed response here, but I agree with Rick. Pulling data from Splunk has its drawbacks. The forwarder method also has some important aspects to consider: your queue sizes, ulimits and thruput limits. Kindly consider increasing the following:
1. Sizes of the parsing and tcpout queues, and of the queue forwarding data to your HFs.
2. ulimits on your HFs, as they will be handling a large amount of data going to Splunk and your tools. Look out for errors like "Broken pipe" and "Socket timeout", which usually indicate that the ulimit is being exhausted.
3. Thruput limits. This might take extra bandwidth, but it will ensure that your data gets to your tool on time. Considering you have 4K events per second, I'd suggest increasing the limit to a value that bodes well with your organization's prescribed limits.
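As a rough way to pick a thruput number, the Python sketch below estimates the sustained rate for 4K events per second. The average event size and the headroom factor are assumptions, so measure your own data before relying on the result; `required_kbps` is a hypothetical helper.

```python
import math

EVENTS_PER_SECOND = 4_000   # rate quoted earlier in the thread
AVG_EVENT_BYTES = 500       # assumption: measure your own average event size
HEADROOM = 1.5              # assumption: spare capacity for bursts


def required_kbps(events_per_second: int, avg_event_bytes: int, headroom: float) -> int:
    """Return the sustained rate in KB/s (1 KB = 1024 bytes), rounded up."""
    bytes_per_second = events_per_second * avg_event_bytes * headroom
    return math.ceil(bytes_per_second / 1024)


estimate = required_kbps(EVENTS_PER_SECOND, AVG_EVENT_BYTES, HEADROOM)
print(f"Aggregate estimate: {estimate} KB/s")
```

This is an aggregate figure across all forwarders. It can be compared with the `maxKBps` setting in the `[thruput]` stanza of limits.conf on each forwarder, after dividing by the number of forwarders that share the load.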
Apart from that, kindly consider the following as well:
1. Avoiding the usage of useACK. This can cause sticky memory issues where data gets stuck in memory and is only released on restarting splunkd. Better to avoid it, especially if you have old versions of UFs.
2. Building an alert on "Cooked connections timed out" and on the queue sizes of your HFs. These will proactively let you know when your HFs start to struggle.
3. A script to rotate the logs on disk, to ensure that enough disk space remains free to handle all the operations.
4. Enabling boot-start. Binding splunkd to systemd/init.d always helps to respawn it should something go wrong, and helps you avoid manual intervention as much as possible.
Hope this helps,
Best wishes