I have my Databricks setup in AWS, which runs multiple ETL pipelines. I want to send logs, metrics, application flow tracking, etc. into Splunk, but I'm not sure how this can be achieved. I have my organisation's Splunk setup where I can generate my auth token and can see the endpoint details. Is that enough to push data from Databricks to Splunk, or do I need something like an OpenTelemetry collector that reads the data stored in Databricks at /some/location and pushes it to Splunk?
Hi @sugata
I don't think Databricks has a specific Splunk connector as such, but I did work on sending Databricks' own logs to Splunk in a previous life...
How are you running Databricks? You might find that the easiest way is to run a Splunk Universal Forwarder to send the specific log files from the Databricks worker nodes to your Splunk environment.
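If you also want to push your own application events, and not just the platform log files, the pattern on the Databricks side would be roughly the sketch below: the job writes structured log lines to a local file, and the forwarder is configured with a monitor input on that path. This is illustrative only; the path and field names are examples, not Databricks defaults.

```python
# Illustrative sketch only: the job appends one JSON object per line to a local
# file, and a Universal Forwarder [monitor://] input on that path ships it to Splunk.
# The path and fields below are examples, not anything Databricks-specific.
import json
import logging

LOG_PATH = "/tmp/etl_pipeline.log"  # example path the forwarder would monitor

logger = logging.getLogger("etl")
logger.setLevel(logging.INFO)
handler = logging.FileHandler(LOG_PATH)
handler.setFormatter(logging.Formatter("%(message)s"))  # raw JSON lines, no extra prefix
logger.addHandler(handler)

def log_step(step: str, status: str) -> None:
    """Append one structured event per pipeline step."""
    logger.info(json.dumps({"step": step, "status": status}))

log_step("extract_orders", "success")
```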
There is also the Databricks Add-on for Splunk app on Splunkbase, but it is designed more for running queries against Databricks and/or triggering jobs, although it could also be used to gather telemetry.
Thanks for your reply @livehybrid
The add-on that you mentioned is good for querying Databricks (sending a command TO Databricks), but I am looking for a solution that can send logs FROM Databricks.
Example: I am building a 10-step ETL pipeline in Databricks, which is hosted in AWS. At the end of each step, I need to write a log to Splunk about its success/failure. I have a schema defined for the log. So my question is how to send that event/log into Splunk, which is hosted somewhere else and not in AWS.
I feel Splunk might have some kind of API exposed for that; I just don't know which API, how to call it, how to configure it, what the best practices are, etc.
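For what it's worth, I imagine the call might look something like the sketch below if Splunk's HTTP Event Collector (HEC) turns out to be the right API. The host, token, index and sourcetype are placeholders, and the token would come from a Databricks secret scope in practice rather than being hard-coded.

```python
# Rough sketch, assuming Splunk's HTTP Event Collector (HEC) is the API to use.
# The endpoint, token, index and sourcetype are placeholders; read the token
# from a Databricks secret scope in practice instead of hard-coding it.
import time
import requests

HEC_URL = "https://splunk.example.com:8088/services/collector/event"  # placeholder HEC endpoint
HEC_TOKEN = "00000000-0000-0000-0000-000000000000"                    # placeholder token

def log_step_to_splunk(pipeline: str, step: int, status: str, detail: str = "") -> None:
    """Send one ETL-step event to Splunk over HEC."""
    payload = {
        "time": time.time(),      # event timestamp (epoch seconds)
        "source": pipeline,
        "sourcetype": "_json",    # placeholder sourcetype
        "index": "etl_logs",      # placeholder index
        "event": {                # body follows whatever schema is defined for the log
            "pipeline": pipeline,
            "step": step,
            "status": status,
            "detail": detail,
        },
    }
    resp = requests.post(
        HEC_URL,
        headers={"Authorization": f"Splunk {HEC_TOKEN}"},
        json=payload,
        timeout=10,
    )
    resp.raise_for_status()

# e.g. at the end of step 3 of the 10-step pipeline:
log_step_to_splunk("orders_etl", step=3, status="success", detail="step output loaded")
```

Calling this once per step from the driver would keep things simple and avoid an intermediate collector, as long as the Databricks workspace can reach the Splunk endpoint over the network.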