Hello everyone,
I hope you’re doing well.
I need assistance with integrating Splunk with Elasticsearch. My goal is to pull data from Elasticsearch and send it to Splunk for analysis. I have a few questions on how to achieve this effectively:
1. **Integration Methods:** Are there recommended methods for integrating Splunk with Elasticsearch?
2. **Tools and Add-ons:** What tools or add-ons can be used to facilitate this integration?
3. **Setup and Configuration:** Are there specific steps or guidelines to follow for setting up this integration correctly?
4. **Examples and Guidance:** Could you provide any examples or guidance on how to configure Splunk to pull data from Elasticsearch?
Any help or useful resources would be greatly appreciated.
Thank you in advance for your time and assistance!
Hi @tuts,
Use Elasticsearch Data Integrator - Module Input if your requirements match the following:
The add-on uses the Python Elasticsearch client search() method, which wraps the Elasticsearch Search API.
The add-on will search for all documents in the configured index list with configured date field values greater than or equal to now minus the configured offset and less than or equal to now.
E.g. given an index list of logs-*,metrics-*, a date field of @timestamp, and an offset of -24h, the add-on will retrieve documents in pages of 1,000:
GET /logs-*,metrics-*/_search?from=0&size=1000
{
  "query": {
    "bool": {
      "must": [
        {
          "range": {
            "@timestamp": {
              "gte": "now-24h",
              "lte": "now"
            }
          }
        }
      ]
    }
  }
}
By default, Elasticsearch limits paging with the from and size parameters to 10,000 results (10 pages of 1,000 documents); the limit comes from the index.max_result_window setting.
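To make the paging behaviour concrete, here is a minimal Python sketch of the loop described above. It is not the add-on's actual code. The `search` argument is a hypothetical callable standing in for a thin wrapper around the client's search() method, and the names `build_query`, `iter_pages`, `MAX_RESULT_WINDOW` and `PAGE_SIZE` are mine, chosen for illustration.

```python
from typing import Any, Callable, Dict, Iterator, List

MAX_RESULT_WINDOW = 10_000  # Elasticsearch default for from + size paging
PAGE_SIZE = 1_000           # page size described in this thread


def build_query(date_field: str, offset: str) -> Dict[str, Any]:
    """Build the range query shown above; offset is a string such as '-24h'."""
    return {
        "query": {
            "bool": {
                "must": [
                    {"range": {date_field: {"gte": f"now{offset}", "lte": "now"}}}
                ]
            }
        }
    }


def iter_pages(
    search: Callable[..., Dict[str, Any]],  # hypothetical stand-in for the client call
    indices: List[str],
    date_field: str,
    offset: str,
) -> Iterator[List[Dict[str, Any]]]:
    """Yield pages of hits until the results run out or the 10,000 cap is hit."""
    body = build_query(date_field, offset)
    for start in range(0, MAX_RESULT_WINDOW, PAGE_SIZE):
        response = search(
            index=",".join(indices), body=body, from_=start, size=PAGE_SIZE
        )
        hits = response["hits"]["hits"]
        if not hits:
            break
        yield hits
        if len(hits) < PAGE_SIZE:
            break  # a short page means there is nothing left to fetch
```

Note that anything past the tenth page is silently out of reach with this approach, which is why the interval and offset need to be sized so that each run stays under 10,000 documents.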
If you need to retrieve more documents per interval or need more control over how search results are presented prior to entering the Splunk ingest pipeline, you should evaluate REST API Module Input or similar solutions. You might also consider writing your own modular input or scripted input.
A custom solution would allow you to control the query language (Query DSL, ES|QL, SQL, etc.), scrolling, checkpointing, etc.
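As a rough idea of what the scrolling and checkpointing part of a custom input might look like, here is a sketch that uses the Search API's search_after parameter instead of from and size, and stores the last seen sort values in a file. Everything here is an assumption for illustration: the `search` callable is a hypothetical wrapper that takes a request body and returns the response dict, the checkpoint file location is up to you, and the sort needs a tiebreaker field that is unique per document (the `event_id` field in the usage note is made up).

```python
import json
from pathlib import Path
from typing import Any, Callable, Dict, Iterator, List, Optional


def load_checkpoint(path: Path) -> Optional[List[Any]]:
    """Return the sort values saved by the previous run, or None on first run."""
    if path.exists():
        return json.loads(path.read_text())
    return None


def save_checkpoint(path: Path, sort_values: List[Any]) -> None:
    """Persist the sort values of the last document handed to the caller."""
    path.write_text(json.dumps(sort_values))


def iter_new_documents(
    search: Callable[[Dict[str, Any]], Dict[str, Any]],  # hypothetical wrapper
    sort: List[Dict[str, str]],
    checkpoint_path: Path,
    page_size: int = 1000,
) -> Iterator[Dict[str, Any]]:
    """Yield documents sorted after the checkpoint, one page at a time."""
    after = load_checkpoint(checkpoint_path)
    while True:
        body: Dict[str, Any] = {
            "size": page_size,
            "sort": sort,
            "query": {"match_all": {}},
        }
        if after is not None:
            body["search_after"] = after
        hits = search(body)["hits"]["hits"]
        if not hits:
            return
        for hit in hits:
            yield hit
        # Each hit carries its sort values; the last one becomes the cursor.
        # The checkpoint is written only after the whole page was consumed,
        # so an interrupted run re-reads that page (at-least-once delivery).
        after = hits[-1]["sort"]
        save_checkpoint(checkpoint_path, after)
```

Usage would be along the lines of `iter_new_documents(my_search, [{"@timestamp": "asc"}, {"event_id": "asc"}], Path("checkpoint.json"))`, with each yielded hit written to Splunk by whatever mechanism your input uses. Because it does not rely on from and size, this pattern is not bound by the 10,000 result window, but late-arriving documents with older timestamps would be missed, so treat it as a starting point only.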
If you have more specific questions, members of the community like me with experience in both Splunk and Elasticsearch can assist.
There are at least two separate apps for "integration" with ES (haven't used either so can't help much in terms of reviewing them). But the question (not necessarily for answering here, just food for thought) is what do you really wanna do. Because in terms of a high-level overview you have two options:
1. Simply pull the data from ES, ingest it into Splunk and work with it as any other Splunk-indexed data. This has two drawbacks - you're getting data already pre-processed by ES and it might be in a completely different format than Splunk native addons for your source types would expect. And of course you're wasting resources (most notably storage).
2. Try to search data from your ES cluster and only do "post-processing" in Splunk. While this might work (I suppose those apps on Splunkbase aim at it) you're not using Splunk's abilities to the fullest - most importantly you're not using Splunk's map-reduce processing splitting the workload and parallelizing it if possible. So while it might be possible with one or both of those apps just as you can query a SQL database using dbconnect it is probably not something I'd do on big datasets.