How Much Latency Does Hadoop Connect Add?

David
Splunk Employee

I'm trying to figure out how much additional delay Hadoop Connect would add on top of my existing Splunk log latency when getting data into Hadoop. That is, most of my logs are available in Splunk within seconds of being created. How long before they would be available in Hadoop? Minutes, hours, days?

Thank you

rdagan_splunk
Splunk Employee

With Hadoop Connect Export, the minimum export frequency allowed is every 5 minutes.
So at best, a search starts every 5 minutes. As the job runs, Splunk processes chunks of data received from the search and writes compressed files locally on the search head. These files are moved to HDFS when a file reaches 64 MB, when the files cumulatively consume more than 1 GB, or when the search completes successfully.
Therefore, for a short search with few results, I would say you will get a new file in HDFS roughly every 6 minutes. For a larger result set, it will take longer for a file to reach 64 MB and for those 64 MB to be moved into HDFS.
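
To make the arithmetic above concrete, here is a rough Python sketch of the three move-to-HDFS conditions and a back-of-the-envelope delay estimate. This is not Hadoop Connect code: the function names, the 45-second search time, and the 15-second transfer time are assumptions chosen for illustration, and the delay formula is a simplified model (wait for the next scheduled run, then the search, then the copy). Only the 5-minute, 64 MB, and 1 GB figures come from the answer.

```python
# Thresholds taken from the answer above; everything else is hypothetical.
FILE_FLUSH_BYTES = 64 * 1024 * 1024       # a single file reaches 64 MB
TOTAL_FLUSH_BYTES = 1024 * 1024 * 1024    # local files exceed 1 GB in total
MIN_SCHEDULE_SECONDS = 5 * 60             # minimum export frequency


def should_move_to_hdfs(file_bytes, total_local_bytes, search_done):
    """Return True if any of the three conditions described above holds."""
    return (
        file_bytes >= FILE_FLUSH_BYTES
        or total_local_bytes > TOTAL_FLUSH_BYTES
        or search_done
    )


def estimated_delay_seconds(search_seconds, transfer_seconds,
                            schedule_seconds=MIN_SCHEDULE_SECONDS):
    """Rough upper bound on added delay for a small export.

    Assumed model: an event that just missed one run waits a full
    schedule interval, then the search runs, then the file is copied.
    """
    if schedule_seconds < MIN_SCHEDULE_SECONDS:
        raise ValueError("export cannot be scheduled more often than every 5 minutes")
    return schedule_seconds + search_seconds + transfer_seconds


if __name__ == "__main__":
    # Assumed numbers: a 45 s search and a 15 s copy to HDFS.
    delay = estimated_delay_seconds(search_seconds=45, transfer_seconds=15)
    print(f"estimated added delay: {delay / 60:.1f} minutes")
```

With those assumed timings the estimate comes out to about 6 minutes, in line with the figure above. For large exports the search and copy terms dominate, so measure them in your own environment rather than relying on these placeholders.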
