Hi All,
I need help with data ingestion to an indexer cluster built on AWS Linux.
Cluster config: 1 master, 1 search head, 2 indexers, 1 universal forwarder.
I first ingested a single file into the main index, but I'm unable to ingest into a newly created index.
There could be a number of things wrong here, and we'll need more information to determine the cause. Here are some steps that I recommend trying.
Let's start in order from the file to the indexer:
1. Confirm the permissions on the file are configured to allow Splunk to read the file
2. Confirm the file contains new data that has not already been read in by Splunk. If you're trying to ingest the same file twice, Splunk won't read it again because it's already marked as read in the fishbucket
3. If this is just test data, add a new line/new event to the file to be read in
4. Use btprobe to remove the file from the fishbucket so Splunk will read it in again: ./btprobe -d $SPLUNK_HOME/var/lib/splunk/fishbucket/splunk_private_db --file /home/ec2-user/Test --reset
5. Double-check your monitor stanza; keep in mind that it is case-sensitive
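The first four steps can be sketched from the shell. The block below exercises them against a throwaway temp file, since the path in this thread (/home/ec2-user/Test) is specific to your host; the btprobe call is shown as a comment because it needs splunkd stopped.

```shell
# Steps 1-4 above, against a disposable demo file; substitute the real
# monitored path (/home/ec2-user/Test in this thread) on your UF.
FILE=$(mktemp)
echo "existing event" > "$FILE"

# 1. Permissions: the user running splunkd must be able to read the file.
ls -l "$FILE"

# 2/3. Append a fresh event so the tailing processor sees new data.
echo "test event $(date)" >> "$FILE"
wc -l "$FILE"

# 4. Only if a forced re-read is needed (run from $SPLUNK_HOME/bin with
#    splunkd stopped):
# ./btprobe -d $SPLUNK_HOME/var/lib/splunk/fishbucket/splunk_private_db --file "$FILE" --reset
```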
Is this file being read in by the UF?
index=_* host=<UF hostname> error OR warn
and index=_* host=<UF hostname> "/home/ec2-user/Test"
and share any noteworthy results; a lack of results is noteworthy too.
On the indexers:
1. Confirm that the new index is properly configured
2. Are you currently receiving any error messages in the top right corner of the Splunk Web UI?
3. Attempt to cut the UF out of the equation by dropping the file on one of your indexers and using a oneshot: ./splunk add oneshot /var/log/applog -index Test_Index
and see if/how that works (not as a permanent solution, but as a troubleshooting step to isolate the issue)
4. Run searches such as index=_internal host IN (<idx1>,<idx2>) error OR warn
and index=_internal host IN (<idx1>,<idx2>) "/home/ec2-user/Test"
and share any noteworthy results.
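For step 1, "properly configured" on a cluster means the index is defined in an indexes.conf pushed from the master node, not created locally on a single indexer. A minimal sketch (paths and repFactor are typical values, not taken from this thread; note Splunk's docs also recommend all-lowercase index names):

```ini
# indexes.conf under master-apps on the cluster master, then deployed
# with `splunk apply cluster-bundle`; adjust paths to your environment.
[Test_Index]
homePath   = $SPLUNK_DB/Test_Index/db
coldPath   = $SPLUNK_DB/Test_Index/colddb
thawedPath = $SPLUNK_DB/Test_Index/thawedb
repFactor  = auto
```

repFactor = auto is what makes the index participate in cluster replication rather than living on only one peer.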
On the search head:
1. Be sure to run searches for all time, in case there is an issue with timestamp extraction: index=Test_Index earliest=1 latest=now()
2. Run a search to make sure the data didn't somehow end up somewhere else: index=* source="/home/ec2-user/Test" earliest=1 latest=now()
3. Finally, let's look for any internal events related to this file: index=_* source="/home/ec2-user/Test" earliest=1 latest=now()
and note which host they're coming from if you've attempted to read this file in from multiple hosts.
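As a filesystem-level complement to the _internal searches above, you can also grep splunkd.log directly on the UF (or an indexer). The block below runs against a synthetic log line so it is self-contained; on a real host, point LOG at $SPLUNK_HOME/var/log/splunk/splunkd.log instead.

```shell
# Demo with a synthetic splunkd.log line; on a real host set
# LOG=$SPLUNK_HOME/var/log/splunk/splunkd.log instead.
LOG=$(mktemp)
echo '01-01-2024 00:00:00.000 INFO  TailReader - Adding watch on path: /home/ec2-user/Test' > "$LOG"

# File-tailing components log under TailReader / TailingProcessor.
grep -E 'TailReader|TailingProcessor' "$LOG" | grep '/home/ec2-user/Test'
```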
Report back with your findings and we can continue to move forward to narrow the troubleshooting and identify the root cause(s).
Hi Fverdi,
Thanks for your response, which helped me debug in the right direction.
-- The problem is resolved; I think the issue was that I was trying to ingest the same data file under a different name.
-- I also opened port 9079.
One more question: how do I check the names of already-ingested files in the fishbucket from the command line?
Thanks,
Raghu
inputs.conf
[monitor:///home/ec2-user/Test]
disabled = false
index = Test_index
outputs.conf
[tcpout]
defaultGroup = default-autolb-group
[tcpout:default-autolb-group]
disabled = false
server = 1.2.3.4:9997,5.6.7.8:9997
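One thing worth double-checking in this stanza: the thread's searches use index=Test_Index while the stanza says index = Test_index. A quick way to see exactly which index a monitor stanza targets is to pull the value out of inputs.conf and compare it with the index you created on the cluster. The stanza is embedded below to keep the demo self-contained; on the UF, point CONF at the real inputs.conf instead.

```shell
# Self-contained demo: the stanza from this thread, written to a temp file;
# on the UF, set CONF to the real inputs.conf under $SPLUNK_HOME/etc/.
CONF=$(mktemp)
cat > "$CONF" <<'EOF'
[monitor:///home/ec2-user/Test]
disabled = false
index = Test_index
EOF

# Extract the target index from the stanza.
idx=$(awk -F' *= *' '$1 == "index" {print $2}' "$CONF")
echo "monitor sends events to index: $idx"
```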