Splunk Dev

Data Ingestion to Cluster

raghu0463
Explorer

Hi All,
I need help with data ingestion to a cluster.
I was trying to ingest data into an indexer cluster built on AWS Linux.
Cluster config: 1 master, 1 SH, 2 indexers, 1 UF.
I first ingested a single file into the main index, but I am unable to ingest into a newly created index.

1 Solution

fverdi
Explorer

There could be a number of things wrong here, and we'll need more information to determine the cause. There are several ways to approach this, but here are the steps I recommend trying.

Let's start in order from the file to the indexer:
1. Confirm the permissions on the file allow Splunk to read it
2. Confirm the file contains new data that has not already been read by Splunk. If you're trying to ingest the same file twice, Splunk will skip it because the file is already marked as read in the fishbucket
3. If this is just test data, add a new line/new event to the file so there is something fresh to read in
4. Use btprobe to remove the file from the fishbucket so Splunk will read it in again: ./btprobe -d $SPLUNK_HOME/var/lib/splunk/fishbucket/splunk_private_db --file /home/ec2-user/Test --reset
5. Double-check your monitor stanza, and keep in mind that it is case-sensitive (see the example stanza after this list)
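
For reference, a minimal monitor stanza in inputs.conf on the UF might look like the sketch below (it uses the file path and index name referenced elsewhere in this thread; adjust both to your environment, and remember the index named here must already exist on the indexers):

[monitor:///home/ec2-user/Test]
disabled = false
index = Test_Index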

Is this file being read in by the UF?

  • If so, run searches for index=_* host=<UF hostname> (error OR warn) and index=_* host=<UF hostname> "/home/ec2-user/Test", and share any noteworthy results; a complete lack of results is noteworthy too. (A quick CLI check on the UF is sketched below.)
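
As a quick check on the UF itself, you can also ask the forwarder what it is currently monitoring and the status of each file input. This is a sketch assuming a default install location; adjust $SPLUNK_HOME as needed:

cd $SPLUNK_HOME/bin
./splunk list monitor
./splunk list inputstatus

The output should include /home/ec2-user/Test; if it does not, the monitor stanza isn't being picked up at all (wrong app, wrong path, or a typo).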

On the indexers:
1. Confirm that the new index is properly configured (one way to check is sketched after this list)
2. Are you seeing any error messages in the top right corner of the Splunk Web UI?
3. Attempt to cut the UF out of the equation by dropping the file on one of your indexers and ingesting it with a oneshot, e.g. ./splunk add oneshot /var/log/applog -index Test_Index (substituting the path where you copied the file), and see if/how that works (not as a permanent solution, but as a troubleshooting step to isolate the issue)
4. Run searches for index=_internal host IN (<idx1>,<idx2>) (error OR warn) and index=_internal host IN (<idx1>,<idx2>) "/home/ec2-user/Test", and share any noteworthy results.
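
On an indexer, a quick way to confirm the new index is actually defined is to check what btool resolves for it. This is a sketch assuming the index is named Test_Index and a default install path:

cd $SPLUNK_HOME/bin
./splunk btool indexes list Test_Index --debug

If that returns nothing, the index is not defined on that peer; in an indexer cluster, indexes.conf should normally be distributed from the cluster master via the configuration bundle rather than edited locally on each peer.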

On the search head:
1. Be sure to run searches over all time, in case there is an issue with timestamp extraction: index=Test_Index earliest=1 latest=now
2. Run a search to make sure the data didn't somehow end up somewhere else: index=* source="/home/ec2-user/Test" earliest=1 latest=now
3. Finally, look for any internal events related to this file, and note which host they're coming from if you've attempted to read this file in from multiple hosts: index=_* source="/home/ec2-user/Test" earliest=1 latest=now (a combined variant of searches 2 and 3 is sketched after this list)
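
If you want a single search that shows where this file's events actually landed (a sketch combining searches 2 and 3 above, using the same source path), something like this works well:

index=* OR index=_* source="/home/ec2-user/Test" earliest=1 latest=now | stats count by index, host, sourcetype

That makes it obvious whether the events went to an unexpected index, or arrived under a different host or sourcetype than you expected.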

Report back with your findings and we can narrow the troubleshooting further and identify the root cause(s).

raghu0463
Explorer

Hi Fverdi,
Thanks for your response, which helped me debug down the right path.

-- The problem is resolved. I think the issue was that I was trying to ingest the same data file under a different name.
-- I opened port 9079.

I have one more question: how can I check from the command line which file names have already been ingested (i.e., are recorded in the fishbucket)?

Thanks,
Raghu


raghu0463
Explorer

inputs.conf
[monitor:///home/ec2-user/Test]
disabled = false
index = Test_index

outputs.conf
[tcpout]
defaultGroup = default-autolb-group

[tcpout:default-autolb-group]
disabled = false
server = 1.2.3.4:9997,5.6.7.8:9997
