Getting Data In

How exactly does uploading a file for one-shot indexing work?

tanmaybalwa
Engager

I am clear on the steps needed to upload a .tar file, but I have a question about how it works. Splunk eventually indexes the file and stores it in the database, which isn't easily human readable. The path to the indexes can be configured in the Splunk settings. Knowing this, my questions are:

  1. When you upload a file of, say, 10 MB to the remote Splunk server, where is it stored? I checked $SPLUNK_HOME/var/spool/splunk immediately after uploading the file, and there was no file in it.
  2. Is there a way to configure where the uploaded file is stored?
  3. Is the file eventually deleted on the remote server?
  4. If logs from a different date are uploaded later and still contain data that has already been indexed, is that repetition handled?

Thanks!

somesoni2
Revered Legend

Below are the answers to your questions:

  1. When you upload a file (of any size), the file itself is not copied to the Splunk server as-is. Instead, the data from the file is transferred via the protocol determined by the configured inputs and saved in a temporary binary file (in the folder $SPLUNK_HOME/var/spool/splunk/), which is not human readable. These binary files are later parsed, and the data is stored in the indexes (see more here).
  2. Since your file is not literally uploaded, and you can't read the binary files anyway, this question becomes irrelevant.
  3. Splunk does not affect the original data file. It remains in its original location unless some other program does something with it.
  4. Splunk identifies a file based on the first few characters of its content and keeps track of how much of its data has been indexed so far. If an updated file is placed, Splunk indexes only the new data. This is Splunk's default behavior.
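
To make point 4 more concrete, here is a rough Python sketch of the general idea: identify a file by a checksum of its first few bytes and remember how much of it has already been read. This is only a toy model and not Splunk's actual implementation. The class name FileTracker, the method read_new, and the HEAD_LEN value are all made up for illustration.

```python
import zlib

# Illustrative value only (an assumption for this sketch, not a Splunk setting).
HEAD_LEN = 16


class FileTracker:
    """Toy model: key each file by a checksum of its head and remember
    how many bytes of it have already been 'indexed'."""

    def __init__(self):
        # head checksum -> number of bytes already indexed
        self.offsets = {}

    def read_new(self, data: bytes) -> bytes:
        """Return only the part of `data` that has not been seen before."""
        key = zlib.crc32(data[:HEAD_LEN])
        start = self.offsets.get(key, 0)
        if start > len(data):
            # Content is shorter than what was recorded: treat it as a new file.
            start = 0
        self.offsets[key] = len(data)
        return data[start:]


tracker = FileTracker()
first = b"2024-01-01 event A\n"
updated = first + b"2024-01-02 event B\n"

print(tracker.read_new(first))    # b'2024-01-01 event A\n'
print(tracker.read_new(updated))  # b'2024-01-02 event B\n'
print(tracker.read_new(updated))  # b''
```

In this sketch, a file whose first HEAD_LEN bytes differ from anything seen before is treated as brand new and read from the start, which is why the beginning of the file content matters for duplicate handling. Files shorter than HEAD_LEN are outside what this toy model handles well.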

Hope this helps.

tanmaybalwa
Engager

Thanks for the reply! 🙂
