Getting Data In

How exactly does uploading a file for one-shot indexing work?

tanmaybalwa
Engager

I am clear on the steps needed for uploading a .tar file, but I have a question about how it works. Splunk eventually indexes the file and stores it in its database, which isn't easily human-readable. The path to the indexes can be configured in the Splunk settings. Knowing this, my questions are:

  1. When you upload a file of, say, 10 MB to the remote Splunk server, where is it stored? I checked $SPLUNK_HOME/var/spool/splunk immediately after uploading the file, and there was no file in it.
  2. Is there a way to configure where the uploaded file is stored?
  3. Does the file eventually get deleted on the remote server?
  4. If logs from a different date are uploaded later and still contain data that has already been indexed, is that repetition handled?

Thanks!

somesoni2
Revered Legend

Below are the answers to your questions:

  1. When you upload a file (of any size), the file itself is not copied to the Splunk server. Instead, the data from the file is transferred via the protocol determined by the configured inputs and saved in a temporary binary file (in the folder $SPLUNK_HOME/var/spool/splunk/), which is not human-readable. These binary files are later parsed, and the data is stored in the indexes (see more here).
  2. Since your file is not literally uploaded, and you can't read the binary files anyway, this question becomes irrelevant.
  3. Splunk does not touch the original data file. It should remain in its original location unless some other program does something with it.
  4. Splunk identifies a file by a checksum of the first few characters of its content, and it keeps track of how much of that file has been indexed so far. If an updated file is placed, only the new data is indexed. This is the default behavior of Splunk.
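
To make point 4 a bit more concrete, here is a rough Python sketch of the general idea. It is only an illustration, not Splunk's actual code or on-disk format: the `SeenFiles` class, the `new_data` method, and the `HEAD_BYTES` value are all made up for this example.

```python
import zlib

# Hypothetical illustration only: not Splunk's implementation.
HEAD_BYTES = 256  # assumed size of the leading chunk used to fingerprint a file


class SeenFiles:
    """Toy tracker that remembers how much of each file has been read.

    A file is identified by a CRC of its first HEAD_BYTES bytes, not by name.
    """

    def __init__(self):
        self._offsets = {}  # head CRC -> number of bytes already read

    def new_data(self, content: bytes) -> bytes:
        """Return only the bytes that were not seen in an earlier call."""
        key = zlib.crc32(content[:HEAD_BYTES])
        already_read = self._offsets.get(key, 0)
        fresh = content[already_read:]
        self._offsets[key] = len(content)
        return fresh


tracker = SeenFiles()
first_upload = b"h" * 300 + b"event 1\n"
later_upload = first_upload + b"event 2\n"

print(tracker.new_data(first_upload) == first_upload)  # whole file is new
print(tracker.new_data(later_upload))                  # only the appended part
print(tracker.new_data(later_upload))                  # nothing new this time
```

In this toy version, a file whose beginning differs is simply treated as a brand new file, even if the rest of its content overlaps with something read earlier.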

Hope this helps.

tanmaybalwa
Engager

Thanks for the reply! 🙂
