Import Data from multiple folders

Scan001
Explorer

Hi, I wish to import data from a folder structure and cannot find or understand how to do this.

I have over a hundred folders with five distinct .gz files in each. I wish to import the contents of one of these files from each folder into Splunk for analysis. I will not need to monitor these folders again.

I went into Data Inputs >> Files & Folders, created a data input, and chose "Once Only".

My new input shows up in the Files & directories listing with Number of files = 130, but I cannot find out how to import and index the files. I had expected them to show up somewhere in the Data Summary screen. Am I missing something?


richgalloway
SplunkTrust

Creating a folder input will index all the files in that folder. If you only wish to import a single file from each folder, I suggest writing a script around the oneshot command (/opt/splunkforwarder/bin/splunk add oneshot filename -index foo -sourcetype bar -hostname localhost.localdomain -auth "admin:changeme"). See the docs at http://docs.splunk.com/Documentation/Splunk/latest/Data/MonitorfilesanddirectoriesusingtheCLI.

---
If this reply helps you, Karma would be appreciated.

Scan001
Explorer

At this point I have unzipped the files and now have 12 folders (one per month), and I want to import ALL the files in each folder. I have read everything I can find on this and cannot find anything I can understand.

I have created a new Data Inputs > Files and Folders. The "Number of Files" is correct, but I cannot see the content of these files.

Any ideas would be great. I'm sure it can be easily sorted, but I'm stuck.

Many thanks


richgalloway
SplunkTrust

Where are you looking for the content of the files? You won't see it in the Data Inputs screen except when adding a new single file. Once you've imported a file, you can tell Splunk to monitor the directory in which the file resides.


Scan001
Explorer

The script is giving a "No results found" message.
My complete script is:
/opt/splunkforwarder/bin/splunk add oneshot auth-detail.gz -index wifi -sourcetype My-PC -hostname /Users/Philip/Dropbox/Projects/Access_Logs -auth "admin:changeme"

Did I misinterpret something along the way?

Thanks again


richgalloway
SplunkTrust

Allow me to suggest a few changes:

/opt/splunkforwarder/bin/splunk add oneshot auth-detail.gz -index wifi -sourcetype gzip -hostname philip -auth "admin:changeme"

Depending on the nature of the data within the .gz files, you may want to consider unzipping the files within your script and then indexing the uncompressed data. You could then specify a sourcetype that better describes the contents.
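
A minimal sketch of the unzip step, assuming Python is available on the machine that holds the files. The function name `gunzip` and the `dest_dir` output folder are illustrative choices, not anything Splunk provides; the script only decompresses the file and leaves the indexing to the oneshot command.

```python
import gzip
import shutil
from pathlib import Path


def gunzip(src: Path, dest_dir: Path) -> Path:
    """Decompress src (a .gz file) into dest_dir and return the new file's path."""
    dest_dir.mkdir(parents=True, exist_ok=True)
    # Path.stem drops the final suffix: "auth-detail.gz" -> "auth-detail"
    dest = dest_dir / src.stem
    with gzip.open(src, "rb") as fin, open(dest, "wb") as fout:
        shutil.copyfileobj(fin, fout)  # stream the data; no need to load it all in memory
    return dest
```

The returned path can then be handed to the oneshot command with whichever sourcetype describes the uncompressed contents.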


Scan001
Explorer

Thanks richgalloway, I think I'm starting to understand what is happening here. But how does this script identify the folders containing my files?


richgalloway
SplunkTrust

If you need the folder names, try specifying the full path of the input file (/Users/Philip/Dropbox/Projects/Access_Logs/auth-detail.gz).


Scan001
Explorer

richgalloway,
My main requirement is to read recursively through a folder tree, pick out files with a specific name, and import them (all these files have the same name). I can see the oneshot command would work for a single file, but can it search for further instances of the file?


richgalloway
SplunkTrust

Correct, the oneshot command only processes a single file, so it cannot search for further instances on its own. That's why I suggested a script. Have the script iterate over the folders and call the oneshot command for each file you want to index.
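
One way such a script might look, as a rough sketch in Python rather than a tested solution. The Splunk path, the index `wifi`, the file name `auth-detail.gz`, the root folder, and the credentials are the ones already quoted in this thread; `bar` is still a placeholder sourcetype. The function names and the `dry_run` switch are made up for illustration, and the optional -hostname flag is left out.

```python
import subprocess
from pathlib import Path

SPLUNK = "/opt/splunkforwarder/bin/splunk"  # path quoted in this thread; adjust for your install


def find_targets(root, filename):
    """Return every file called `filename` anywhere under `root`, sorted by path."""
    return sorted(p for p in Path(root).rglob(filename) if p.is_file())


def oneshot_command(path, index, sourcetype, auth):
    """Build the 'add oneshot' argument list for a single file."""
    return [SPLUNK, "add", "oneshot", str(path),
            "-index", index, "-sourcetype", sourcetype, "-auth", auth]


def import_all(root, filename, index, sourcetype, auth, dry_run=True):
    """Call oneshot once per matching file; with dry_run=True only print the commands."""
    commands = [oneshot_command(p, index, sourcetype, auth)
                for p in find_targets(root, filename)]
    for cmd in commands:
        if dry_run:
            print(" ".join(cmd))
        else:
            subprocess.run(cmd, check=True)  # stop at the first file Splunk rejects
    return commands


# Example call (prints the commands without running them):
# import_all("/Users/Philip/Dropbox/Projects/Access_Logs", "auth-detail.gz",
#            index="wifi", sourcetype="bar", auth="admin:changeme")
```

Running it with `dry_run=True` first shows exactly which files would be sent, which is a cheap check before indexing anything. Passing `dry_run=False` then executes each command in turn.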


Scan001
Explorer

Can you suggest a script that could do that for me, or would you have a link to a source that can help?


richgalloway
SplunkTrust

I don't know of any existing scripts that I can reference. Google should be able to help, however.


Scan001
Explorer

/opt/splunkforwarder/bin/splunk add oneshot filename -index foo -sourcetype bar -hostname localhost.localdomain -auth "admin:changeme"

Thanks richgalloway

So for my case, am I correct to say I replace:
filename - with the name of the file I wish to import
foo - with an index I create, e.g. my index
bar - with a unique identifier (can be anything)
localhost.localdomain - with the route to the folder structure I wish to interrogate

Am I right with these comments?

As a newbie, I appreciate your assistance.


richgalloway
SplunkTrust

Almost right. Sourcetype can be just about anything, but you should use a built-in sourcetype if one matches your data. See http://docs.splunk.com/Documentation/Splunk/latest/Data/Listofpretrainedsourcetypes. The hostname parameter should be the name of the system the files are on.


Scan001
Explorer

I think it is the hostname that is stopping the command from running; I don't fully understand it. If my folder is, for example, c:\folders\test\, what would my hostname be?

Thanks


richgalloway
SplunkTrust

Hostname can be just about anything. DNS name (or part of it) is one option. In your case you might use "Scan001-PC" or whatever name Windows calls your computer. You can even omit the -hostname option.
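
To illustrate that the hostname is a label for the machine and has nothing to do with the folder path, here is a small Python sketch. It assumes Python is installed; `oneshot_args` is a hypothetical helper, and the file name under c:\folders\test\ is only an example. `socket.gethostname()` returns the name the operating system reports for the computer.

```python
import socket


def oneshot_args(path, index, sourcetype, hostname=None):
    """Build 'add oneshot' arguments; -hostname is added only when a name is given."""
    args = ["add", "oneshot", path, "-index", index, "-sourcetype", sourcetype]
    if hostname:
        args += ["-hostname", hostname]
    return args


# The name the operating system reports for this computer.
computer_name = socket.gethostname()

# The folder goes in the file path; the computer's name goes in -hostname.
print(oneshot_args(r"c:\folders\test\auth-detail.gz", "wifi", "bar", computer_name))

# Leaving the hostname out simply drops the flag.
print(oneshot_args(r"c:\folders\test\auth-detail.gz", "wifi", "bar"))
```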
