Getting Data In

Scripting to pull in logs from a URL

I am trying to create a script that will retrieve and index Apache server logs, but I have been unable to figure out how to do it. I am not able to place a forwarder on the machine, but I do have HTTP access to the log directory. I have tried creating a script to pull down the log files using WGET (DOS, Windows), and I can get it to pull down the files, but I cannot figure out how to get Splunk to index them. The files are compressed, so I get access_log_1.gz, access_log_2.gz, etc. I have placed the script in the $Splunk_Home\bin\scripts dir, and it points to a .bat file in $Splunk_Home\bin. The only line of the .bat file is: "wget -r -nv -nH -A "*.gz""

Can someone point me to documentation or examples that show how to do this? Do I have to create an App to do it? Or can I just use a script only?

We use rsync to copy the Apache logs from our web servers to our Splunk server...

You can download a windows version of rsync from

FYI: cwRsync is a packaging of rsync for MS Windows

Getting the logs to your Splunk server is only half the battle. You have to set up a source to actually index the files. I'm assuming you are putting these files somewhere other than the scripts directory.

You will want to check out for more information on setting up a monitor.
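
As a rough sketch of what such a monitor could look like, an inputs.conf stanza along these lines might be a starting point. The directory C:\apache_logs is a hypothetical download location (not from the original post), and the access_combined sourcetype assumes the logs are in Apache combined format. Splunk monitor inputs can read .gz files directly, so the archives should not need to be unpacked first.

```
# inputs.conf, for example under $SPLUNK_HOME\etc\system\local
# C:\apache_logs is a hypothetical path; point it at wherever the
# download script actually saves the files.
[monitor://C:\apache_logs]
# Only pick up the compressed log archives.
whitelist = \.gz$
# Assumes Apache combined log format; adjust if yours differs.
sourcetype = access_combined
disabled = false
```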

This is what I was missing. I am not able to do this in one script. I have to create a script that will pull over the files, and then set up a monitor on the directory to pull in the logs as they are written there.
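
The "pull over the files" half could be sketched roughly as follows. This uses Python's standard library rather than the wget batch file from the question, and it assumes the web server exposes a plain HTML directory listing. The names find_gz_links and fetch_new_logs, the example URL, and the destination directory are all hypothetical.

```python
import os
import urllib.request
from html.parser import HTMLParser
from urllib.parse import urljoin


class GzLinkParser(HTMLParser):
    """Collect href values ending in .gz from an HTML directory listing."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value and value.endswith(".gz"):
                self.links.append(value)


def find_gz_links(listing_html):
    """Return the .gz hrefs found in the listing, in document order."""
    parser = GzLinkParser()
    parser.feed(listing_html)
    return parser.links


def fetch_new_logs(base_url, dest_dir):
    """Download .gz files listed at base_url that are not yet in dest_dir."""
    os.makedirs(dest_dir, exist_ok=True)
    with urllib.request.urlopen(base_url) as response:
        listing = response.read().decode("utf-8", errors="replace")
    downloaded = []
    for href in find_gz_links(listing):
        target = os.path.join(dest_dir, os.path.basename(href))
        if os.path.exists(target):
            continue  # already fetched on an earlier run
        urllib.request.urlretrieve(urljoin(base_url, href), target)
        downloaded.append(target)
    return downloaded


# Example call with hypothetical values (not from the original post):
# fetch_new_logs("http://webserver.example.com/logs/", r"C:\apache_logs")
```

Skipping files by name assumes the archive names stay stable on the web server. If rotation renumbers them (access_log_1.gz becoming access_log_2.gz), that check would need to be dropped or replaced with something smarter. The script only downloads; indexing is left to the monitor on the destination directory.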

