I have a script that goes to a website and downloads a text file. It then converts it to a CSV so I can import it into Splunk.
The file changes daily, so I run my script daily.
My question is, being new to Splunk, how do I set something up to automatically import this file into Splunk AND set it to run daily to get the new updates?
I can do it manually using a lookup table, but I want it to run automatically every day.
Any suggestions?
It's time to read the docs pages on "Getting Data In".
http://docs.splunk.com/Documentation/Splunk/7.2.0/Data/Getstartedwithgettingdatain
You probably want to monitor the location where your CSV file is created.
http://docs.splunk.com/Documentation/Splunk/7.2.0/Data/Monitordata
As long as the file gets a new name every day, it will automatically be ingested. If not, then the system will ingest the data whenever it seems to have changed, and will usually detect that correctly in your situation.
Yes, I'm being a little vague there, to avoid re-explaining any potential exceptions, which are covered at a high level in the "How the monitor processor works" section on this page, and on the related links from that page.
https://docs.splunk.com/Documentation/Splunk/7.2.0/Data/Monitorfilesanddirectories
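If you want the file to get a new name every day, one option is to have your existing script stamp the date into the filename when it writes the CSV. Here is a minimal Python sketch of that idea. The function name `write_daily_csv`, the `report` prefix, and the drop directory are all made up for illustration, and I'm assuming your script already has the data as a list of rows.

```python
import csv
from datetime import date
from pathlib import Path


def write_daily_csv(rows, drop_dir, day=None, prefix="report"):
    """Write rows to <drop_dir>/<prefix>_YYYY-MM-DD.csv and return the path.

    `prefix` and `drop_dir` are hypothetical; use whatever names and
    directory your own script and Splunk monitor agree on.
    """
    day = day or date.today()
    drop_dir = Path(drop_dir)
    drop_dir.mkdir(parents=True, exist_ok=True)

    # A date-stamped name means each day's run produces a distinct file.
    target = drop_dir / f"{prefix}_{day.isoformat()}.csv"

    # newline="" lets the csv module control line endings itself.
    with target.open("w", newline="") as handle:
        csv.writer(handle).writerows(rows)
    return target
```

You would call it at the end of your daily run, something like `write_daily_csv(rows, "/path/to/monitored/dir")`, with the directory being whichever one Splunk is told to watch.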
And, one last item, when you set up a file monitor, you don't have to worry about the "daily" part... it will ingest the file whenever it changes. If that's once a day, great. If that's once an hour, great.
This works as long as some Splunk instance is running on the box where the file lands and has been told to monitor that file. It doesn't matter whether that instance is the single Splunk instance in an all-in-one installation you are playing with, a lightweight Splunk universal forwarder (UF) on some other box that generates your data and forwards it to a Splunk indexer somewhere, or any other valid combination of instances: it will all be handled automatically.
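For whoever ends up with admin access: behind the UI, a file monitor is a short `[monitor://...]` stanza in `inputs.conf` on the monitoring instance. The Python sketch below just prints what such a stanza could look like so you can see its shape. The helper `monitor_stanza` and the path `/opt/data/splunk_drop` are hypothetical, and the `csv` sourcetype and `main` index are assumptions you should adjust to your environment.

```python
def monitor_stanza(path, sourcetype="csv", index="main"):
    """Return the text of an inputs.conf monitor stanza for a file or directory.

    The defaults are assumptions for illustration, not recommendations.
    """
    lines = [
        f"[monitor://{path}]",  # an absolute Unix path yields three slashes
        "disabled = false",
        f"sourcetype = {sourcetype}",
        f"index = {index}",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical drop directory; replace with the real location.
    print(monitor_stanza("/opt/data/splunk_drop"))
```

Setting this up through Settings in Splunk Web gets you an equivalent input, so treat this as a way to read the config rather than something you have to write by hand.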
I don't see the Add Data or Monitor Data pages. Is there another way?
@aimeeandrus - Even the free version has this. At the top right, Settings -> Data inputs -> Files & directories. At a company, you would need admin or superuser access. If they expect you to be responsible for entering the data, then they either have to give you that access or provide you with a support person to do it.
Awesome, thank you! I don't see it, so I must just be set up as a user. I will contact my team to help me. Thank you!
I don't seem to have access to see the Add Data or Monitor Data pages. Are these the only ways to do this?
You could have a universal forwarder monitor for the new csv file that gets generated. As soon as the file is dropped into the monitored directory it will be ingested.
How do I know where the directory is? I am not an admin, only have user privileges.