We are looking at deploying Splunk for our application servers' log files, which total about 3 GB per day.
I've had a look at the inputs, and it does not seem possible to filter the incoming data.
Ideally we would be able to place a filter on each input so that only Java errors are collected. This would help cut down on the space needed to store the indexes.
The only other way I can think of is a scripted input that filters the data before passing it on to Splunk: basically cat the file and grep out just the errors.
Can you think of a better way to do this, please?
Assuming that you can use a regex to identify the events you are interested in, routing everything else to the nullQueue is the best solution: http://answers.splunk.com/questions/96/how-do-i-exclude-some-events-from-being-indexed-by-splunk.
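As a rough sketch of what the nullQueue approach could look like when you want to keep only Java errors: the source path, the stanza names setnull and keep_errors, and the error regex are all assumptions for illustration, so adjust them to match your logs. These settings belong on the instance that parses the data (an indexer or heavy forwarder), not on a universal forwarder.

```ini
# props.conf
# Hypothetical source path; point this at your application log.
[source::/opt/app/logs/server.log]
# Transforms run left to right: discard everything first,
# then route matching events back to the index queue.
TRANSFORMS-filter = setnull, keep_errors

# transforms.conf
[setnull]
# Matches every event and sends it to the nullQueue (discarded).
REGEX = .
DEST_KEY = queue
FORMAT = nullQueue

[keep_errors]
# Assumed pattern for Java errors; tune it to your log format.
REGEX = ERROR|Exception
DEST_KEY = queue
FORMAT = indexQueue
```

The order in the TRANSFORMS line matters: if keep_errors ran first, setnull would then override it and every event would be dropped.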
If the decision is on a file-by-file basis, whitelists and blacklists in inputs.conf are the best solution.
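For the file-by-file case, a minimal inputs.conf sketch might look like the following. The monitored directory and both patterns are assumptions for illustration; whitelist and blacklist are regexes matched against the full file path.

```ini
# inputs.conf
# Hypothetical log directory; replace with your own.
[monitor:///opt/app/logs]
# Only pick up files whose path ends in .log ...
whitelist = \.log$
# ... and skip anything that looks like a debug or access log.
blacklist = (debug|access)
```

Note that this only chooses which files are read. It does not filter individual events inside a file, so it will not by itself reduce a single log to just its Java errors.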
Thank you for the help!