Hi, I would like to collect (and parse) data/logs without indexing them as they don't need to be searched with Splunk, but still need to be collected.
I have not been able to find specifics in the documentation, the community or by sifting through the Splunk menu as admin.
Summary of our discussion:
What you intend to do is not possible with Splunk.
Splunk does not allow you to store data without indexing it, i.e. without counting it against your license.
A possible solution would be to have syslog-ng (or another daemon) write those logs to disk, where you can search and process them using common Linux tools like grep.
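To make that concrete, here is a minimal syslog-ng sketch for receiving logs over the network and writing them straight to plain files on disk. The port, path, and macros are illustrative assumptions, not taken from the thread:

```
# Hypothetical syslog-ng config: listen on UDP 514 and write raw logs
# to per-host daily files, which can later be searched with grep etc.
source s_net {
    network(transport("udp") port(514));
};

destination d_files {
    file("/var/log/remote/${HOST}/${YEAR}-${MONTH}-${DAY}.log");
};

log {
    source(s_net);
    destination(d_files);
};
```

Logs stored this way never touch Splunk, so they count nothing against the license.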
Hope that helps - if it does I'd be happy if you would upvote/accept this answer, so others could profit from it. 🙂
What @xpac is suggesting is "forwarding to a third party receiver" -- in this case a syslog-ng daemon -- which writes the parsed data to disk. Other commonly used receivers would be external SIEMs.
Relevant documentation: Forward data to third-party systems. Just be aware that if you send data to a third-party receiver, and that receiver blocks, it will also cause local indexing to block. Shouldn't be an issue with something like syslog, but can be a problem with SIEMs.
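As a rough sketch of what that forwarding setup could look like, here is a hedged outputs.conf fragment for sending data to a third-party syslog receiver; the group name and server address are placeholders, and the exact stanzas should be checked against the linked documentation:

```
# outputs.conf on the forwarder -- illustrative sketch only.
# Routes events to an external syslog receiver instead of (or in
# addition to) Splunk indexing.
[syslog]
defaultGroup = external_syslog

[syslog:external_syslog]
server = syslog.example.com:514
type = udp
```

Note the caveat above: if the receiver blocks, local indexing can block too.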
@OLWI your requirement does not seem to be clear. You need to collect and parse data, but not index or search it? Could you please elaborate on your requirement: what do you need to collect, and why do you need to parse it? If you do not need to index or search in Splunk, why do you want to use Splunk at all?
There are logs which I want to index, search and create alerts for. But some logs are only required for a detailed analysis, which happens less than once a month. Those logs alone would push me over the license quota.
I therefore do intend to use Splunk, but I want to avoid using another software to collect and manage other logs.
The Splunk WBT specifically mentions that parsing logs does not affect the license, only indexing. So far I have not been able to figure out how to parse without indexing.
If you don't index data, it isn't stored; it is completely lost. Parsing is just the step that comes before indexing, i.e. preprocessing the data. Are you aware of that?
Splunk should certainly be capable of collecting logs and storing them without doing any processing (which includes indexing). This is a basic feature and required for reliability. The lack of such a feature would make me doubt the quality of the product.
Indexing is Splunk's way of storing data, and if you ever want to do something with the data in Splunk, you will need to index it.
So, as xpac is also trying to explain: collecting without indexing doesn't make much sense, unless you intend to send the data to some other, non-Splunk solution for storage.
As I tried to explain above: I do not intend to do anything with the data in Splunk, I therefore don't want to index it.
I also hope that, at its core, Splunk distinguishes between indexing and storing, because they are completely different things.
The non-Splunk solution would be the hard drive, where I'd be happy to have the logs simply stored as .txt or .log or whatever the original raw format is.
Then this is going to be your solution (like, having syslog-ng or another daemon write to disk).
When Splunk stores data to disk, it always indexes them. There is no other way.
I am sorry to hear that. Maybe if I find the time, I'll write an app that will do what I intended (if possible).
Thank you for your time. If you post a summary of this as answer, I'll mark it as the solution.
If you have data which will only add detail to indexed data and would change once in a month, you can try uploading as lookup table.
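If that fits, the lookup could then be joined into searches on demand. A hypothetical SPL sketch, assuming a CSV lookup file named detail_logs.csv has been uploaded:

```
| inputlookup detail_logs.csv
| search host="web-01"
```

This keeps the detail data searchable in Splunk without ingesting it as indexed events, since lookup files do not count against the ingest license.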
So - you want to collect (ingest) data, then parse (preprocess) it, and then? Drop some of them?
We seem to have a misunderstanding in what your actual requirement is - maybe you can rephrase what you want to do?
Ok. I'll try to rephrase: I want to use Splunk to collect logs and store them. Just not in searchable, indexed buckets.
According to the WBT, it is possible to set up a complex environment in which one Splunk instance only parses the logs and then sends them to another instance which handles the indexing.
The first (parsing) instance obviously holds the parsed data in some form, and instead of forwarding that data on for indexing, I would like to simply store it.
Maybe... you only want to collect and parse them, and then forward them somewhere else? I'm as confused as @niketnilay.