"At first, list and analyze the data to take from a source and identify the ones that you want to index."

Yes, we have already done this. We already use a monitor input on a folder at the Splunk forwarder to define the sourcetype and the index. However, we can't prevent an external person from sending us invalid file types or weird content; the forwarder will simply ingest it. I have searched Splunk Answers and the documentation, and so far I have found no way to ensure the content is "clean".

What a whitelist can do is restrict monitoring by file extension. For example, to monitor only files with the .log extension, make the following change:

    [monitor:///mnt/logs]
    whitelist = \.log$

It can also match on the file name, but it can't check the content.

"At least, if there are still unwanted data, you can create a filter on the indexer to delete those data before indexing (https://docs.splunk.com/Documentation/Splunk/8.1.3/Forwarding/Routeandfilterdatad#Filter_event_data_...)."

This one is based on a specific regular expression and doesn't seem to fit, as we are looking for a whitelist.

Thus, if I am monitoring this folder

    [monitor:///var/log/putlogshere]
    whitelist = \.log$
    sourcetype = xx
    index = index1

and I implement the whitelist, a user can still send in a .log file with weird data that satisfies the whitelist condition, and it will still be forwarded to the indexer. Is that correct?
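
For reference, as far as I can tell from the route-and-filter documentation linked above, the indexer-side filter can also be arranged as a keep-list: send every event of the sourcetype to nullQueue first, then route only the events that match a "known good" pattern back to indexQueue. Below is a minimal sketch of that idea, not a tested configuration. It assumes the sourcetype xx from the monitor stanza above; the transform names (drop_all, keep_valid) and the timestamp pattern are hypothetical placeholders, and the files would go on the indexer (or a heavy forwarder), since a universal forwarder does not parse events.

```ini
# props.conf (on the indexer or heavy forwarder)
# Transforms run left to right, so the keep rule must come after the drop rule.
[xx]
TRANSFORMS-contentfilter = drop_all, keep_valid

# transforms.conf
# Step 1: send every event of this sourcetype to the null queue (discard).
[drop_all]
REGEX = .
DEST_KEY = queue
FORMAT = nullQueue

# Step 2: route events matching the expected format back to the index queue.
# Hypothetical pattern: events that start with a date such as 2021-04-01.
[keep_valid]
REGEX = ^\d{4}-\d{2}-\d{2}
DEST_KEY = queue
FORMAT = indexQueue
```

Note that this filters per event rather than per file: a .log file with mixed content would still be read by the forwarder, and only the events matching the pattern would be indexed.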