I've been reading the documentation and am stumped by this part:
If you create a blacklist line for each file you want to ignore, Splunk activates only the last filter.
There are 7 different kinds of logs. 4 of them have their own sourcetype, while the other 3 are to be blacklisted as they aren't required. All of them are logged to the same directory (something I can't change). Based on the above, I'm confused as to whether I can blacklist those 3 types of logs. Alternatively, am I allowed to configure 4 "indexers" to identify and filter the individual logs for indexing?
some examples of logfile names:
i.e. the extensions are actually the dates of the logs, with the last digits representing the hour. My regex will be limited to the first few characters, which identify the log type.
I have tried:
[monitor://C:\Documents and Settings\attgjh1\Desktop\Whitelist test\ceiinboundhandler*]
disabled = false
followTail = 0
sourcetype = Whitelist_inbound
[monitor://C:\Documents and Settings\attgjh1\Desktop\Whitelist test\ceipulsarhandler*]
disabled = false
followTail = 0
sourcetype = Whitelist_pulsar
It doesn't seem to be indexing my logs now.
In older Splunk versions, things worked differently. I think this text is a holdover from olden times. Maybe.
But if you are talking about blacklisting inputs, in inputs.conf you can specify
And that will not index files whose paths match either pattern.
The vertical bar | means or in regular expressions. You can add more to the list, of course.
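As a rough sketch of what such a blacklist could look like in inputs.conf: the directory is the one from the question, but the three prefixes `unwantedA`, `unwantedB` and `unwantedC` are made-up placeholders, since the names of the unwanted log types are not given here.

```ini
# inputs.conf (sketch; the three prefixes below are hypothetical placeholders)
[monitor://C:\Documents and Settings\attgjh1\Desktop\Whitelist test]
disabled = false
# One regex with alternation, instead of one blacklist line per log type.
blacklist = unwantedA|unwantedB|unwantedC
```

Because everything sits in a single regex, the stanza has only one blacklist line, which sidesteps the "only the last filter" caveat quoted from the documentation.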
You can also specify the blacklist in the GUI, if you click on More Options.
RE: "multiple indexers", I am not sure what you mean. If you simply blacklist the files that you don't want, all the other files in the directory will be indexed. If you need to specify the sourcetypes for these files, you can do it in props.conf.
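One way the props.conf part might look, reusing the file prefixes and sourcetype names from the question. The patterns are an assumption (`...` and `*` are the wildcards props.conf accepts in `source::` stanzas) and may need adjusting for your paths.

```ini
# props.conf (sketch; assumed to live on the instance that monitors the files)
# "..." matches any run of characters, including path separators.
[source::...ceiinboundhandler*]
sourcetype = Whitelist_inbound

[source::...ceipulsarhandler*]
sourcetype = Whitelist_pulsar
```

With this approach a single monitor stanza on the directory is enough, and the sourcetype is assigned per file name rather than per input.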
just to clarify:
I'm using Splunk Web. Since the indexer is already monitoring a directory, for it to ignore every file other than the ones specified, my whitelist just has to include this regex, right?
i.e. in Manager » Data inputs » Files & directories, for each "path to data" setting, I only need to modify the whitelist field:
whitelist = "pulsarhandler" for sourcetype: pulsar
whitelist = "adminreport" for sourcetype: admin
thanks for your help!
The easiest way to do what you listed in your comment:
It is more efficient to put the file in the monitor stanza. No need for the whitelist. Also, this will prevent possible problems, because you cannot have identical monitor stanzas.
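A sketch of what putting the file name in the monitor stanza could look like. It assumes the test directory from the question and the sourcetype names from the comment. The leading `*` is there because the actual file names appear to start with `cei`.

```ini
# inputs.conf (sketch; directory taken from the question, assumed unchanged)
# The two stanzas differ in their path, so they are not identical monitors.
[monitor://C:\Documents and Settings\attgjh1\Desktop\Whitelist test\*pulsarhandler*]
sourcetype = pulsar

[monitor://C:\Documents and Settings\attgjh1\Desktop\Whitelist test\*adminreport*]
sourcetype = admin
```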
I notice that you are using a directory name with spaces in it - can you avoid that or escape it?
Also, are you reusing test files for this? Splunk will not index a file that it has already indexed, even if you move it to a different directory and rename it. You can work around this with crcSalt in inputs.conf, although usually you don't want Splunk indexing the same data again! If this is really just test data, you can also delete the first few lines of each file, which will make Splunk see it as a different file.
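For test data only, a minimal sketch of the salt setting, reusing one of the stanzas from the question:

```ini
# inputs.conf (sketch for re-indexing test data; not something to leave on in production)
[monitor://C:\Documents and Settings\attgjh1\Desktop\Whitelist test\ceiinboundhandler*]
sourcetype = Whitelist_inbound
# The literal string <SOURCE> adds the file's full path to the CRC check,
# so a renamed or moved copy is treated as a new file.
crcSalt = <SOURCE>
```

Note that this only helps when the path or file name actually changes. A file left in place under the same name will still be recognized as already indexed.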
I'm using new test log data, created by modifying all the timestamps.
But even with spaces in my directory name, I had no problem monitoring the files initially.
As brought up earlier, each monitor must be unique, but my logs are all in the same directory (no subfolders), so I can't simply add more monitors, each with its own whitelist. It worked fine when I only wanted to monitor and filter one of the logs.
By "modify the time stamps", do you mean that you changed the data in the file, or just changed the timestamp of the file?
I changed all the timestamps associated with each event, setting them 2 years back. The data itself is often repeated, with only the timestamps varying. Splunk never had a problem reading the events separately as long as there was a time difference between them.
It worked the first time, when I whitelisted only one log file (so I think I'm feeding in new test data correctly), but the problem popped up when I tried to monitor the same directory with a different timestamp, which is an error in itself. So now I'm trying to figure out the solution you posted, but it doesn't seem to be working =/
Aha! Check out these settings in props.conf: MAX_DAYS_AGO, MAX_DIFF_SECS_AGO, MAX_DIFF_SECS_HENCE
You might also want to check out ignoreOlderThan in inputs.conf -- though this is disabled by default.
I am wondering if Splunk is not indexing this data because it is "too old".
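A sketch of where those settings would go, using a sourcetype name and path from the question. The values are purely illustrative, not recommendations. For events backdated 2 years, anything above roughly 730 days would be enough.

```ini
# props.conf (sketch; the value is illustrative)
[Whitelist_inbound]
# Maximum number of days in the past that an extracted timestamp may be
# and still count as valid.
MAX_DAYS_AGO = 1000

# inputs.conf (sketch)
[monitor://C:\Documents and Settings\attgjh1\Desktop\Whitelist test\ceiinboundhandler*]
sourcetype = Whitelist_inbound
# Left commented out on purpose: if enabled, files whose modification time
# is older than this window are skipped entirely.
# ignoreOlderThan = 7d
```

Keep in mind that MAX_DAYS_AGO looks at the timestamps inside the events, while ignoreOlderThan looks at the modification time of the file itself.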
hmm. strangely enough, they are disabled already.
I'll list the steps I've taken since I posted this question.
I used the default whitelist setting in Splunk Web and keyed in:
Fed in those logs, backdated 2 years. 6 events were recorded. It ignored the other files because of the whitelist (working as intended).
I attempted to create a 2nd monitor on the same directory for "ceipulsarhandler", but was unable to, as Splunk does not allow multiple monitors on the same directory.
continued in next comment