I think I'm missing a clue here. I have logs being dumped in /var/log/splunk. Most devices are appliances that are not in DNS, and I have resolution turned off in syslog-ng anyway. So I end up with hundreds of directories named by IP address under /var/log/splunk. FWIW, my full log paths are actually /var/log/splunk/host/month/day.log
Often I'll have a range of addresses that are part of a single system, so I want them all in a single index with a single source type. I figured it would be easy to write a monitor:: stanza to pick up a range of IPs, and it was, except it does not work.
So here is what I have. What did I do wrong? In this example I want to pick up everything below /var/log/splunk/10.10.10.132 through /var/log/splunk/10.10.10.141
[monitor:///var/log/splunk/.../10\.10\.10\.(1(3[2-9]|4[0-1]))]
host_segment=4
sourcetype=bar
index=foo
...etc..
I'm also wondering what kind of hit I'm putting on my syslog forwarder when I start using regex in inputs.conf. Is it better to just have an individual monitor:: line for each IP? It would be easy enough to write a script that auto-generates my inputs.conf from a list of IPs. In the end, I will have thousands of devices sending data over syslog.
Maybe this will behave differently?
Or perhaps you could break it into two
Thousands of entries in inputs.conf will not work. In fact, if you are going to monitor a very large number of files (like 5000 or more), you should consider an indexing and archiving strategy that removes older files from being monitored. This is true no matter how many (or few) monitor stanzas you have.
BTW, here is the reference in the docs that talks about input paths (although I think you may have already found it)
Specify input paths with wildcards
I gave this a try, and it still does not work. At least I now understand that a . in the monitor stanza of inputs.conf is not a regex . character. I simplified the regex to 10.10.10.1[7-9] and tried it with various combinations of trailing /, ... and *. Nothing works.
I don't think a whitelist/blacklist will work. It looks like those are for files, not directories.
I'm wondering if the location of the ... is the problem. My actual path is /var/log/splunk/host/month/day.log. For example, today that directory would have a file /var/log/splunk/hostname/12/19.log. Tomorrow that file would be 20.log, and 19.log would be gone and archived. In my case, hostname is always an IP address, and I am going to have thousands of them, most of them in sizable groups of sequential IPs.
I like having my incoming syslog files broken up by IP and day. It seems to be a very common way to set up rsyslog or syslog-ng. Perhaps I need to rethink the whole approach to make this more manageable.
I'm starting to think it is time for a script that generates inputs.conf stanzas. Just feed it a list of IPs and the associated index name I want.
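Something like the following is what I have in mind. It is only a rough Python sketch: the function names, the default base path and the fixed host_segment = 4 are my own assumptions (host_segment = 4 only makes sense when the host directory sits directly under /var/log/splunk), and it does nothing about the total number of stanzas it produces.

```python
import ipaddress
from typing import Iterable, Tuple

def make_stanza(ip: str, index: str, sourcetype: str,
                base: str = "/var/log/splunk") -> str:
    """Build one monitor stanza for a single host directory.

    Assumption: the directory layout is <base>/<ip>/<month>/<day>.log,
    so the IP is the 4th path segment when base is /var/log/splunk.
    """
    ipaddress.ip_address(ip)  # raises ValueError on a malformed address
    lines = [
        f"[monitor://{base}/{ip}]",
        "host_segment = 4",
        f"sourcetype = {sourcetype}",
        f"index = {index}",
    ]
    return "\n".join(lines)

def generate(entries: Iterable[Tuple[str, str, str]]) -> str:
    """Join stanzas for (ip, index, sourcetype) tuples into one text blob."""
    return "\n\n".join(make_stanza(ip, idx, st) for ip, idx, st in entries) + "\n"

if __name__ == "__main__":
    # Hypothetical input: ten sequential hosts that share one index.
    hosts = [(f"10.10.10.{n}", "foo", "bar") for n in range(132, 142)]
    print(generate(hosts))
```

The output would be pasted into (or written over) inputs.conf, one stanza per IP.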
Whitelists and blacklists will work for both directories and files. It's just that people always seem to use files in their examples...
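If you want to sanity-check a directory pattern before putting it in a whitelist, a quick Python sketch like this can help. It assumes the whitelist is an unanchored regex tested against the full path of each file, and it uses Python's re module rather than Splunk's own regex engine, so it only verifies the pattern itself. The pattern, the sample paths and the names WHITELIST and matches are made up for illustration.

```python
import re

# Hypothetical whitelist pattern: hosts 10.10.10.132 through 10.10.10.141,
# matched as a whole directory component of the path.
WHITELIST = re.compile(r"/10\.10\.10\.1(3[2-9]|4[01])/")

def matches(path: str) -> bool:
    """Return True if the pattern is found anywhere in the full path."""
    return WHITELIST.search(path) is not None

if __name__ == "__main__":
    samples = [
        "/var/log/splunk/10.10.10.131/12/19.log",
        "/var/log/splunk/10.10.10.132/12/19.log",
        "/var/log/splunk/10.10.10.141/12/19.log",
        "/var/log/splunk/10.10.10.142/12/19.log",
    ]
    for p in samples:
        print(p, matches(p))
```

The slashes on both sides of the pattern keep it from matching a longer address that merely starts or ends with the same digits.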
Yep, I think that your ... could be in the wrong place. And I wonder if the
 in the regular expression will break the monitor stanza. It shouldn't, but...
You could do
But before you change anything, run this
./splunk cmd btool inputs list --debug >inputs.debug.list
This may help - it shows how all the inputs.conf stanzas are combined. This won't catch everything, but it may give you some insight.