I went through other Splunk Answers questions, but nothing that I think should work is working here.
Essentially, I have a bunch of web servers sending in Apache logs via a LWF on each web server, with an inputs.conf of:
[monitor:///data/logs/httpd/]
disabled = false
followTail = 1
index = weblog
blacklist = (.*\.gz|.*\.tgz|.*\.tar.gz|.*archive.*)$
They are currently coming in as various auto-generated sourcetypes, so I wanted to assign access_combined and apache_error to the files that should have them.
I added this to /system/local/props.conf on the indexer and restarted splunk:
[source::...access*]
sourcetype=access_combined

[source::...error*]
sourcetype=apache_error
Only certain access_combined logs are working properly, such as:
Whereas any of these are not:
Note that the files with the rotatelogs timestamp appended to the end are the ones that are not getting the sourcetype set. What's really interesting is that all of the apache_error ones are getting the sourcetype applied properly, regardless of whether they have the rotatelogs timestamp appended or not.
I could not find anything in system/default/props.conf that would cause this, and my use of ... and * appears to be correct as far as I can tell. Any thoughts?
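One way to sanity-check the wildcards outside Splunk is to translate the source:: pattern into a regular expression and try it against a few paths. The sketch below is a simplified approximation, assuming the documented semantics that ... matches any characters including the path separator, * matches any characters except the path separator, and the whole source name must match. The helper names and the sample file names (including the timestamp suffix) are hypothetical, not taken from the actual servers, and the real matcher handles more cases than this.

```python
import re


def splunk_source_pattern_to_regex(pattern: str) -> "re.Pattern[str]":
    """Approximate a props.conf source:: wildcard pattern as an anchored regex.

    Assumed semantics (simplified): '...' matches anything including '/',
    '*' matches anything except '/', every other character is literal.
    """
    parts = []
    i = 0
    while i < len(pattern):
        if pattern.startswith("...", i):
            parts.append(".*")       # '...' may cross directory boundaries
            i += 3
        elif pattern[i] == "*":
            parts.append("[^/]*")    # '*' stays within one path segment
            i += 1
        else:
            parts.append(re.escape(pattern[i]))
            i += 1
    # Anchor both ends: the whole source name has to match.
    return re.compile("^" + "".join(parts) + "$")


def matches(pattern: str, source: str) -> bool:
    """Return True if the source path matches the wildcard pattern."""
    return splunk_source_pattern_to_regex(pattern).match(source) is not None


# Hypothetical file names, for illustration only.
samples = [
    "/data/logs/httpd/access_log",
    "/data/logs/httpd/access_log.1290000000",
    "/data/logs/httpd/error_log.1290000000",
]
for source in samples:
    print(source, matches("...access*", source), matches("...error*", source))
```

Under these assumptions a timestamp suffix does not stop ...access* from matching, which is consistent with the pattern itself not being the problem.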
and that was it. Moral of the story:
sourcetypes get applied at the LWF
if the source has been monitored for some time and you change something like the sourcetype, clear out splunk_install/etc/apps/learned/local/sourcetypes.conf
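The second point can be scripted. This is a minimal sketch that moves the learned sourcetypes.conf aside instead of deleting it, so it can be restored if needed. The function name and the splunk_home argument are hypothetical, and stopping the forwarder first and restarting it afterwards is my assumption rather than something confirmed in this thread.

```python
import shutil
import time
from pathlib import Path
from typing import Optional


def set_aside_learned_sourcetypes(splunk_home: str) -> Optional[Path]:
    """Rename etc/apps/learned/local/sourcetypes.conf to a timestamped backup.

    splunk_home is the Splunk install directory on the forwarder (hypothetical
    argument). Returns the backup path, or None if there was no file to move.
    """
    conf = Path(splunk_home) / "etc" / "apps" / "learned" / "local" / "sourcetypes.conf"
    if not conf.is_file():
        return None
    # Keep a copy rather than deleting, so the change is easy to undo.
    backup = conf.with_name(f"{conf.name}.{int(time.time())}.bak")
    shutil.move(str(conf), str(backup))
    return backup
```

Calling set_aside_learned_sourcetypes with the forwarder's install directory returns the path of the backup file, or None when no learned sourcetypes.conf exists.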
Heh, I need a drink after this one. I had been looking at the sourcetypes.conf for learned on the indexer - but really should have been on the LWF. I bet clearing that out and putting the settings back on the LWF will make this all better. Will give it a whirl once I have clearance tomorrow.
aieee....and a couple of other things of note. Moving the sourcetype changes in props.conf to the LWF didn't make a difference. It also appears that the working access_combined and apache_error sourcetypes were already being applied automatically before my changes - I just hadn't been looking at them back then. So there is still something going on with the filename.appended_timestamp files.
Hrm...actually it looks like I read that link I put above incorrectly. It shows sourcetype being set at the input phase which would be on the LWF in that combo. Not sure why the apache_error worked though....
Had thought about that, but after reading here: http://www.splunk.com/wiki/Where_do_I_configure_my_Splunk_settings%3F, it seems I should be able to do this at the indexer level.
Definitely not looking to pull in old archived files. The way apache rotatelogs works is it will rotate the access_log at a certain size and the new log file will get a timestamp appended to it - so I can't just watch for access_log. The actual monitor has been working fine for some time now, just wanted to get rid of the random sourcetypes that were being added.
I can't say that this is your problem, but in our case, we set the sourcetype in props.conf on the LWF itself, not on the indexer so the LWF sends with the proper metadata attached already.
So you actually do want it to pull in old archived log files? As opposed to say, just whitelisting *.log$ ?