I have a situation in which Splunk does not seem to honor the settings in system/local/props.conf for .dat files inside an archive.
We have the following 6 unique log files. Note: The example below is a proof to myself of the issue and not how my real-world sources are gathered. However, I have two customer installations that both use the same data format (.dat files inside zip files), so it has a direct customer impact right now.
Now the weird thing is that 1.log, 2.dat and 3.zip WILL be indexed correctly, but the archive containing the .dat file is ignored. So it seems the above stanza works fine for standalone .dat files but not for ones contained inside archives.
So I checked splunkd.log for hints as to what is going on.
14:50:32.615 +1000 INFO ArchiveProcessor - reading path=c:\logs\56.zip (seek=0 len=1595)
10-03-2012 14:50:32.615 +1000 INFO ArchiveProcessor - Archive with path="c:\logs\56.zip" was already indexed as a non-archive, skipping.
So Splunk believes it has seen the file before even though it hasn't. I can re-salt the files by renaming them and they will all be indexed again, with the exception of any zip file with a .dat file inside.
Update: I have found a temporary solution to this! It's bad (I wouldn't be surprised to see someone from Splunk complain loudly NOT to do this) and will probably break your input again on the next Splunk upgrade, but hopefully the issue is fixed by then.
Simply edit etc/system/default/props.conf and remove dat from the "known_binary" stanza.
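For anyone who would rather script that edit than do it by hand, here is a rough sketch of the string change involved. It assumes the stanza header carries the extensions as a `(a|b|c)` alternation at the end of the bracketed name; the shortened header in the example is hypothetical (the real list is much longer), and `drop_extension` is just an illustrative helper name, not anything Splunk ships.

```python
import re


def drop_extension(stanza_header: str, ext: str) -> str:
    """Remove one extension from a trailing '(a|b|c)' group in a stanza header."""

    def rewrite(match: re.Match) -> str:
        # Keep every alternative except the one being removed.
        kept = [e for e in match.group(1).split("|") if e != ext]
        return "(" + "|".join(kept) + ")"

    # Only touch the parenthesised group that sits right before the closing ']'.
    return re.sub(r"\(([^()]*)\)(?=\]$)", rewrite, stanza_header)


# Hypothetical, shortened header (assumption: the real one lists many more extensions).
header = "[source::....(bmp|dat|exe)]"
print(drop_extension(header, "dat"))
```

This only rewrites a string in memory; back up props.conf before writing anything back to it, and restart Splunk afterwards so the change is picked up.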