ArchiveProcessor - Bypassing normal system/local/props.conf processing for .dat files inside archives? (4.3.4)

Lucas_K
Motivator

For .dat files inside an archive, I cannot get Splunk to honor the settings in system/local/props.conf.

Example.

We have the following 6 unique log files. Note: The example below is proof to myself of the issue and not how my real world sources are gathered. However, I have two customer installations both using the same data format (.dat files inside zip files) so it has a direct customer impact right now.

1.log
2.dat
3.zip (contains 3.log)
4.zip (contains 4.dat)
56.zip (contains 5.log and 6.dat)

All zip files are created using the same method. All file names are unique. All file events inside the files are unique.
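
For anyone who wants to reproduce the test set above, here is a minimal sketch in Python. The function name `build_fixture`, the target directory, and the event text are my own assumptions for illustration; only the file names and archive layout come from the list above.

```python
import zipfile
from pathlib import Path


def build_fixture(root):
    """Create the six test log files (two standalone, four zipped) under root."""
    root = Path(root)
    root.mkdir(parents=True, exist_ok=True)

    def event(name):
        # One line per file, tagged with the file name so every event is unique.
        # The event text itself is made up for this sketch.
        return f"2012-10-03 14:50:32 unique event from {name}\n"

    # Standalone files.
    for name in ("1.log", "2.dat"):
        (root / name).write_text(event(name))

    # Archives, all written with the same method.
    archives = {
        "3.zip": ["3.log"],
        "4.zip": ["4.dat"],
        "56.zip": ["5.log", "6.dat"],
    }
    for zip_name, members in archives.items():
        with zipfile.ZipFile(root / zip_name, "w", zipfile.ZIP_DEFLATED) as zf:
            for member in members:
                zf.writestr(member, event(member))

    # Return the names on disk so the caller can check the layout.
    return sorted(p.name for p in root.iterdir())
```

Calling `build_fixture(r"c:\logs")` on an empty directory should leave the five files on disk that the monitor stanza below picks up.
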
The app uses the following inputs.conf:

[monitor://c:\logs\]
index=logs
sourcetype=logs
followTail=0
alwaysOpenFile = 1
whitelist = \.dat$|\.log$|\.zip$
crcSalt = <SOURCE>

Because .dat is a known binary extension, I also use a system/local/props.conf stanza to stop the files from being ignored (as per http://splunk-base.splunk.com/answers/11118/how-to-monitor-datgz-files ):

[source::.+logs.+....(dat)]
sourcetype=logs
priority = 20

Now the weird thing is that 1.log, 2.dat and 3.zip WILL be indexed correctly, while any archive containing a .dat file is ignored. So it seems the above stanza works fine for standalone .dat files but not for ones contained inside archives.

So I check splunkd.log for hints as to what is going on.

 14:50:32.615 +1000 INFO  ArchiveProcessor - reading path=c:\logs\56.zip (seek=0 len=1595)
10-03-2012 14:50:32.615 +1000 INFO  ArchiveProcessor - Archive with path="c:\logs\56.zip" was already indexed as a non-archive, skipping. 
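
To list every archive that hit this skip path, a small Python sketch like the one below can scan splunkd.log. It assumes the message text is exactly as in the line above; the function name `skipped_archives` is hypothetical, and the location of splunkd.log depends on your installation.

```python
import re

# Matches the ArchiveProcessor "skipping" message quoted above and captures the path.
SKIPPED = re.compile(
    r'ArchiveProcessor - Archive with path="(?P<path>[^"]+)" '
    r"was already indexed as a non-archive, skipping"
)


def skipped_archives(lines):
    """Return the archive paths that ArchiveProcessor reported as skipped."""
    paths = []
    for line in lines:
        match = SKIPPED.search(line)
        if match:
            paths.append(match.group("path"))
    return paths
```

Usage would be along the lines of `skipped_archives(open(path_to_splunkd_log))`, where `path_to_splunkd_log` is whatever your install uses.
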

So Splunk believes it has seen the file before, even though it hasn't. I can re-salt the files by renaming them and they will all be indexed again, with the exception of any zip file with a .dat file inside.

This led me to a post at the end of an OLD thread from May 2011 (Splunk v4.2 at the time) ( http://splunk-base.splunk.com/answers/24578/rolled-logs-compressed-immediately ).

Is there some magical setting in system/local/props.conf that I need in order to set the sourcetype of .dat files INSIDE archives, OR is this a known bug?


Lucas_K
Motivator

Update: I have found a temporary solution to this! (It's bad [I wouldn't be surprised to see someone from Splunk complain loudly NOT to do this], and it will probably break your input again on the next Splunk upgrade, but by then hopefully it's fixed.)

Simply edit etc/system/default/props.conf and remove dat from the "known_binary" stanza.

So just replace the following:

[source::....(0t|a|ali|asa|au|bmp|cg|cgi|class|d|dat|deb|del|dot|dvi|dylib|elc|eps|exe|ftn|gif|hlp|hqx|hs|icns|ico|inc|iso|jame|jin|jpeg|jpg|kml|la|lhs|lib|lo|lock|mcp|mid|mp3|mpg|msf|nib|o|obj|odt|ogg|ook|opt|os|pal|pbm|pdf|pem|pgm|plo|png|po|pod|pp|ppd|ppm|ppt|prc|ps|psd|psym|pyc|pyd|rast|rb|rde|rdf|rdr|rgb|ro|rpm|rsrc|so|ss|stg|strings|tdt|tif|tiff|tk|uue|vhd|xbm|xlb|xls|xlw)]
sourcetype = known_binary

with

[source::....(0t|a|ali|asa|au|bmp|cg|cgi|class|d|deb|del|dot|dvi|dylib|elc|eps|exe|ftn|gif|hlp|hqx|hs|icns|ico|inc|iso|jame|jin|jpeg|jpg|kml|la|lhs|lib|lo|lock|mcp|mid|mp3|mpg|msf|nib|o|obj|odt|ogg|ook|opt|os|pal|pbm|pdf|pem|pgm|plo|png|po|pod|pp|ppd|ppm|ppt|prc|ps|psd|psym|pyc|pyd|rast|rb|rde|rdf|rdr|rgb|ro|rpm|rsrc|so|ss|stg|strings|tdt|tif|tiff|tk|uue|vhd|xbm|xlb|xls|xlw)]
sourcetype = known_binary
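
Hand-editing that long extension list is error-prone, so here is a rough Python sketch of the same change applied to a stanza header string. The helper name `drop_extension` is my own, and the header in the example is a shortened stand-in for the real one; writing the result back to props.conf is left out on purpose, since this edit to the default file is a workaround that an upgrade will likely overwrite.

```python
import re


def drop_extension(stanza_header, ext):
    """Return the stanza header with ext removed from its (a|b|c) extension list."""
    match = re.search(r"\(([^()]*)\)", stanza_header)
    if match is None:
        raise ValueError("no (a|b|c) alternation found in stanza header")
    # Compare whole entries so that removing "dat" leaves "d" untouched.
    kept = [e for e in match.group(1).split("|") if e != ext]
    return (
        stanza_header[: match.start(1)]
        + "|".join(kept)
        + stanza_header[match.end(1):]
    )


# Shortened, hypothetical header; the real stanza lists many more extensions.
header = "[source::....(d|dat|deb)]"
print(drop_extension(header, "dat"))  # prints [source::....(d|deb)]
```

Splitting on `|` and comparing whole entries avoids the substring trap of a plain search-and-replace.
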
