load compressed files

dmlee — Thu, 24 Mar 2011 08:06:11 GMT

Hi,

As we know, before Splunk indexes a compressed file, it decompresses the file first.

But if we have many compressed files in the same directory (e.g. ap_20110301.zip, ap_20110302.zip ...) and the file inside each archive has the same name (e.g. ap.log), what will happen?

Will Splunk decompress all of those files and then index them, or will it decompress and index them one by one?

Because the original file names are the same, if Splunk decompresses all of the files first, each extracted file will overwrite the previous one (this is actually what we observed, but we want to make sure).

Thanks.

Re: load compressed files

Stephen_Sorkin — Thu, 24 Mar 2011 08:56:54 GMT

Splunk never actually decompresses the files within archives to a temporary location on disk. Instead we use a library called "libarchive" that allows us to stream through the contents of archives. These streamed contents are then indexed.
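
To illustrate the streaming idea, here is a minimal sketch. It is not Splunk's code and it does not use libarchive; it assumes Python's standard zipfile module as a stand-in, and the helper names stream_archive_lines and make_zip are hypothetical. It reads the member of each archive directly from the archive stream, so two archives that both contain ap.log never collide on disk.

```python
import io
import zipfile


def stream_archive_lines(archives):
    """Yield (archive_name, member_name, line) for every line in every member.

    `archives` is an iterable of (name, path-or-file-like) pairs.
    Nothing is extracted to disk: each member is read as a stream.
    """
    for archive_name, source in archives:
        with zipfile.ZipFile(source) as zf:
            for info in zf.infolist():
                if info.is_dir():
                    continue
                # zf.open() returns a file-like object over the member's data.
                with zf.open(info) as member:
                    for raw in member:
                        line = raw.decode("utf-8").rstrip("\n")
                        yield archive_name, info.filename, line


def make_zip(member_name, text):
    """Build an in-memory zip holding one member (hypothetical demo helper)."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr(member_name, text)
    buf.seek(0)
    return buf


if __name__ == "__main__":
    # Two archives whose inner file has the same name, as in the question.
    archives = [
        ("ap_20110301.zip", make_zip("ap.log", "event from March 1\n")),
        ("ap_20110302.zip", make_zip("ap.log", "event from March 2\n")),
    ]
    for record in stream_archive_lines(archives):
        print(record)
```

Because each record carries the archive name alongside the member name, the two ap.log files stay distinguishable even though their inner names are identical.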

Re: load compressed files

dmlee — Thu, 24 Mar 2011 13:29:12 GMT

Lesson learned, thanks.