Getting Data In

Splunk fails to monitor zip file

Contributor

Hello,

Trying to have Splunk monitor standard scan-reports from Foundstone (Vulnerability Assessment Scanner), but repeatedly seeing this in the splunkd.log:

11-22-2011 17:13:26.759 -0500 ERROR ArchiveFile - In archive '/data/splunk/splunk-4.2.4/var/spool/splunk/Monthly-Full-2010-102811.csv.zip': Bad ZIP file

This zip file opens fine on a Windows system with the built-in zip support, and on Linux with "unzip".

  • Any ideas what is causing the problem?
  • Is it possible that Foundstone uses a compression algorithm that Splunk doesn't understand and if so, how can we test for this?
  • Any idea on how to get around it besides a scripted input?

Thanks,
Sean

1 Solution

Contributor

Answering my own question.

The problem we found with Foundstone is that it saves the CSV report in a hierarchical directory structure, using Windows-style backslash characters to separate directories. This is normally OK, but I believe the Foundstone zipping function writes the first directory separator in some strange way, so that Linux/Python interprets it as a literal backslash character and not as a directory separator.

You can see with the Linux unzip command that the file is not corrupt, but the resulting contents look funny:

sean@ubuntu:/tmp/temp$ unzip -lvt Monthly-Full-2010-102811.csv.zip 
Archive:  Monthly-Full-2010-102811.csv.zip
    testing: 18\CSV/en/authenticated_hosts.csv   OK
    testing: 18\CSV/en/csvmanifest.xml   OK
    testing: 18\CSV/en/network_assets.csv   OK
    testing: 18\CSV/en/vulnerabilities.csv   OK
No errors detected in compressed data of Monthly-Full-2010-102811.csv.zip.

I believe that Splunk's monitoring process is doing some input validation and getting stuck on this backslash character.
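The same check can be done from Python, which also answers the question about compression methods. This is only a sketch, assuming Python 3 and its standard zipfile module; the function name describe_entries is made up for illustration, and it says nothing about how Splunk itself parses the archive. It lists each entry's raw stored name, its compression method, and whether the name contains a backslash:

```python
import zipfile


def describe_entries(zip_path):
    """Return (raw name, compression method, has backslash) for each entry.

    describe_entries is a hypothetical helper name, not part of Splunk
    or Foundstone.
    """
    # Only the two most common methods are named; anything else is
    # reported by its numeric code.
    methods = {zipfile.ZIP_STORED: "stored", zipfile.ZIP_DEFLATED: "deflated"}
    rows = []
    with zipfile.ZipFile(zip_path) as archive:
        for info in archive.infolist():
            # orig_filename is the name exactly as stored in the archive.
            raw_name = info.orig_filename
            method = methods.get(
                info.compress_type, "other (%d)" % info.compress_type
            )
            rows.append((raw_name, method, "\\" in raw_name))
    return rows


if __name__ == "__main__":
    for row in describe_entries("Monthly-Full-2010-102811.csv.zip"):
        print(row)
```

If every entry reports an ordinary method such as "deflated" but the names contain backslashes, the naming rather than the compression is the more likely culprit.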

The way I found to get around this issue is to write a small wrapper that unzips the file in advance, and then have Splunk eat the files inside.
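A minimal sketch of what such a wrapper could look like, assuming Python 3; this is an illustration and not the original script. The function name extract_normalized and the paths in the usage lines are placeholders. It treats backslashes in entry names as directory separators and refuses any entry that would land outside the destination directory:

```python
import os
import shutil
import zipfile


def extract_normalized(zip_path, dest_dir):
    """Extract zip_path into dest_dir, treating '\\' in names as '/'.

    Returns the list of files written. extract_normalized is a
    hypothetical helper name used only for this example.
    """
    written = []
    dest_root = os.path.realpath(dest_dir)
    with zipfile.ZipFile(zip_path) as archive:
        for info in archive.infolist():
            # Normalize Windows-style separators in the stored name.
            name = info.orig_filename.replace("\\", "/")
            if name.endswith("/"):
                continue  # directory entry, nothing to write
            target = os.path.realpath(
                os.path.join(dest_root, *name.split("/"))
            )
            # Refuse entries such as '../x' that escape the destination.
            if os.path.commonpath([dest_root, target]) != dest_root:
                raise ValueError("unsafe entry name: %r" % info.orig_filename)
            os.makedirs(os.path.dirname(target), exist_ok=True)
            with archive.open(info) as src, open(target, "wb") as dst:
                shutil.copyfileobj(src, dst)
            written.append(target)
    return written


if __name__ == "__main__":
    # Placeholder paths: adjust to wherever the reports are dropped.
    for path in extract_normalized(
        "Monthly-Full-2010-102811.csv.zip", "/tmp/foundstone-extracted"
    ):
        print(path)
```

Splunk would then be pointed at the extraction directory instead of at the zip file, so it only ever sees plain CSV and XML files.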

I found no output options in the Foundstone management UI that could control this behavior.

Best,

Sean

Contributor

With Foundstone or some other application?

Motivator

Great, this is exactly what I needed. If it's not too much trouble, can you post the unzip code you used? Thanks ever so much. I am using Foundstone too and want to get the scan data directly without the operator having to uncompress the reports.

Motivator

I am having the same issue. Did you ever find a solution?
