<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Problems indexing into zip files in Splunk Search</title>
    <link>https://community.splunk.com/t5/Splunk-Search/Problems-indexing-into-zip-files/m-p/109282#M28497</link>
    <description>&lt;P&gt;Looks like I had the same problem you had. The last one should have been "pulse_.*\.csv", as you put in your last comment.&lt;/P&gt;</description>
    <pubDate>Tue, 22 Oct 2013 13:50:20 GMT</pubDate>
    <dc:creator>tim9gray</dc:creator>
    <dc:date>2013-10-22T13:50:20Z</dc:date>
    <item>
      <title>Problems indexing into zip files</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Problems-indexing-into-zip-files/m-p/109272#M28487</link>
      <description>&lt;P&gt;Hi All,&lt;/P&gt;

&lt;P&gt;I am monitoring files that land in the same directory that I wish to be considered as different source types.  The way &lt;BR /&gt;
I want to distinguish them is with their names.  There will be three different source types and they will be csv files.&lt;BR /&gt;
The naming conventions will be &lt;CODE&gt;time_*.csv, pulse_*.csv, and flow_*.csv&lt;/CODE&gt;.&lt;/P&gt;

&lt;P&gt;I actually have this working using the following in inputs.conf:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;[monitor://C:\tpg\leamcsv\dualgamma_logs\...\pulse_*.csv]
sourcetype = DGC_PULSE
index=main
host_segment = 4
crcSalt = &amp;lt;SOURCE&amp;gt;

[monitor://C:\tpg\leamcsv\dualgamma_logs\...\flow_*.csv]
sourcetype = DGC_FLOW
index=main
host_segment = 4
crcSalt = &amp;lt;SOURCE&amp;gt;

[monitor://C:\tpg\leamcsv\dualgamma_logs\...\time_*.csv]
sourcetype = DGC_TIME
index=main
host_segment = 4
crcSalt = &amp;lt;SOURCE&amp;gt;
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;This works exactly as I want.  The use of crcSalt turns out to be necessary as many of the files have meta information that &lt;BR /&gt;
is identical and this forces the indexer to consider them all.&lt;/P&gt;

&lt;P&gt;As I said, the above works fine as long as the files to be monitored are landed as .csv files.  My requirements have changed&lt;BR /&gt;
and I will now be landing *.zip files containing the desired .csv files.&lt;/P&gt;

&lt;P&gt;It is not clear to me why, but splunk is not indexing the zip files using the above configuration.  Everything I read would seem&lt;BR /&gt;
to indicate that it should index the zip files.  Perhaps the monitor stanza is excluding the zip files - I haven't been able to figure&lt;BR /&gt;
that one out.&lt;/P&gt;

&lt;P&gt;I can say that if the monitor stanza is left open (&lt;CODE&gt;[monitor://C:\tpg\leamcsv\dualgamma_logs\...\]&lt;/CODE&gt;), it will index the contents of the zip files, but that leaves me unable to distinguish&lt;BR /&gt;
the different sourcetypes (at least not in the way that I was doing).&lt;/P&gt;

&lt;P&gt;After doing some research, I read that attempting to index multiple sourcetypes from a common directory could lead to inconsistent&lt;BR /&gt;
results (I don't have that link handy at the moment). At any rate, the suggestion was to use a more open monitor stanza, as I mentioned&lt;BR /&gt;
in the previous paragraph, and assign the sourcetype on a per-event basis or in props.conf.  I chose to do this in props.conf. I &lt;BR /&gt;
am using the following configuration:&lt;/P&gt;

&lt;P&gt;inputs.conf:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;[monitor://C:\tpg\leamcsv\dualgamma_logs\...\]
index=main
host_segment = 4
crcSalt = &amp;lt;SOURCE&amp;gt;
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;props.conf&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;[source::...\pulse_*\.csv]
sourcetype=DGC_PULSE

[source::...\flow_*\.csv]
sourcetype=DGC_FLOW

[source::...\time_*\.csv]
sourcetype=DGC_TIME
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;The problem I see now is that none of my expected sourcetypes are assigned.  Instead, I get csv, csv1, csv2, etc...  for sourcetypes.&lt;BR /&gt;
I suspect the issue is with my regular expressions I have used in props.conf.  From everything I have read, these look like they&lt;BR /&gt;
are correct, but I haven't been able to figure out what I am missing.&lt;/P&gt;

&lt;P&gt;Does anyone have any suggestions about my approach, and/or what might be wrong with my regular expressions?&lt;/P&gt;

&lt;P&gt;Thanks&lt;/P&gt;</description>
      <pubDate>Mon, 21 Oct 2013 22:22:46 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Problems-indexing-into-zip-files/m-p/109272#M28487</guid>
      <dc:creator>tim9gray</dc:creator>
      <dc:date>2013-10-21T22:22:46Z</dc:date>
    </item>
    <item>
      <title>Re: Problems indexing into zip files</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Problems-indexing-into-zip-files/m-p/109273#M28488</link>
      <description>&lt;P&gt;How about...&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;[monitor://C:\tpg\leamcsv\dualgamma_logs\...\pulse_*]
sourcetype = DGC_PULSE
index=main
host_segment = 4
crcSalt = &amp;lt;SOURCE&amp;gt;
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;That would work regardless of whether they are .zip or .csv.&lt;/P&gt;

&lt;P&gt;Are they being bundled inside of a single .zip?&lt;/P&gt;

&lt;P&gt;If so:&lt;BR /&gt;
inputs.conf&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;[monitor://C:\tpg\leamcsv\dualgamma_logs\...\]
sourcetype = DGC_TIME
index=main
host_segment = 4
crcSalt = &amp;lt;SOURCE&amp;gt;
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;transforms.conf&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;[transform_name1]
SOURCE_KEY = MetaData:Source
REGEX = pulse_*\.csv
DEST_KEY = MetaData:Sourcetype
FORMAT =  sourcetype::DGC_PULSE

[transform_name2]
SOURCE_KEY = MetaData:Source
REGEX = flow_*\.csv
DEST_KEY = MetaData:Sourcetype
FORMAT =  sourcetype::DGC_FLOW
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;props.conf&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;[DGC_TIME]
TRANSFORMS-transform_name = transform_name1, transform_name2
TIME_FORMAT = timeformat
SHOULD_LINEMERGE = false|true
&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Tue, 22 Oct 2013 02:46:30 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Problems-indexing-into-zip-files/m-p/109273#M28488</guid>
      <dc:creator>ShaneNewman</dc:creator>
      <dc:date>2013-10-22T02:46:30Z</dc:date>
    </item>
    <item>
      <title>Re: Problems indexing into zip files</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Problems-indexing-into-zip-files/m-p/109274#M28489</link>
      <description>&lt;P&gt;After you do this, you will need to go to yoursplunkurl:8000/info and click reload EAI Objects wherever these configs are deployed: UF (which will need an instance restart), indexer, etc.&lt;/P&gt;

&lt;P&gt;You may even want to restart the instance just for good measure.&lt;/P&gt;</description>
      <pubDate>Tue, 22 Oct 2013 02:52:49 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Problems-indexing-into-zip-files/m-p/109274#M28489</guid>
      <dc:creator>ShaneNewman</dc:creator>
      <dc:date>2013-10-22T02:52:49Z</dc:date>
    </item>
    <item>
      <title>Re: Problems indexing into zip files</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Problems-indexing-into-zip-files/m-p/109275#M28490</link>
      <description>&lt;P&gt;Thanks for the input.  I tried this and am still getting csv, csv_1, etc for sourcetype.  I did splunk clean all on both my splunk instance and my universal forwarder.&lt;/P&gt;

&lt;P&gt;I think I understand what you have suggested and it looks very similar to what I was initially trying.  Is it substantially different? &lt;/P&gt;

&lt;P&gt;I am guessing that it is still failing on the regexes being used.&lt;/P&gt;</description>
      <pubDate>Tue, 22 Oct 2013 03:38:13 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Problems-indexing-into-zip-files/m-p/109275#M28490</guid>
      <dc:creator>tim9gray</dc:creator>
      <dc:date>2013-10-22T03:38:13Z</dc:date>
    </item>
    <item>
      <title>Re: Problems indexing into zip files</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Problems-indexing-into-zip-files/m-p/109276#M28491</link>
      <description>&lt;P&gt;this is what I am using in transforms.conf:&lt;/P&gt;

&lt;P&gt;[transform_name1]&lt;BR /&gt;
SOURCE_KEY = MetaData:Source&lt;BR /&gt;
REGEX = pulse_*.csv&lt;BR /&gt;
DEST_KEY = MetaData:Sourcetype&lt;BR /&gt;
FORMAT =  sourcetype::DGC_PULSE&lt;/P&gt;

&lt;P&gt;and this is what I am using in props.conf:&lt;/P&gt;

&lt;P&gt;[DGC_PULSE]&lt;BR /&gt;
TRANSFORMS-transform_name = transform_name1&lt;/P&gt;

&lt;P&gt;I am not sure about this one. In particular, I am not sure how the stanza name maps to the sourcetype, although I must admit I haven't looked at the docs on this yet...&lt;/P&gt;</description>
      <pubDate>Mon, 28 Sep 2020 15:02:03 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Problems-indexing-into-zip-files/m-p/109276#M28491</guid>
      <dc:creator>tim9gray</dc:creator>
      <dc:date>2020-09-28T15:02:03Z</dc:date>
    </item>
    <item>
      <title>Re: Problems indexing into zip files</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Problems-indexing-into-zip-files/m-p/109277#M28492</link>
      <description>&lt;P&gt;What are the source names?&lt;/P&gt;</description>
      <pubDate>Tue, 22 Oct 2013 03:54:34 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Problems-indexing-into-zip-files/m-p/109277#M28492</guid>
      <dc:creator>ShaneNewman</dc:creator>
      <dc:date>2013-10-22T03:54:34Z</dc:date>
    </item>
    <item>
      <title>Re: Problems indexing into zip files</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Problems-indexing-into-zip-files/m-p/109278#M28493</link>
      <description>&lt;P&gt;Btw, you can replace transform_name1 and transform_name2 with anything you want; I was just using them as filler names. Just make sure the same names are used in props.conf.&lt;/P&gt;</description>
      <pubDate>Tue, 22 Oct 2013 03:56:30 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Problems-indexing-into-zip-files/m-p/109278#M28493</guid>
      <dc:creator>ShaneNewman</dc:creator>
      <dc:date>2013-10-22T03:56:30Z</dc:date>
    </item>
    <item>
      <title>Re: Problems indexing into zip files</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Problems-indexing-into-zip-files/m-p/109279#M28494</link>
      <description>&lt;P&gt;Ah, also the last bit goes in the props.conf.&lt;/P&gt;

&lt;P&gt;What we are doing is saying that, by default, all data from the inputs path is to be known as source type DGC_TIME. Then in props.conf (by way of transforms.conf) we say that if the source matches &lt;CODE&gt;pulse_*.csv&lt;/CODE&gt; its source type should be DGC_PULSE, and if it matches &lt;CODE&gt;flow_*.csv&lt;/CODE&gt; it should be source type DGC_FLOW.&lt;/P&gt;

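&lt;P&gt;The default-plus-override logic described above can be sketched in a few lines of Python. This is only an illustration, not Splunk code: the function name classify and the pulse_/flow_ sample file names are hypothetical, and Python's re module stands in for Splunk's regex engine on the assumption that both treat patterns this simple the same way. The patterns use .* after the underscore.&lt;/P&gt;

```python
import re

# Default sourcetype, as assigned by the monitor stanza in inputs.conf.
DEFAULT_SOURCETYPE = "DGC_TIME"

# Ordered (pattern, sourcetype) overrides, mirroring the two transforms.
OVERRIDES = [
    (re.compile(r"pulse_.*\.csv"), "DGC_PULSE"),
    (re.compile(r"flow_.*\.csv"), "DGC_FLOW"),
]


def classify(source):
    """Return the sourcetype a source path would end up with (hypothetical helper)."""
    sourcetype = DEFAULT_SOURCETYPE
    for pattern, override in OVERRIDES:
        # Unanchored search, since the file name sits at the end of a longer path.
        if pattern.search(source):
            sourcetype = override
    return sourcetype


print(classify("pulse_DGC_DG14_23_2013_10_09_09_07_37.csv"))
```

&lt;P&gt;A source containing pulse_ or flow_ is rewritten to the matching sourcetype, and anything else, such as the time_ files, keeps the DGC_TIME default.&lt;/P&gt;
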
&lt;P&gt;And I just noticed I left out the . before the *. So, replace &lt;CODE&gt;_*\.csv&lt;/CODE&gt; in the regex with &lt;CODE&gt;_.*\.csv&lt;/CODE&gt;.&lt;/P&gt;</description>
      <pubDate>Mon, 28 Sep 2020 15:02:05 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Problems-indexing-into-zip-files/m-p/109279#M28494</guid>
      <dc:creator>ShaneNewman</dc:creator>
      <dc:date>2020-09-28T15:02:05Z</dc:date>
    </item>
    <item>
      <title>Re: Problems indexing into zip files</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Problems-indexing-into-zip-files/m-p/109280#M28495</link>
      <description>&lt;P&gt;iPad isn't letting me select code "_.*\.csv"&lt;/P&gt;</description>
      <pubDate>Tue, 22 Oct 2013 04:07:41 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Problems-indexing-into-zip-files/m-p/109280#M28495</guid>
      <dc:creator>ShaneNewman</dc:creator>
      <dc:date>2013-10-22T04:07:41Z</dc:date>
    </item>
    <item>
      <title>Re: Problems indexing into zip files</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Problems-indexing-into-zip-files/m-p/109281#M28496</link>
      <description>&lt;P&gt;the source names look something like this:&lt;BR /&gt;
time_DGC_DG14_23_2013_10_09_09_07_37.csv&lt;/P&gt;

&lt;P&gt;So are you saying the regex ought to look something like this:&lt;BR /&gt;
&lt;CODE&gt;pulse_.*csv&lt;/CODE&gt; or &lt;CODE&gt;pulse_..csv&lt;/CODE&gt; or &lt;CODE&gt;pulse_.*\csv&lt;/CODE&gt;?  None of those seem obvious to me.&lt;/P&gt;</description>
      <pubDate>Mon, 28 Sep 2020 15:02:17 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Problems-indexing-into-zip-files/m-p/109281#M28496</guid>
      <dc:creator>tim9gray</dc:creator>
      <dc:date>2020-09-28T15:02:17Z</dc:date>
    </item>
    <item>
      <title>Re: Problems indexing into zip files</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Problems-indexing-into-zip-files/m-p/109282#M28497</link>
      <description>&lt;P&gt;Looks like I had the same problem you had. The last one should have been "pulse_.*\.csv", as you put in your last comment.&lt;/P&gt;</description>
      <pubDate>Tue, 22 Oct 2013 13:50:20 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Problems-indexing-into-zip-files/m-p/109282#M28497</guid>
      <dc:creator>tim9gray</dc:creator>
      <dc:date>2013-10-22T13:50:20Z</dc:date>
    </item>
    <item>
      <title>Re: Problems indexing into zip files</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Problems-indexing-into-zip-files/m-p/109283#M28498</link>
      <description>&lt;P&gt;I will post a new answer in the answer field so I can get the code bit to work.&lt;/P&gt;</description>
      <pubDate>Tue, 22 Oct 2013 14:43:08 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Problems-indexing-into-zip-files/m-p/109283#M28498</guid>
      <dc:creator>ShaneNewman</dc:creator>
      <dc:date>2013-10-22T14:43:08Z</dc:date>
    </item>
    <item>
      <title>Re: Problems indexing into zip files</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Problems-indexing-into-zip-files/m-p/109284#M28499</link>
      <description>&lt;P&gt;Copy and paste these into the identified conf files. Then restart each instance they are deployed to. Be sure to change your time format in the props.conf.&lt;/P&gt;

&lt;P&gt;inputs.conf&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;[monitor://C:\tpg\leamcsv\dualgamma_logs\...\]
sourcetype = DGC_TIME
index=main
host_segment = 4
crcSalt = &amp;lt;SOURCE&amp;gt;
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;transforms.conf&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;[extract_pulse_sourcetype]
SOURCE_KEY = MetaData:Source
REGEX = pulse_.*\.csv
DEST_KEY = MetaData:Sourcetype
FORMAT =  sourcetype::DGC_PULSE

[extract_flow_sourcetype]
SOURCE_KEY = MetaData:Source
REGEX = flow_.*\.csv
DEST_KEY = MetaData:Sourcetype
FORMAT =  sourcetype::DGC_FLOW
&lt;/CODE&gt;&lt;/PRE&gt;

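&lt;P&gt;For anyone wondering why the earlier pulse_*\.csv attempt never matched, here is a small Python check. It is a sketch under assumptions: Python's re module is used in place of Splunk's regex engine, the helper name matches is made up, and the pulse_ file name is a hypothetical variant of the time_ name posted in this thread.&lt;/P&gt;

```python
import re

# The time_ name is from the thread; the pulse_ variant is a hypothetical example.
NAMES = [
    "time_DGC_DG14_23_2013_10_09_09_07_37.csv",
    "pulse_DGC_DG14_23_2013_10_09_09_07_37.csv",
]

# Earlier attempt: "_*" means "zero or more underscores", not "any text".
GLOB_STYLE = re.compile(r"pulse_*\.csv")
# Corrected pattern: ".*" matches any run of characters after the prefix.
REGEX_STYLE = re.compile(r"pulse_.*\.csv")


def matches(pattern, name):
    """True when the pattern is found anywhere in the name."""
    return pattern.search(name) is not None


for name in NAMES:
    print(name, matches(GLOB_STYLE, name), matches(REGEX_STYLE, name))
```

&lt;P&gt;The glob-style pattern only matches names where the underscores are followed directly by .csv, so the real file names slip through and keep whatever sourcetype they already had.&lt;/P&gt;
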
&lt;P&gt;props.conf&lt;/P&gt;

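&lt;PRE&gt;&lt;CODE&gt;[DGC_TIME]
TRANSFORMS-transform_1 = extract_pulse_sourcetype
TRANSFORMS-transform_2 = extract_flow_sourcetype
# Placeholder: replace timeformat with the strptime format of your timestamps
TIME_FORMAT = timeformat
# Placeholder: set to either false or true, not both
SHOULD_LINEMERGE = false|true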
&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Tue, 22 Oct 2013 14:47:06 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Problems-indexing-into-zip-files/m-p/109284#M28499</guid>
      <dc:creator>ShaneNewman</dc:creator>
      <dc:date>2013-10-22T14:47:06Z</dc:date>
    </item>
    <item>
      <title>Re: Problems indexing into zip files</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Problems-indexing-into-zip-files/m-p/109285#M28500</link>
      <description>&lt;P&gt;Did that work for you?&lt;/P&gt;</description>
      <pubDate>Thu, 24 Oct 2013 03:25:24 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Problems-indexing-into-zip-files/m-p/109285#M28500</guid>
      <dc:creator>ShaneNewman</dc:creator>
      <dc:date>2013-10-24T03:25:24Z</dc:date>
    </item>
  </channel>
</rss>

