<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Fields with and without spaces in Splunk Search</title>
    <link>https://community.splunk.com/t5/Splunk-Search/Fields-with-and-without-spaces/m-p/55022#M13426</link>
    <description>&lt;P&gt;On my portal I have Solaris web logs from which I must extract file names that were downloaded by the end user. These files come from many different sources, individuals and companies. Sometimes the filenames have spaces in them and sometimes they do not. They are intermixed in the logs. &lt;/P&gt;

&lt;P&gt;These filenames have spaces in them:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;/krbox/resources/SW CH 32 0 2012-11-16.pdf
/krbox/system/platform/krbox304b/Attachment 4 111026M123B.pdf
/krbox/system/platform/krbox304b/Attachment 1 111026M121C-704.pdf
/krbox/system/platform/krbox304b/Attachment 6 Functional Test Logs.zip
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;While these do not:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;/krbox/reference/publications/XARP/cnf/tb11-5895-1893-13.pdf
/krbox/system/platform/krbox304b/Attachment_7_D0206-001_KRBOX-304B_RS1Testing.pdf
/krbox/system/platform/krbox304b/Attachment_1_111026M121C-704.pdf
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;This is how they might look in the actual log:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;/krbox/reference/publications/XARP/cnf/tb11-5895-1893-13.pdf
/krbox/resources/SW CH 32 0 2012-11-16.pdf
/krbox/system/platform/krbox304b/Attachment_7_D0206-001_KRBOX-304B_RS1Testing.pdf
/krbox/system/platform/krbox304b/Attachment 1 111026M121C-704.pdf
/krbox/system/platform/krbox304b/Attachment 6 Functional Test Logs.zip
/krbox/system/platform/krbox304b/Attachment_1_111026M121C-704.pdf
/krbox/system/platform/krbox304b/Attachment 4 111026M123B.pdf
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;I normally extract the filename by using the slash “/” as the delimiter in multivalue fields. How can I extract the filenames regardless of whether they contain spaces?&lt;/P&gt;</description>
    <pubDate>Tue, 04 Jun 2013 18:31:01 GMT</pubDate>
    <dc:creator>kmattern</dc:creator>
    <dc:date>2013-06-04T18:31:01Z</dc:date>
    <item>
      <title>Fields with and without spaces</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Fields-with-and-without-spaces/m-p/55022#M13426</link>
      <description>&lt;P&gt;On my portal I have Solaris web logs from which I must extract file names that were downloaded by the end user. These files come from many different sources, individuals and companies. Sometimes the filenames have spaces in them and sometimes they do not. They are intermixed in the logs. &lt;/P&gt;

&lt;P&gt;These filenames have spaces in them:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;/krbox/resources/SW CH 32 0 2012-11-16.pdf
/krbox/system/platform/krbox304b/Attachment 4 111026M123B.pdf
/krbox/system/platform/krbox304b/Attachment 1 111026M121C-704.pdf
/krbox/system/platform/krbox304b/Attachment 6 Functional Test Logs.zip
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;While these do not:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;/krbox/reference/publications/XARP/cnf/tb11-5895-1893-13.pdf
/krbox/system/platform/krbox304b/Attachment_7_D0206-001_KRBOX-304B_RS1Testing.pdf
/krbox/system/platform/krbox304b/Attachment_1_111026M121C-704.pdf
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;This is how they might look in the actual log:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;/krbox/reference/publications/XARP/cnf/tb11-5895-1893-13.pdf
/krbox/resources/SW CH 32 0 2012-11-16.pdf
/krbox/system/platform/krbox304b/Attachment_7_D0206-001_KRBOX-304B_RS1Testing.pdf
/krbox/system/platform/krbox304b/Attachment 1 111026M121C-704.pdf
/krbox/system/platform/krbox304b/Attachment 6 Functional Test Logs.zip
/krbox/system/platform/krbox304b/Attachment_1_111026M121C-704.pdf
/krbox/system/platform/krbox304b/Attachment 4 111026M123B.pdf
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;I normally extract the filename by using the slash “/” as the delimiter in multivalue fields. How can I extract the filenames regardless of whether they contain spaces?&lt;/P&gt;</description>
      <pubDate>Tue, 04 Jun 2013 18:31:01 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Fields-with-and-without-spaces/m-p/55022#M13426</guid>
      <dc:creator>kmattern</dc:creator>
      <dc:date>2013-06-04T18:31:01Z</dc:date>
    </item>
    <item>
      <title>Re: Fields with and without spaces</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Fields-with-and-without-spaces/m-p/55023#M13427</link>
      <description>&lt;P&gt;Because the field mixes names with and without spaces, you will have to use a regex. You can use a transform, an inline rex, or an inline regex to extract your field.&lt;/P&gt;

&lt;P&gt;Below I anchor the match at the end of the line (the end of the field in this case), use a non-capturing group for the zip or pdf extensions, and match everything except a forward slash before the extension.&lt;/P&gt;

&lt;P&gt;Example using rex:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;…| rex field=&amp;lt;your_field&amp;gt; "(?&amp;lt;filename&amp;gt;[^/]+\.(?:zip|pdf)$)" | table filename
&lt;/CODE&gt;&lt;/PRE&gt;
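
&lt;P&gt;To sanity-check this kind of pattern outside Splunk, here is a minimal Python sketch run against three of the sample paths from the question. It assumes Python's re module treats this simple pattern the same way Splunk's regex engine does, uses a plain capture group in place of the named filename group, and escapes the dot (\.) so that it matches only a literal period. The helper name extract_filename is made up for illustration.&lt;/P&gt;

```python
import re

# Last path segment ending in .zip or .pdf. A plain (unnamed) capture group
# stands in for Splunk's named group; this is an illustrative assumption.
FILENAME_RE = re.compile(r"([^/]+\.(?:zip|pdf))$")

# Sample paths taken from the question.
paths = [
    "/krbox/resources/SW CH 32 0 2012-11-16.pdf",
    "/krbox/system/platform/krbox304b/Attachment_1_111026M121C-704.pdf",
    "/krbox/system/platform/krbox304b/Attachment 6 Functional Test Logs.zip",
]


def extract_filename(path):
    """Return the last path segment if it ends in .zip or .pdf, else None."""
    match = FILENAME_RE.search(path)
    return match.group(1) if match else None


filenames = [extract_filename(p) for p in paths]
for name in filenames:
    print(name)
```

&lt;P&gt;Because [^/]+ accepts any character except the slash, names with and without spaces are handled by the same pattern.&lt;/P&gt;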

&lt;P&gt;Additional Reading:&lt;/P&gt;

&lt;UL&gt;
&lt;LI&gt;&lt;A href="http://docs.splunk.com/Documentation/Splunk/5.0.3/SearchReference/Rex"&gt;Rex&lt;/A&gt;&lt;/LI&gt;
&lt;LI&gt;&lt;A href="http://www.regular-expressions.info/refadv.html"&gt;AdvancedRegex&lt;/A&gt;&lt;/LI&gt;
&lt;/UL&gt;

&lt;P&gt;Hope this helps or gets you started.&lt;/P&gt;

&lt;P&gt;Cheers,&lt;/P&gt;</description>
      <pubDate>Tue, 04 Jun 2013 20:41:10 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Fields-with-and-without-spaces/m-p/55023#M13427</guid>
      <dc:creator>bmacias84</dc:creator>
      <dc:date>2013-06-04T20:41:10Z</dc:date>
    </item>
    <item>
      <title>Re: Fields with and without spaces</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Fields-with-and-without-spaces/m-p/55024#M13428</link>
      <description>&lt;P&gt;This extraction works at search time. In this case we capture everything between the last slash (/) delimiter and the end of the string ($).&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;sourcetype="answers-1370378440" | rex field=_raw "/(?&amp;lt; filename&amp;gt;[a-zA-Z0-9\s_\.-]+)$"
&lt;/CODE&gt;&lt;/PRE&gt;
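
&lt;P&gt;As a rough check of how the character class behaves, the following Python sketch applies the same class to one path from the question and to one made-up path. Python's re module and the unnamed capture group are stand-ins for Splunk's rex, and "Report (final).pdf" is a hypothetical filename that does not appear in the logs.&lt;/P&gt;

```python
import re

# Same character class as the rex pattern: letters, digits, whitespace,
# underscore, dot and hyphen after the final slash. The unnamed group is an
# illustrative stand-in for the named filename group.
CLASS_RE = re.compile(r"/([a-zA-Z0-9\s_\.-]+)$")


def extract_filename(path):
    """Return the text after the last slash if every character is in the class."""
    match = CLASS_RE.search(path)
    return match.group(1) if match else None


# Path from the question: spaces are covered by \s, so this one matches.
print(extract_filename("/krbox/resources/SW CH 32 0 2012-11-16.pdf"))

# Hypothetical path: parentheses fall outside the class, so nothing matches.
print(extract_filename("/krbox/resources/Report (final).pdf"))
```

&lt;P&gt;If filenames can contain characters outside the class, such as parentheses, the class would need to be widened, or [^/]+ could be used in its place.&lt;/P&gt;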

&lt;P&gt;If the sample data is only part of the event and the whole file path is already in a field called &lt;CODE&gt;filename&lt;/CODE&gt;, then you would point rex at that field like this:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;sourcetype="answers-1370378440" | rex field=filename "/(?&amp;lt; filename&amp;gt;[a-zA-Z0-9\s_\.-]+)$"
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;HR /&gt;

&lt;P&gt;And, of course, you can automate this with an entry in props.conf:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;[answers-1370378440]
EXTRACT-filename = /(?&amp;lt; filename&amp;gt;[a-zA-Z0-9\s_\.-]+)$
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;HR /&gt;

&lt;P&gt;PS: The space in &amp;lt; filename&amp;gt; is an artifact of the forum markup; remove it so that the group reads &amp;lt;filename&amp;gt; before you run the search.&lt;/P&gt;</description>
      <pubDate>Tue, 04 Jun 2013 20:51:55 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Fields-with-and-without-spaces/m-p/55024#M13428</guid>
      <dc:creator>Gilberto_Castil</dc:creator>
      <dc:date>2013-06-04T20:51:55Z</dc:date>
    </item>
  </channel>
</rss>

