<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: How to extract a multivalue index-time field from another multivalue index-time field? in Splunk Search</title>
    <link>https://community.splunk.com/t5/Splunk-Search/How-to-extract-a-multivalue-index-time-field-from-another/m-p/278464#M84086</link>
    <description>&lt;P&gt;I ended up extracting the multivalue subKey field at search time using &lt;CODE&gt;props.conf&lt;/CODE&gt; and &lt;CODE&gt;transforms.conf&lt;/CODE&gt;, saving it into a summary index, and tokenizing it in &lt;CODE&gt;fields.conf&lt;/CODE&gt; to preserve its multivalue nature.&lt;/P&gt;

&lt;P&gt;The extraction is described in &lt;A href="https://answers.splunk.com/answers/481714/extracting-multiple-values-from-a-multivalue-field-1.html"&gt;this follow-up question&lt;/A&gt;.&lt;/P&gt;

&lt;P&gt;The field needs to be tokenized in the summary index for the following reason: multivalue fields arrive in a summary index as a single value, apparently created by &lt;CODE&gt;mvjoin(source,'\n')&lt;/CODE&gt;. To search on individual values, I need that TOKENIZER in &lt;CODE&gt;fields.conf&lt;/CODE&gt;.&lt;/P&gt;</description>
    <pubDate>Thu, 15 Dec 2016 19:08:34 GMT</pubDate>
    <dc:creator>arkadyz1</dc:creator>
    <dc:date>2016-12-15T19:08:34Z</dc:date>
    <item>
      <title>How to extract a multivalue index-time field from another multivalue index-time field?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/How-to-extract-a-multivalue-index-time-field-from-another/m-p/278458#M84080</link>
      <description>&lt;P&gt;I'm trying to extract two index-time fields from the input stream, and both should be multivalued. The first one extracts correctly and is multivalued, just as I wanted. However, the second field, which is to be extracted from the first one (think of a short code that is a suffix of the full code), is extracted from only the first value of the first field.&lt;/P&gt;

&lt;P&gt;Here is a quick example I've created:&lt;BR /&gt;
&lt;CODE&gt;transforms.conf&lt;/CODE&gt;&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;[mainKey]
REGEX = record(?:\.\d+)?\.code="(?P&amp;lt;mainKey&amp;gt;[^"]+)"
#FORMAT = mainKey::$1
WRITE_META = true
REPEAT_MATCH = true
LOOKAHEAD = 1048576
MV_ADD = 1

[subKey]
REGEX = (?m-s)(?&amp;lt;=^|\s)[a-zA-Z]*(?P&amp;lt;subKey&amp;gt;\d+)(?=\s|$)
#FORMAT = subKey::$1
SOURCE_KEY = field:mainKey
WRITE_META = true
REPEAT_MATCH = true
MV_ADD = 1
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;&lt;CODE&gt;props.conf&lt;/CODE&gt;&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;[testIndexFields]
DATETIME_CONFIG =
NO_BINARY_CHECK = true
category = Custom
description = Testing multivalue index-time fields
pulldown_type = true

TRANSFORMS-mainKey = mainKey
TRANSFORMS-subKey = subKey
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;Where testIndexFields is a sourcetype I'm importing this data to.&lt;BR /&gt;
I prepared the following file as a data sample:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;2016-12-13 17:07:20, record.1.code="MAIN132" record.2.code="PRE9087", record.3.code="1405"
2016-12-13 17:07:40, record.code="SingleCode0123456"
2016-12-13 17:08:00, record.1.code="123BadOne", record.2.code="GoodOne1", record.3.code="NoSubKey"
2016-12-13 17:08:20, record.1.code="!alsobad123",record.2.code="TryThis1508"
2016-12-13 17:07:20, record.code="Unnumbered0001", record.code="Unnumbered0002", record.code="Unnumbered0003"
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;I'm expecting the data to be extracted like this:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;mainKey=MAIN132 mainKey=PRE9087 mainKey=1405 subKey=132 subKey=9087 subKey=1405
mainKey=SingleCode0123456 subKey=0123456
mainKey=123BadOne mainKey=GoodOne1 mainKey=NoSubKey subKey=1
mainKey=!alsobad123 mainKey=TryThis1508 subKey=1508
mainKey=Unnumbered0001 mainKey=Unnumbered0002 mainKey=Unnumbered0003 subKey=0001 subKey=0002 subKey=0003
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;However, I'm getting this:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;mainKey = MAIN132  mainKey = PRE9087  mainKey = 1405 subKey = 132
mainKey = SingleCode0123456 subKey = 0123456
mainKey = 123BadOne  mainKey = GoodOne1  mainKey = NoSubKey
mainKey = !alsobad123  mainKey = TryThis1508
mainKey = Unnumbered0001  mainKey = Unnumbered0002  mainKey = Unnumbered0003 subKey = 0001
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;As you can see, the subKey is extracted from the first occurrence of mainKey only. Is there a way to change this behavior?&lt;/P&gt;</description>
      <pubDate>Tue, 13 Dec 2016 15:53:12 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/How-to-extract-a-multivalue-index-time-field-from-another/m-p/278458#M84080</guid>
      <dc:creator>arkadyz1</dc:creator>
      <dc:date>2016-12-13T15:53:12Z</dc:date>
    </item>
    <item>
      <title>Re: How to extract a multivalue index-time field from another multivalue index-time field?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/How-to-extract-a-multivalue-index-time-field-from-another/m-p/278459#M84081</link>
      <description>&lt;P&gt;If your &lt;CODE&gt;mainKey&lt;/CODE&gt; regex is working fine and &lt;CODE&gt;subKey&lt;/CODE&gt; is then derived from &lt;CODE&gt;mainKey&lt;/CODE&gt;, can you try a regex for &lt;CODE&gt;subKey&lt;/CODE&gt; similar to the one you used for &lt;CODE&gt;mainKey&lt;/CODE&gt; and see if it works:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;REGEX = record(?:\.\d+)?\.code="(?&amp;lt;mainKeyPrefix&amp;gt;[^\d]+)(?&amp;lt;subKey&amp;gt;[\d]+)"
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;where the &lt;CODE&gt;mainKeyPrefix&lt;/CODE&gt; and &lt;CODE&gt;subKey&lt;/CODE&gt; fields will be created. Alternatively, if search-time extraction is an option for you, you can extract it at search time using a similar regex, something like:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;your query to return mainKey
| rex field=mainKey "(?&amp;lt;mainKeyPrefix&amp;gt;[^\d]+)(?&amp;lt;subKey&amp;gt;[\d]+)"
| table mainKey, subKey
&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Tue, 13 Dec 2016 16:10:08 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/How-to-extract-a-multivalue-index-time-field-from-another/m-p/278459#M84081</guid>
      <dc:creator>gokadroid</dc:creator>
      <dc:date>2016-12-13T16:10:08Z</dc:date>
    </item>
    <item>
      <title>Re: How to extract a multivalue index-time field from another multivalue index-time field?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/How-to-extract-a-multivalue-index-time-field-from-another/m-p/278460#M84082</link>
      <description>&lt;P&gt;Yes, your first suggestion was my next step. I don't like it much, because in practice I'm extracting mainKey from differently formatted records, so I have 4 or 5 transforms, all extracting mainKey. I'd have to replicate them, multiply them (because subKey is extracted differently from different mainKey formats) and edit them to extract the subKey. Still, if extraction from a multivalue field doesn't work, I'll have to create that whole multitude of subKey extractions.&lt;/P&gt;

&lt;P&gt;Extracting subKey at search time doesn't really help because I want to search on subKey=... and it's not an indexed token (one of the points justifying index-time field creation). One of the possibilities that we are still looking at is to put everything into a summary index, extracting the subKey using rex in the summarizing search and saving it along with the mainKey.&lt;/P&gt;</description>
      <pubDate>Tue, 13 Dec 2016 16:46:29 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/How-to-extract-a-multivalue-index-time-field-from-another/m-p/278460#M84082</guid>
      <dc:creator>arkadyz1</dc:creator>
      <dc:date>2016-12-13T16:46:29Z</dc:date>
    </item>
    <item>
      <title>Re: How to extract a multivalue index-time field from another multivalue index-time field?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/How-to-extract-a-multivalue-index-time-field-from-another/m-p/278461#M84083</link>
      <description>&lt;P&gt;By the way, I did find that rex behaves quite differently from a regular expression in &lt;CODE&gt;transforms.conf&lt;/CODE&gt;. At search time, rex works as expected and returns multiple values when scanning a multivalue field, even with the simplest pattern, &lt;CODE&gt;^[a-zA-Z]*(?P&amp;lt;subKey&amp;gt;\d+)$&lt;/CODE&gt;, and the crudest 'beginning of line'/'end of line' anchors. Is this a bug or a feature?&lt;BR /&gt;
I'll accept your answer since it seems that not much more can be done during index time, and there are two good workarounds there.&lt;/P&gt;</description>
      <pubDate>Tue, 13 Dec 2016 17:12:47 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/How-to-extract-a-multivalue-index-time-field-from-another/m-p/278461#M84083</guid>
      <dc:creator>arkadyz1</dc:creator>
      <dc:date>2016-12-13T17:12:47Z</dc:date>
    </item>
    <item>
      <title>Re: How to extract a multivalue index-time field from another multivalue index-time field?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/How-to-extract-a-multivalue-index-time-field-from-another/m-p/278462#M84084</link>
      <description>&lt;P&gt;I am happy that it worked out for you! Happy Splunking!&lt;/P&gt;</description>
      <pubDate>Tue, 13 Dec 2016 17:16:12 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/How-to-extract-a-multivalue-index-time-field-from-another/m-p/278462#M84084</guid>
      <dc:creator>gokadroid</dc:creator>
      <dc:date>2016-12-13T17:16:12Z</dc:date>
    </item>
    <item>
      <title>Re: How to extract a multivalue index-time field from another multivalue index-time field?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/How-to-extract-a-multivalue-index-time-field-from-another/m-p/278463#M84085</link>
      <description>&lt;P&gt;In the end I decided to extract that field (subKey) at search time and save it into a summary index. The way I did the extraction is described in this &lt;A href="https://answers.splunk.com/answers/481714/extracting-multiple-values-from-a-multivalue-field-1.html"&gt;follow-up question&lt;/A&gt;.&lt;/P&gt;
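
&lt;P&gt;For anyone following along, a rough sketch of what a summarizing search of this kind could look like is shown here. It is an illustration only, not the extraction described in the linked question. It reuses the &lt;CODE&gt;rex&lt;/CODE&gt; pattern discussed earlier in this thread; the first line stands in for whatever search returns mainKey, and &lt;CODE&gt;your_summary_index&lt;/CODE&gt; is a placeholder name:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;your query to return mainKey
| rex field=mainKey "^[a-zA-Z]*(?&amp;lt;subKey&amp;gt;\d+)$"
| table _time, mainKey, subKey
| collect index=your_summary_index
&lt;/CODE&gt;&lt;/PRE&gt;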

&lt;P&gt;I'm leaving it here because it might be helpful to someone reading it some time later.&lt;/P&gt;</description>
      <pubDate>Thu, 15 Dec 2016 19:02:47 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/How-to-extract-a-multivalue-index-time-field-from-another/m-p/278463#M84085</guid>
      <dc:creator>arkadyz1</dc:creator>
      <dc:date>2016-12-15T19:02:47Z</dc:date>
    </item>
    <item>
      <title>Re: How to extract a multivalue index-time field from another multivalue index-time field?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/How-to-extract-a-multivalue-index-time-field-from-another/m-p/278464#M84086</link>
      <description>&lt;P&gt;I ended up extracting the multivalue subKey field at search time using &lt;CODE&gt;props.conf&lt;/CODE&gt; and &lt;CODE&gt;transforms.conf&lt;/CODE&gt;, saving it into a summary index, and tokenizing it in &lt;CODE&gt;fields.conf&lt;/CODE&gt; to preserve its multivalue nature.&lt;/P&gt;
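
&lt;P&gt;A minimal sketch of such a &lt;CODE&gt;fields.conf&lt;/CODE&gt; stanza, assuming the values are joined by newlines. The exact regex here is an assumption and may need adjusting for other data:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;[subKey]
# Assumed pattern: every run of non-newline characters becomes one value.
# The first capture group of each match is taken as a value of the field.
TOKENIZER = ([^\r\n]+)
&lt;/CODE&gt;&lt;/PRE&gt;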

&lt;P&gt;The extraction is described in &lt;A href="https://answers.splunk.com/answers/481714/extracting-multiple-values-from-a-multivalue-field-1.html"&gt;this follow-up question&lt;/A&gt;.&lt;/P&gt;

&lt;P&gt;The field needs to be tokenized in the summary index for the following reason: multivalue fields arrive in a summary index as a single value, apparently created by &lt;CODE&gt;mvjoin(source,'\n')&lt;/CODE&gt;. To search on individual values, I need that TOKENIZER in &lt;CODE&gt;fields.conf&lt;/CODE&gt;.&lt;/P&gt;</description>
      <pubDate>Thu, 15 Dec 2016 19:08:34 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/How-to-extract-a-multivalue-index-time-field-from-another/m-p/278464#M84086</guid>
      <dc:creator>arkadyz1</dc:creator>
      <dc:date>2016-12-15T19:08:34Z</dc:date>
    </item>
  </channel>
</rss>

