<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: How do you extract multi-value fields at ingestion &amp; deduplicate other multi-value fields? in Splunk Search</title>
    <link>https://community.splunk.com/t5/Splunk-Search/How-do-you-extract-multi-value-fields-at-ingestion-deduplicate/m-p/383926#M112127</link>
    <description>&lt;P&gt;I'm guessing I also need to use the DELIMS key in my transforms? &lt;/P&gt;</description>
    <pubDate>Tue, 13 Nov 2018 15:45:18 GMT</pubDate>
    <dc:creator>zanb</dc:creator>
    <dc:date>2018-11-13T15:45:18Z</dc:date>
    <item>
      <title>How do you extract multi-value fields at ingestion &amp; deduplicate other multi-value fields?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/How-do-you-extract-multi-value-fields-at-ingestion-deduplicate/m-p/383925#M112126</link>
      <description>&lt;P&gt;Hey everyone! &lt;/P&gt;

&lt;P&gt;I'm looking to extract a field that contains multiple MAC addresses as a multi-value field. I know I can create and manipulate multi-value fields at search time, but I'd like to separate some of this data during ingestion. I've read through the transforms.conf and props.conf manual pages, but the language on transforming data into a multi-value field isn't very clear to me.&lt;/P&gt;

&lt;P&gt;For instance, I'm having trouble understanding what the "::$1" characters denote when using the "FORMAT" key in my transforms.conf file. I know it has to do with RegEx capture groups, but I'm just having a hard time relating how data is extracted and stored via a conf file, as I'm more used to using RegEx on the command line with grep.&lt;/P&gt;

&lt;P&gt;Would someone kindly help me with an example of how I would extract MAC addresses as a multi-value field? It would really help bridge the gap in my understanding, as it's been hard to find concrete examples of how to do this online.&lt;/P&gt;

&lt;P&gt;In the CSV file, the "MAC_address" field has the data encapsulated in quotes like this: "ad:00:12:af:21:31, 00:fd:aa:23:d1:a5, {so on}". So I'm thinking my conf files need to look something like this:&lt;/P&gt;

&lt;P&gt;props.conf&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;#sourcetype
[mac_addy]
TRANSFORMS-mv_macaddress = mv_macaddress
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;transforms.conf&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;[mv_macaddress]
SOURCE_KEY=MAC_Address
REGEX=(([0-9A-F]{2}[:-]){5}([0-9A-F]{2})[,]+)
FORMAT=mv_macaddress::$1
MV_ADD=true
&lt;/CODE&gt;&lt;/PRE&gt;
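
&lt;P&gt;As a rough way to see what &lt;CODE&gt;$1&lt;/CODE&gt; and &lt;CODE&gt;MV_ADD&lt;/CODE&gt; are meant to do, the Python sketch below emulates the idea outside Splunk: the regex is applied repeatedly to the field value, and the text captured by the first group of each match becomes one value. This is only an analogy, not how Splunk is implemented. The sample value and the helper name &lt;CODE&gt;extract_values&lt;/CODE&gt; are made up for illustration, and Python's &lt;CODE&gt;re&lt;/CODE&gt; module is assumed to behave closely enough to Splunk's PCRE for this pattern.&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;# Illustrative only: emulates "REGEX + FORMAT=field::$1 + MV_ADD" in plain Python.
import re

# Hypothetical field value, modeled on the sample in the post (lowercase hex, comma + space).
SAMPLE = "ad:00:12:af:21:31, 00:fd:aa:23:d1:a5"

# The REGEX from the transforms.conf stanza above.
STRICT = r"(([0-9A-F]{2}[:-]){5}([0-9A-F]{2})[,]+)"
# A looser variant: accepts lowercase hex and does not require a trailing comma.
RELAXED = r"(([0-9A-Fa-f]{2}[:-]){5}([0-9A-Fa-f]{2}))"


def extract_values(pattern, text):
    """Return capture group 1 (the "$1" in FORMAT) of every match, in order."""
    return [m.group(1) for m in re.finditer(pattern, text)]


print(extract_values(STRICT, SAMPLE))   # no matches: lowercase hex is rejected
print(extract_values(RELAXED, SAMPLE))  # one entry per MAC address
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;In this sketch the stricter pattern finds nothing in the sample, because it only accepts uppercase hex digits and insists on a comma after every address, so the last address in a list could never match.&lt;/P&gt;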

&lt;P&gt;Would someone please confirm that I'm on the right path or point out any problems with my configuration?&lt;/P&gt;

&lt;P&gt;I also have fields in the CSV file that contain multiple iterations of the same string (a URL), and I would like to deduplicate them so that random entries don't take up two page lengths of a webpage. Almost all of my Google searches for "Splunk deduplicate" point me to results about the search-time command "dedup" or how to use crcSalt to prevent duplicate whole entries.&lt;/P&gt;

&lt;P&gt;Is there any way I can define a string for a field and have Splunk drop the repeats or collapse them into a single value, so I don't have dozens (literally dozens!) of iterations of "&lt;A href="https://icanhas.cheezburger.com/"&gt;https://icanhas.cheezburger.com/&lt;/A&gt; &lt;A href="https://icanhas.cheezburger.com/"&gt;https://icanhas.cheezburger.com/&lt;/A&gt; &lt;A href="https://icanhas.cheezburger.com/"&gt;https://icanhas.cheezburger.com/&lt;/A&gt; &lt;A href="https://icanhas.cheezburger.com/"&gt;https://icanhas.cheezburger.com/&lt;/A&gt;"?&lt;/P&gt;
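
&lt;P&gt;One possible approach, sketched here in Python under the assumption that the CSV can be pre-processed before Splunk reads it, is to collapse repeated values inside a field while keeping the first occurrence of each. The whitespace separator and the helper name &lt;CODE&gt;dedupe_field&lt;/CODE&gt; are assumptions for illustration, not Splunk settings. At search time, the eval function &lt;CODE&gt;mvdedup&lt;/CODE&gt; does a comparable job on multi-value fields.&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;# Illustrative pre-processing step; not a Splunk setting.

def dedupe_field(value, sep=None):
    """Drop repeated entries from a delimited field, keeping first-seen order."""
    # With sep=None, str.split splits on any run of whitespace.
    parts = value.split(sep)
    # dict keys are unique and preserve insertion order (Python 3.7+).
    unique = list(dict.fromkeys(p.strip() for p in parts if p.strip()))
    joiner = " " if sep is None else sep
    return joiner.join(unique)


urls = "https://icanhas.cheezburger.com/ https://icanhas.cheezburger.com/ https://icanhas.cheezburger.com/"
print(dedupe_field(urls))
&lt;/CODE&gt;&lt;/PRE&gt;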

&lt;P&gt;I appreciate your help. Thank you!&lt;/P&gt;</description>
      <pubDate>Tue, 13 Nov 2018 15:16:44 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/How-do-you-extract-multi-value-fields-at-ingestion-deduplicate/m-p/383925#M112126</guid>
      <dc:creator>zanb</dc:creator>
      <dc:date>2018-11-13T15:16:44Z</dc:date>
    </item>
    <item>
      <title>Re: How do you extract multi-value fields at ingestion &amp; deduplicate other multi-value fields?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/How-do-you-extract-multi-value-fields-at-ingestion-deduplicate/m-p/383926#M112127</link>
      <description>&lt;P&gt;I'm guessing I also need to use the DELIMS key in my transforms? &lt;/P&gt;</description>
      <pubDate>Tue, 13 Nov 2018 15:45:18 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/How-do-you-extract-multi-value-fields-at-ingestion-deduplicate/m-p/383926#M112127</guid>
      <dc:creator>zanb</dc:creator>
      <dc:date>2018-11-13T15:45:18Z</dc:date>
    </item>
    <item>
      <title>Re: How do you extract multi-value fields at ingestion &amp; deduplicate other multi-value fields?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/How-do-you-extract-multi-value-fields-at-ingestion-deduplicate/m-p/383927#M112128</link>
      <description>&lt;P&gt;You don't need &lt;CODE&gt;DELIMS&lt;/CODE&gt; when you have &lt;CODE&gt;REGEX&lt;/CODE&gt;.&lt;/P&gt;

&lt;P&gt;I think you need to change the &lt;CODE&gt;REGEX&lt;/CODE&gt; line to &lt;CODE&gt;REGEX=(([0-9A-Fa-f]{2}[:-]){5}([0-9A-Fa-f]{2})[,]?)&lt;/CODE&gt;.&lt;/P&gt;</description>
      <pubDate>Thu, 22 Nov 2018 14:15:02 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/How-do-you-extract-multi-value-fields-at-ingestion-deduplicate/m-p/383927#M112128</guid>
      <dc:creator>richgalloway</dc:creator>
      <dc:date>2018-11-22T14:15:02Z</dc:date>
    </item>
    <item>
      <title>Re: How do you extract multi-value fields at ingestion &amp; deduplicate other multi-value fields?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/How-do-you-extract-multi-value-fields-at-ingestion-deduplicate/m-p/383928#M112129</link>
      <description>&lt;P&gt;Missed it by &amp;gt;that&amp;lt; much; just change to this:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;REGEX=(([0-9A-Fa-f]{2}[:-]){5}([0-9A-Fa-f]{2}))
&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Thu, 22 Nov 2018 17:31:24 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/How-do-you-extract-multi-value-fields-at-ingestion-deduplicate/m-p/383928#M112129</guid>
      <dc:creator>woodcock</dc:creator>
      <dc:date>2018-11-22T17:31:24Z</dc:date>
    </item>
    <item>
      <title>Re: How do you extract multi-value fields at ingestion &amp; deduplicate other multi-value fields?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/How-do-you-extract-multi-value-fields-at-ingestion-deduplicate/m-p/383929#M112130</link>
      <description>&lt;P&gt;Did this work, @zanb?  Be sure to come back and comment or click &lt;CODE&gt;Accept&lt;/CODE&gt;.&lt;/P&gt;</description>
      <pubDate>Mon, 26 Nov 2018 21:25:57 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/How-do-you-extract-multi-value-fields-at-ingestion-deduplicate/m-p/383929#M112130</guid>
      <dc:creator>woodcock</dc:creator>
      <dc:date>2018-11-26T21:25:57Z</dc:date>
    </item>
    <item>
      <title>Re: How do you extract multi-value fields at ingestion &amp; deduplicate other multi-value fields?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/How-do-you-extract-multi-value-fields-at-ingestion-deduplicate/m-p/383930#M112131</link>
      <description>&lt;P&gt;Thanks for your help!&lt;/P&gt;</description>
      <pubDate>Mon, 26 Nov 2018 21:31:05 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/How-do-you-extract-multi-value-fields-at-ingestion-deduplicate/m-p/383930#M112131</guid>
      <dc:creator>zanb</dc:creator>
      <dc:date>2018-11-26T21:31:05Z</dc:date>
    </item>
  </channel>
</rss>

