<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: how to compare characters in two fields and return number of matches? in Splunk Search</title>
    <link>https://community.splunk.com/t5/Splunk-Search/how-to-compare-characters-in-two-fields-and-return-number-of/m-p/415064#M119507</link>
    <description>&lt;P&gt;See my answers here for background:&lt;BR /&gt;
&lt;A href="https://answers.splunk.com/answers/567851/how-can-i-compare-mvfields-and-get-a-diff.html"&gt;https://answers.splunk.com/answers/567851/how-can-i-compare-mvfields-and-get-a-diff.html&lt;/A&gt;&lt;BR /&gt;
&lt;A href="https://answers.splunk.com/answers/734599/how-to-compare-the-same-search-from-the-previous-d.html"&gt;https://answers.splunk.com/answers/734599/how-to-compare-the-same-search-from-the-previous-d.html&lt;/A&gt;&lt;/P&gt;

&lt;P&gt;If your stuff is already in a mv-field then you can skip this part, but if not, do this:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;index=YouShouldAlwaysSpecifyAnIndex AND sourcetype=AndSourcetypeToo
|  stats values(re_split) AS re_split values(se_split) AS se_split BY whatever
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;For run anywhere, try this:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;| makeresults 
| eval re_split="a g h e c t o p", se_split="g h p q a z t w" 
| makemv re_split 
| makemv se_split 
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;Then you can EITHER do this:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;| streamstats count AS _serial 
| multireport 
    [| mvexpand se_split 
    | where re_split!=se_split 
    | rename se_split AS se_only] 
    [| mvexpand re_split 
    | where re_split!=se_split 
    | rename re_split AS re_only] 
| stats values(*) AS * BY _serial
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;OR this:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;| nomv re_split
| nomv se_split
| rex field=re_split mode=sed "s/[\r\n\s]+/;/g"
| rex field=se_split mode=sed "s/[\r\n\s]+/;/g"
| eval setdiff = split(replace(replace(replace(replace(mvjoin(mvsort(mvappend(split(replace(re_split, "(;|$)", "#1;"), ";"), split(replace(se_split, "(;|$)", "#0;"), ";"))), ";"), ";(\w+)#0\;\1#1", ""), ";\w+#1", ""), "#0", ""), ";(?!\w)|^;", ""), ";")
&lt;/CODE&gt;&lt;/PRE&gt;</description>
    <pubDate>Fri, 12 Apr 2019 21:22:47 GMT</pubDate>
    <dc:creator>woodcock</dc:creator>
    <dc:date>2019-04-12T21:22:47Z</dc:date>
    <item>
      <title>how to compare characters in two fields and return number of matches?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/how-to-compare-characters-in-two-fields-and-return-number-of/m-p/415058#M119501</link>
      <description>&lt;P&gt;I have two fields &lt;CODE&gt;se_split&lt;/CODE&gt; and &lt;CODE&gt;re_split&lt;/CODE&gt; which are lined up like so&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;re_split          se_split
a                     g
g                     h
h                     p
e                     q
c                     a
t                     z
o                     t
p                     w
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;Is there a way for me to compare the two fields character by character and add up how many characters match?&lt;BR /&gt;
much thanks.&lt;/P&gt;</description>
      <pubDate>Fri, 12 Apr 2019 19:58:25 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/how-to-compare-characters-in-two-fields-and-return-number-of/m-p/415058#M119501</guid>
      <dc:creator>brienhawker</dc:creator>
      <dc:date>2019-04-12T19:58:25Z</dc:date>
    </item>
    <item>
      <title>Re: how to compare characters in two fields and return number of matches?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/how-to-compare-characters-in-two-fields-and-return-number-of/m-p/415059#M119502</link>
      <description>&lt;P&gt;Do all those characters appear in separate rows, or are they in one multivalued field in a single row?&lt;/P&gt;</description>
      <pubDate>Fri, 12 Apr 2019 20:07:42 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/how-to-compare-characters-in-two-fields-and-return-number-of/m-p/415059#M119502</guid>
      <dc:creator>somesoni2</dc:creator>
      <dc:date>2019-04-12T20:07:42Z</dc:date>
    </item>
    <item>
      <title>Re: how to compare characters in two fields and return number of matches?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/how-to-compare-characters-in-two-fields-and-return-number-of/m-p/415060#M119503</link>
      <description>&lt;P&gt;All characters are in a single field. I've taken usernames and strung them out, and I am now trying to compare the two so that I can see if they match to a certain degree.&lt;/P&gt;</description>
      <pubDate>Fri, 12 Apr 2019 20:10:56 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/how-to-compare-characters-in-two-fields-and-return-number-of/m-p/415060#M119503</guid>
      <dc:creator>brienhawker</dc:creator>
      <dc:date>2019-04-12T20:10:56Z</dc:date>
    </item>
    <item>
      <title>Re: how to compare characters in two fields and return number of matches?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/how-to-compare-characters-in-two-fields-and-return-number-of/m-p/415061#M119504</link>
      <description>&lt;P&gt;What would be the expected output based on your example? Do they have to match in the order they appear in the field, or is it just a character match? (E.g., in your sample, if you compare by order, nothing matches, but if you just check for existence of the character, you get &lt;CODE&gt;aghtp&lt;/CODE&gt;.)&lt;/P&gt;</description>
      <pubDate>Fri, 12 Apr 2019 20:49:19 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/how-to-compare-characters-in-two-fields-and-return-number-of/m-p/415061#M119504</guid>
      <dc:creator>somesoni2</dc:creator>
      <dc:date>2019-04-12T20:49:19Z</dc:date>
    </item>
    <item>
      <title>Re: how to compare characters in two fields and return number of matches?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/how-to-compare-characters-in-two-fields-and-return-number-of/m-p/415062#M119505</link>
      <description>&lt;P&gt;I just want to check whether each character of re_split is in se_split. If it returns the letters that are in that field, that is fine, because I can count how many letters there are in comparison to se_split and come up with a final number that way. In the end I just want a number that tells me how many matching characters there are; then I'm going to subtract the number of matching characters from the number of characters in se_split and return a percentage value as my final number. So yes, I just want to check existence.&lt;/P&gt;</description>
      <pubDate>Wed, 30 Sep 2020 00:09:28 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/how-to-compare-characters-in-two-fields-and-return-number-of/m-p/415062#M119505</guid>
      <dc:creator>brienhawker</dc:creator>
      <dc:date>2020-09-30T00:09:28Z</dc:date>
    </item>
    <item>
      <title>Re: how to compare characters in two fields and return number of matches?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/how-to-compare-characters-in-two-fields-and-return-number-of/m-p/415063#M119506</link>
      <description>&lt;P&gt;It'll be easier to give a solution if you can provide your current query. You basically have to create a new field that is a copy of re_split, expand it (using mvexpand), check whether each character is present in se_split (using mvfind), then run some stats to count and combine the rows back to the original count.&lt;/P&gt;</description>
      <pubDate>Wed, 30 Sep 2020 00:05:21 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/how-to-compare-characters-in-two-fields-and-return-number-of/m-p/415063#M119506</guid>
      <dc:creator>somesoni2</dc:creator>
      <dc:date>2020-09-30T00:05:21Z</dc:date>
    </item>
    <item>
      <title>Re: how to compare characters in two fields and return number of matches?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/how-to-compare-characters-in-two-fields-and-return-number-of/m-p/415064#M119507</link>
      <description>&lt;P&gt;See my answers here for background:&lt;BR /&gt;
&lt;A href="https://answers.splunk.com/answers/567851/how-can-i-compare-mvfields-and-get-a-diff.html"&gt;https://answers.splunk.com/answers/567851/how-can-i-compare-mvfields-and-get-a-diff.html&lt;/A&gt;&lt;BR /&gt;
&lt;A href="https://answers.splunk.com/answers/734599/how-to-compare-the-same-search-from-the-previous-d.html"&gt;https://answers.splunk.com/answers/734599/how-to-compare-the-same-search-from-the-previous-d.html&lt;/A&gt;&lt;/P&gt;

&lt;P&gt;If your stuff is already in a mv-field then you can skip this part, but if not, do this:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;index=YouShouldAlwaysSpecifyAnIndex AND sourcetype=AndSourcetypeToo
|  stats values(re_split) AS re_split values(se_split) AS se_split BY whatever
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;For run anywhere, try this:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;| makeresults 
| eval re_split="a g h e c t o p", se_split="g h p q a z t w" 
| makemv re_split 
| makemv se_split 
&lt;/CODE&gt;&lt;/PRE&gt;
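
&lt;P&gt;As a rough illustration of the number the question is after, here is a minimal sketch in plain Python (not SPL) run against the same sample values. It assumes "match" means that a value of &lt;CODE&gt;re_split&lt;/CODE&gt; exists anywhere in &lt;CODE&gt;se_split&lt;/CODE&gt;, regardless of position; the helper name &lt;CODE&gt;count_matches&lt;/CODE&gt; and the percentage formula are assumptions for illustration, not part of any Splunk API:&lt;/P&gt;

```python
# Illustrative only: plain Python, not SPL. The function name is hypothetical.
def count_matches(re_split, se_split):
    """Count values of re_split that also occur anywhere in se_split."""
    se_values = set(se_split)
    return sum(1 for value in re_split if value in se_values)


# Same sample values as the run-anywhere search above.
re_split = "a g h e c t o p".split()
se_split = "g h p q a z t w".split()

matches = count_matches(re_split, se_split)
# Assumed reading of the final number: matches as a share of se_split.
percent = 100.0 * matches / len(se_split)
print(matches, percent)
```

&lt;P&gt;On the sample data this counts 5 matching values (a, g, h, t, p) out of 8, which is 62.5 percent.&lt;/P&gt;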

&lt;P&gt;Then you can EITHER do this:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;| streamstats count AS _serial 
| multireport 
    [| mvexpand se_split 
    | where re_split!=se_split 
    | rename se_split AS se_only] 
    [| mvexpand re_split 
    | where re_split!=se_split 
    | rename re_split AS re_only] 
| stats values(*) AS * BY _serial
&lt;/CODE&gt;&lt;/PRE&gt;
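
&lt;P&gt;For orientation, the result a two-sided diff is aiming for can be sketched in plain Python (not SPL), assuming the goal is the set of values that appear in only one of the two fields. The names &lt;CODE&gt;re_only&lt;/CODE&gt; and &lt;CODE&gt;se_only&lt;/CODE&gt; mirror the renames above; this shows the intended set logic only and is not a verified trace of what the SPL returns:&lt;/P&gt;

```python
# Illustrative only: plain Python, not SPL. Names mirror the fields above.
re_split = "a g h e c t o p".split()
se_split = "g h p q a z t w".split()

# Values present on one side only, sorted for stable output.
re_only = sorted(set(re_split) - set(se_split))
se_only = sorted(set(se_split) - set(re_split))

print(re_only)
print(se_only)
```

&lt;P&gt;With the sample values, &lt;CODE&gt;re_only&lt;/CODE&gt; comes out as c, e, o and &lt;CODE&gt;se_only&lt;/CODE&gt; as q, w, z; everything else is common to both fields.&lt;/P&gt;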

&lt;P&gt;OR this:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;| nomv re_split
| nomv se_split
| rex field=re_split mode=sed "s/[\r\n\s]+/;/g"
| rex field=se_split mode=sed "s/[\r\n\s]+/;/g"
| eval setdiff = split(replace(replace(replace(replace(mvjoin(mvsort(mvappend(split(replace(re_split, "(;|$)", "#1;"), ";"), split(replace(se_split, "(;|$)", "#0;"), ";"))), ";"), ";(\w+)#0\;\1#1", ""), ";\w+#1", ""), "#0", ""), ";(?!\w)|^;", ""), ";")
&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Fri, 12 Apr 2019 21:22:47 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/how-to-compare-characters-in-two-fields-and-return-number-of/m-p/415064#M119507</guid>
      <dc:creator>woodcock</dc:creator>
      <dc:date>2019-04-12T21:22:47Z</dc:date>
    </item>
    <item>
      <title>Re: how to compare characters in two fields and return number of matches?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/how-to-compare-characters-in-two-fields-and-return-number-of/m-p/415065#M119508</link>
      <description>&lt;P&gt;Here is my query. I added a comment at the place where I'm trying to run that &lt;CODE&gt;mvfind&lt;/CODE&gt; command. I have two emails; I cut everything after &lt;CODE&gt;@&lt;/CODE&gt;, giving me &lt;CODE&gt;usernames&lt;/CODE&gt; of sorts, then I split them, and now I am trying to search by &lt;CODE&gt;sender&lt;/CODE&gt; to see if the &lt;CODE&gt;sender&lt;/CODE&gt; is trying to send something to an email address that closely resembles his own.&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;index=msexchange size&amp;gt;2000000 directionality="Originating" AND action="delivered" AND recipient!="*iccu.com" AND NOT
(message_subject="RE*" OR message_subject="FW*" OR message_subject="*EXTERNAL*") AND
(recipient="*gmail.com" OR recipient="*.edu" OR recipient="*hotmail.com" OR recipient="*yahoo.com" OR recipient="*msn.com" OR 
recipient="*outlook.com" OR recipient="*aol.com" OR recipient="*zoho.com" OR recipient="*icloud.com" OR recipient="*inbox.com" OR recipient="*mail.com" OR recipient="*yandex.com" OR recipient="*protonmail.com")
|rex field=recipient "^(?&amp;lt;re_name&amp;gt;.*)@(?&amp;lt;trash1&amp;gt;.*)$"      `comment("START OF FIRST 3 COMPARISON")`
|rex field=sender "^(?&amp;lt;se_name&amp;gt;.*)@(?&amp;lt;trash1&amp;gt;.*)$"
|rex field=se_name "^(?P&amp;lt;se_first3&amp;gt;...)"
|rex field=re_name "^(?P&amp;lt;re_first3&amp;gt;...)"
|eval first_value=if(like(se_first3, re_first3), 1, 0)   `comment("START OF LAST 3 COMPARISON")`
|rex field=re_name "(?&amp;lt;re_last3&amp;gt;\w{3})$" 
|rex field=se_name "(?&amp;lt;se_last3&amp;gt;\w{3})$" 
|eval last_value=if(like(se_last3, re_last3), 1, 0)
|eval subject_value=if(isnull(message_subject), 1, 0)

`comment("THIS IS WHERE IM PULLING OUT CHARACTERS THAT I DONT WANT")`
|replace "*.*" WITH "" in re_name
|replace "*_*" WITH "" in re_name

|eval se_cut=replace(se_name,"^.","")
|eval re_cut=replace(re_name,"^.","")

|eval se_mvmake=se_name
|eval re_mvmake=re_name

| rex field=re_mvmake "(?&amp;lt;re_my1&amp;gt;.*?)\d\d\d+(?&amp;lt;re_my2&amp;gt;.*)" | eval re_mvmake=if(isnull(re_my1), re_mvmake, re_my1+re_my2)
| rex field=se_mvmake "(?&amp;lt;se_my1&amp;gt;.*?)\d\d\d+(?&amp;lt;se_my2&amp;gt;.*)" | eval se_mvmake=if(isnull(se_my1), se_mvmake, se_my1+se_my2)
`comment("IGNORE THIS")`
|makemv re_mvmake delim=‘“\”’
|makemv se_mvmake delim=‘“\”’
|mvexpand re_mvmake
|mvexpand se_mvmake
|rex field=re_name max_match=10  "\"(?&amp;lt;re_mv&amp;gt;.*?)\""
|rex field=se_name max_match=10  "\"(?&amp;lt;se_mv&amp;gt;.*?)\""

`comment("THIS IS WHAT I AM WORKING ON NOW BASED ON YOUR COMMENT")`
|eval re_split=split(re_name,"")
|eval se_split=split(se_name,"")
|mvexpand se_split
|mvexpand re_split
|eval split_total=mvfind(se_split, re_split)

|eval threat_num=4
|eval is_threat=if(like(se_name, re_name), 2, 0)
|eval appraise=if(like(message_subject, "%ppraisal%"), 2, 0)     `comment("START OF MESSAGE SUBJECT COMPARISON")`
|eval payment=if(like(message_subject, "%ayment%"), 2, 0)
|eval loan=if(like(message_subject, "%oan%"), 2, 0)
|eval estimate=if(like(message_subject, "%stimate%"), 2, 0)
|eval size_count=if(size&amp;gt;5000000, 1, 0)
|eval diff=(threat_num - appraise - payment - loan - estimate + size_count + first_value + last_value + is_threat + subject_value)
|eval Threat_Level=case(diff=1 OR diff=2, "Low", diff=3 OR diff=4, "Medium", diff=5, "High", diff=6 OR diff=7 OR diff=8, "Urgent")
| eval Time=strftime(_time,"%Y-%m-%d %H:%M:%S")
`comment("|where Threat_Level="High" OR Threat_Level="Urgent"")`
|stats values(split_total) values(se_split) values(re_split) values(Threat_Level) as "Threat Level" values(size) as "Message Size"    values(recipient) as Recipient values(Time) as Time by sender subject | sort -"Message Size" "Threat Level"
&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Fri, 12 Apr 2019 21:25:41 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/how-to-compare-characters-in-two-fields-and-return-number-of/m-p/415065#M119508</guid>
      <dc:creator>brienhawker</dc:creator>
      <dc:date>2019-04-12T21:25:41Z</dc:date>
    </item>
    <item>
      <title>Re: how to compare characters in two fields and return number of matches?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/how-to-compare-characters-in-two-fields-and-return-number-of/m-p/415066#M119509</link>
      <description>&lt;P&gt;Thank you. This works great for what I'm trying to do.&lt;/P&gt;</description>
      <pubDate>Fri, 12 Apr 2019 21:35:47 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/how-to-compare-characters-in-two-fields-and-return-number-of/m-p/415066#M119509</guid>
      <dc:creator>brienhawker</dc:creator>
      <dc:date>2019-04-12T21:35:47Z</dc:date>
    </item>
    <item>
      <title>Re: how to compare characters in two fields and return number of matches?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/how-to-compare-characters-in-two-fields-and-return-number-of/m-p/415067#M119510</link>
      <description>&lt;P&gt;I am curious: which one did you use?&lt;/P&gt;</description>
      <pubDate>Fri, 12 Apr 2019 22:25:00 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/how-to-compare-characters-in-two-fields-and-return-number-of/m-p/415067#M119510</guid>
      <dc:creator>woodcock</dc:creator>
      <dc:date>2019-04-12T22:25:00Z</dc:date>
    </item>
    <item>
      <title>Re: how to compare characters in two fields and return number of matches?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/how-to-compare-characters-in-two-fields-and-return-number-of/m-p/415068#M119511</link>
      <description>&lt;P&gt;So you need a &lt;CODE&gt;set diff&lt;/CODE&gt; that runs against 2 &lt;CODE&gt;multi-valued&lt;/CODE&gt; fields, essentially, &lt;CODE&gt;mvdiff&lt;/CODE&gt;.&lt;/P&gt;</description>
      <pubDate>Sat, 13 Apr 2019 16:16:43 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/how-to-compare-characters-in-two-fields-and-return-number-of/m-p/415068#M119511</guid>
      <dc:creator>woodcock</dc:creator>
      <dc:date>2019-04-13T16:16:43Z</dc:date>
    </item>
  </channel>
</rss>

