<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Regex in lookuptable in Splunk Search</title>
    <link>https://community.splunk.com/t5/Splunk-Search/Regex-in-lookuptable/m-p/264649#M79469</link>
    <description>&lt;P&gt;Upvote for self-commentary, "horribad" and "Bwahahahahahahahahahahahahahahahaha!!!!!"&lt;/P&gt;

&lt;P&gt;Really ingenious and evil and mad, mad I tell you... but for SCIENCE!!!!!!&lt;/P&gt;</description>
    <pubDate>Fri, 21 Apr 2017 16:53:43 GMT</pubDate>
    <dc:creator>DalJeanis</dc:creator>
    <dc:date>2017-04-21T16:53:43Z</dc:date>
    <item>
      <title>Regex in lookuptable</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Regex-in-lookuptable/m-p/264644#M79464</link>
      <description>&lt;P&gt;Hi,&lt;/P&gt;

&lt;P&gt;Can I use a regex in a static lookup table? I want to filter out some alerts that trigger frequently, like:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;Substantial Increase In [AC14-2.1] Prevent modification of system files - Caller MD5=41e25e514d90e9c8bc570484dbaff62b Events
Substantial Increase In [AC14-2.1] Prevent modification of system files - Caller MD5=41e25e514d90e9c8bc570484dbaff62b Events
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;I have already created a lookup table called &lt;CODE&gt;signaturecheck.csv&lt;/CODE&gt; and added all the common signatures, so that no alert fires for any signature listed in &lt;CODE&gt;signaturecheck.csv&lt;/CODE&gt;.&lt;BR /&gt;
But for this particular signature, the MD5 value changes frequently, so I want to know whether I can add a regular expression to the lookup table to filter it.&lt;/P&gt;

&lt;P&gt;Regards,&lt;/P&gt;</description>
      <pubDate>Wed, 30 Mar 2016 06:18:09 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Regex-in-lookuptable/m-p/264644#M79464</guid>
      <dc:creator>benmon</dc:creator>
      <dc:date>2016-03-30T06:18:09Z</dc:date>
    </item>
    <item>
      <title>Re: Regex in lookuptable</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Regex-in-lookuptable/m-p/264645#M79465</link>
      <description>&lt;P&gt;No, but you can do it "inside-out" by manually iterating with &lt;CODE&gt;map&lt;/CODE&gt; like this (assuming &lt;CODE&gt;signaturecheck.csv&lt;/CODE&gt; has a field called &lt;CODE&gt;RegEx&lt;/CODE&gt; and the events have a field called &lt;CODE&gt;MD5&lt;/CODE&gt;; just replace the &lt;CODE&gt;&amp;lt;your search here&amp;gt;&lt;/CODE&gt; part with your actual search):&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;| inputcsv MyTemporaryFile.csv
| search NOT [ &amp;lt;your search here&amp;gt; | streamstats count AS serial | outputcsv MyTemporaryFile.csv
| stats count AS Dr0pM3
| append [| inputcsv signaturecheck.csv ]
| where isnull(Dr0pM3)
| map maxsearches=99999 search="
   | inputcsv MyTemporaryFile.csv
   | eval Dr0pM3 = if(match(MD5, \"$RegEx$\"), \"DROP\", null())
   | where isnotnull(Dr0pM3) | fields serial | fields - _*
" | stats values(serial) AS serial ]
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;I tested this monstrosity like this:&lt;/P&gt;

&lt;P&gt;First generate a signature file like this:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;|noop | stats count AS RegEx | eval RegEx="ab,cd" | makemv delim="," RegEx | mvexpand RegEx | outputcsv signaturecheck.csv
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;It should look ( &lt;CODE&gt;|inputcsv signaturecheck.csv&lt;/CODE&gt; ) like this:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;RegEx
ab
cd
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;Verify that the following search (embedded in next search, which is where your original search will go) generates 10 contrived events to filter:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;|noop | stats count AS raw | eval raw="a,b,c,ab,bc,cd,abc,bcd,cde,def" | makemv delim="," raw | mvexpand raw | streamstats count AS _time | eval MD5=raw | rename raw AS _raw
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;It should look like this:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;_raw    MD5                   _time
   a      a     1969-12-31 18:00:01
   b      b     1969-12-31 18:00:02
   c      c     1969-12-31 18:00:03
  ab     ab     1969-12-31 18:00:04
  bc     bc     1969-12-31 18:00:05
  cd     cd     1969-12-31 18:00:06
 abc    abc     1969-12-31 18:00:07
 bcd    bcd     1969-12-31 18:00:08
 cde    cde     1969-12-31 18:00:09
 def    def     1969-12-31 18:00:10
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;Lastly, put it together like this:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;| inputcsv MyTemporaryFile.csv
| search NOT [ |noop | stats count AS raw | eval raw="a,b,c,ab,bc,cd,abc,bcd,cde,def" | makemv delim="," raw | mvexpand raw | streamstats count AS _time | eval MD5=raw | rename raw AS _raw | table _time MD5 _raw | streamstats count AS serial | outputcsv MyTemporaryFile.csv
| stats count AS Dr0pM3
| append [| inputcsv signaturecheck.csv ]
| where isnull(Dr0pM3)
| map maxsearches=99999 search="
   | inputcsv MyTemporaryFile.csv
   | eval Dr0pM3 = if(match(MD5, \"$RegEx$\"), \"DROP\", null())
   | where isnotnull(Dr0pM3) | fields serial | fields - _*
" | stats values(serial) AS serial ]
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;When tested, I got these correct results:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;MD5 _raw                   _time    serial
  a       a     1969-12-31 18:00:01         1
  b       b     1969-12-31 18:00:02         2
  c       c     1969-12-31 18:00:03         3
 bc      bc     1969-12-31 18:00:05         5
def     def     1969-12-31 18:00:10        10
&lt;/CODE&gt;&lt;/PRE&gt;
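
&lt;P&gt;For anyone who wants to sanity-check that expected output outside Splunk, here is a rough Python sketch of the same filtering logic. It is only an illustration and not part of the search: the names &lt;CODE&gt;patterns&lt;/CODE&gt;, &lt;CODE&gt;events&lt;/CODE&gt; and &lt;CODE&gt;kept&lt;/CODE&gt; are made up for this example, and it assumes Python's &lt;CODE&gt;re.search&lt;/CODE&gt; is a close enough stand-in for &lt;CODE&gt;match()&lt;/CODE&gt; on these trivial patterns.&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;import re

# Rows of signaturecheck.csv (the RegEx column) from the test above.
patterns = ["ab", "cd"]

# The 10 contrived MD5 values, numbered the way the serial field numbers them.
values = "a,b,c,ab,bc,cd,abc,bcd,cde,def".split(",")
events = [(serial, md5) for serial, md5 in enumerate(values, start=1)]

# Keep an event only if no pattern matches anywhere in its MD5 value,
# which mirrors dropping every serial that any map iteration flags.
kept = [(serial, md5) for serial, md5 in events
        if not any(re.search(p, md5) for p in patterns)]

for serial, md5 in kept:
    print(serial, md5)
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;The surviving serials should line up with the table above (1, 2, 3, 5 and 10).&lt;/P&gt;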

&lt;P&gt;This is the most gruesomely horribad Splunk solution that I have ever crafted but I do not see any other way to do it.&lt;BR /&gt;
What I &lt;EM&gt;especially&lt;/EM&gt; like is the fact that the first part of the search references a file that DOES NOT EVEN EXIST but because &lt;CODE&gt;subsearches&lt;/CODE&gt; always run first, it gets created before that part of the search gets a chance to start.  Bwahahahahahahahahahahahahahahahaha!!!!!&lt;/P&gt;</description>
      <pubDate>Sun, 03 Apr 2016 01:08:48 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Regex-in-lookuptable/m-p/264645#M79465</guid>
      <dc:creator>woodcock</dc:creator>
      <dc:date>2016-04-03T01:08:48Z</dc:date>
    </item>
    <item>
      <title>Re: Regex in lookuptable</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Regex-in-lookuptable/m-p/264646#M79466</link>
      <description>&lt;P&gt;P.S. Be careful about the &lt;CODE&gt;maxsearches&lt;/CODE&gt; value; it could bite you and may need monitoring/adjustment.  For some reason, &lt;CODE&gt;0&lt;/CODE&gt; does not mean &lt;CODE&gt;unlimited&lt;/CODE&gt; like most options do.&lt;/P&gt;</description>
      <pubDate>Sun, 03 Apr 2016 02:01:42 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Regex-in-lookuptable/m-p/264646#M79466</guid>
      <dc:creator>woodcock</dc:creator>
      <dc:date>2016-04-03T02:01:42Z</dc:date>
    </item>
    <item>
      <title>Re: Regex in lookuptable</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Regex-in-lookuptable/m-p/264647#M79467</link>
      <description>&lt;P&gt;Also, this might work better writing to the KV Store instead of a CSV file, but I haven't learned how to do that yet, so it is CSV all the way.&lt;/P&gt;</description>
      <pubDate>Sun, 03 Apr 2016 03:10:05 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Regex-in-lookuptable/m-p/264647#M79467</guid>
      <dc:creator>woodcock</dc:creator>
      <dc:date>2016-04-03T03:10:05Z</dc:date>
    </item>
    <item>
      <title>Re: Regex in lookuptable</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Regex-in-lookuptable/m-p/264648#M79468</link>
      <description>&lt;P&gt;Also, this would be UNNECESSARY if Splunk would enhance &lt;CODE&gt;match_type&lt;/CODE&gt; to support a &lt;CODE&gt;REGEX&lt;/CODE&gt; option.&lt;/P&gt;</description>
      <pubDate>Mon, 04 Apr 2016 01:09:11 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Regex-in-lookuptable/m-p/264648#M79468</guid>
      <dc:creator>woodcock</dc:creator>
      <dc:date>2016-04-04T01:09:11Z</dc:date>
    </item>
    <item>
      <title>Re: Regex in lookuptable</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Regex-in-lookuptable/m-p/264649#M79469</link>
      <description>&lt;P&gt;Upvote for self-commentary, "horribad" and "Bwahahahahahahahahahahahahahahahaha!!!!!"&lt;/P&gt;

&lt;P&gt;Really ingenious and evil and mad, mad I tell you... but for SCIENCE!!!!!!&lt;/P&gt;</description>
      <pubDate>Fri, 21 Apr 2017 16:53:43 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Regex-in-lookuptable/m-p/264649#M79469</guid>
      <dc:creator>DalJeanis</dc:creator>
      <dc:date>2017-04-21T16:53:43Z</dc:date>
    </item>
    <item>
      <title>Re: Regex in lookuptable</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Regex-in-lookuptable/m-p/264650#M79470</link>
      <description>&lt;P&gt;I &lt;STRONG&gt;&lt;EM&gt;really&lt;/EM&gt;&lt;/STRONG&gt; hate to destroy such a wonderful, &lt;STRONG&gt;&lt;EM&gt;evil&lt;/EM&gt;&lt;/STRONG&gt; answer as @woodcock's, but I hate the &lt;CODE&gt;map&lt;/CODE&gt; command more, and as Yoda &lt;STRONG&gt;&lt;EM&gt;should&lt;/EM&gt;&lt;/STRONG&gt; have said,&lt;/P&gt;

&lt;BLOCKQUOTE&gt;
&lt;P&gt;"Another, there &lt;STRONG&gt;&lt;EM&gt;is&lt;/EM&gt;&lt;/STRONG&gt;."&lt;/P&gt;
&lt;/BLOCKQUOTE&gt;

&lt;P&gt;You &lt;STRONG&gt;&lt;EM&gt;can&lt;/EM&gt;&lt;/STRONG&gt; filter a set of results by building a single composite regex in a subsearch and feeding it back to the search. Here's a working sample that we ran on one of our systems...&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;index=foo "someuserid" 
| regex 
    [| noop 
     | stats count AS host 
     | eval host="hostname1,partialhostname2,I haz spaces,I sed \"I haz spaces\""  
     | makemv delim="," host 
     | mvexpand host 
     | rex mode=sed field=host "s/ /!?!?!?/g"
     | format "" "" "" "" "|" "" 
     | rex mode=sed field=search "s/host=\"//g s/\" / /g s/ //g s/!\?!\?!\?/ /g  s/(.)$/\1)\"/g" 
     | eval search= "host=\"(".search 
     | fields search]
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;The subsearch in brackets &lt;CODE&gt;[ ]&lt;/CODE&gt; returns a single field, &lt;CODE&gt;search&lt;/CODE&gt;, that has the value &lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;host="(hostname1|partialhostname2|I haz spaces|I sed \"I haz spaces\")" 
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;...so after substitution, the full line is... &lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;| regex host="(hostname1|partialhostname2|I haz spaces|I sed \"I haz spaces\")" 
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;... and it operates as expected.  We've also verified that spaces and internal quotes work as expected.&lt;/P&gt;

&lt;P&gt;There is no reason you couldn't build one for your reasonably sized lookup table, as long as you don't run out of characters.  &lt;/P&gt;

&lt;P&gt;The above demonstrates a workaround &lt;CODE&gt;" "&lt;/CODE&gt; -&amp;gt; &lt;CODE&gt;"!?!?!?"&lt;/CODE&gt; -&amp;gt; &lt;CODE&gt;" "&lt;/CODE&gt; that preserves internal spaces while reformatting the output of &lt;CODE&gt;format&lt;/CODE&gt;.  &lt;/P&gt;
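
&lt;P&gt;To illustrate the composite-regex idea itself outside SPL, here is a minimal Python sketch. It is an assumption-laden toy, not what Splunk does internally: &lt;CODE&gt;lookup_rows&lt;/CODE&gt; stands in for the rows of a lookup table, the &lt;CODE&gt;hosts&lt;/CODE&gt; values are invented, and Python's &lt;CODE&gt;re&lt;/CODE&gt; module is only an approximation of the regex engine behind &lt;CODE&gt;regex&lt;/CODE&gt;.&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;import re

# Hypothetical lookup rows, one regex per row (taken from the sample above).
lookup_rows = ["hostname1", "partialhostname2", "I haz spaces"]

# Join the rows into a single alternation, like the composite built above.
composite = "(" + "|".join(lookup_rows) + ")"
compiled = re.compile(composite)

# Invented host values to filter; not real data.
hosts = ["hostname1.example.com", "xpartialhostname2", "otherhost", "I haz spaces too"]

# Keep only the hosts that match the composite anywhere in the value.
matching = [h for h in hosts if compiled.search(h)]

print(composite)
print(matching)
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;One pass with a single alternation replaces one pass per lookup row, which is the whole point of avoiding &lt;CODE&gt;map&lt;/CODE&gt; here.&lt;/P&gt;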

&lt;P&gt;It takes a little practice to get the regex right, but you can run the subsearch standalone without brackets as many times as you like until you are satisfied with the result.  &lt;/P&gt;</description>
      <pubDate>Thu, 12 Oct 2017 17:51:28 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Regex-in-lookuptable/m-p/264650#M79470</guid>
      <dc:creator>DalJeanis</dc:creator>
      <dc:date>2017-10-12T17:51:28Z</dc:date>
    </item>
    <item>
      <title>Re: Regex in lookuptable</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Regex-in-lookuptable/m-p/264651#M79471</link>
      <description>&lt;P&gt;Unfair.  Almost every &lt;CODE&gt;map&lt;/CODE&gt; solution can be turned inside-out as a &lt;CODE&gt;subsearch&lt;/CODE&gt; solution (and vice versa).  Each method has VERY different pros and cons.&lt;/P&gt;</description>
      <pubDate>Mon, 23 Oct 2017 23:59:33 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Regex-in-lookuptable/m-p/264651#M79471</guid>
      <dc:creator>woodcock</dc:creator>
      <dc:date>2017-10-23T23:59:33Z</dc:date>
    </item>
    <item>
      <title>Re: Regex in lookuptable</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Regex-in-lookuptable/m-p/264652#M79472</link>
      <description>&lt;P&gt;Here is another technique for fuzzy matching on multi-row outputs. It does not use a temp file, which avoids several searches writing to the same file when many of them call the code at once. We are using versions of this for some RBA enrichment macros. Thanks @japger_splunk for a tip on multireport and @woodcock for the map example!&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;| makeresults 
| eval src="1.1.1.1;2.2.2.2;3.3.3.3;4.4.4.4;5.5.5.5" 
| makemv src delim=";" 
| mvexpand src `comment("

BEGIN ENRICHMENT BLOCK")`
    `comment("REMEMBER ROW ORDER AND MARK AS ORIGINAL DATA")` 
| eval original_row=1 
| streamstats count AS marker 
    `comment("FORK THE SEARCH: FIRST TO PRESERVE RESULTS, SECOND TO COMPARE EACH ROW AGAINST LOOKUP")` 
| multireport 
    [ ] 
    `comment("FOR EACH ROW, RUN FUZZY MATCH AGAINST LOOKUP AND SUMMARIZE THE RESULTS")` 
    [| map maxsearches=99999 search="
    | inputlookup notable_cache 
    | eval marker=$marker$, src=$src$
    | eval match=if(like(raw,\"%\".src.\"%\"), 1, 0) 
    | where match==1 
    | eval age_days = (now()-info_search_time)/86400 
    | eval in_notable_7d=if(age_days&amp;lt;=7,1,0), in_notable_30d=if(age_days&amp;lt;=30,1,0) 
    | stats values(marker) AS marker, sum(in_notable_7d) AS in_notable_7d_count, sum(in_notable_30d) AS in_notable_30d_count BY src
    "] 
    `comment("INTERLEAVE THE ORIGINAL RESULTS WITH THE LOOKUP MATCH RESULTS")` 
| sort 0 marker, in_notable_30d_count, in_notable_7d_count 
    `comment("TRANSPOSE DATA FROM ABOVE")` 
| streamstats current=f window=1 last(in_notable_30d_count) AS prev_in_notable_30d_count, last(in_notable_7d_count) AS prev_in_notable_7d_count 
    `comment("GET RID OF THE LOOKUP RESULTS")` 
| where original_row==1 
    `comment("CLEAN UP THE DATA")` 
| rename prev_in_notable_30d_count AS in_notable_30d_count, prev_in_notable_7d_count AS in_notable_7d_count 
| fillnull value=0 in_notable_30d_count, in_notable_7d_count 
| fields - original_row, marker
&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Thu, 13 Feb 2020 00:05:47 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Regex-in-lookuptable/m-p/264652#M79472</guid>
      <dc:creator>fharding</dc:creator>
      <dc:date>2020-02-13T00:05:47Z</dc:date>
    </item>
  </channel>
</rss>

