<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic How to lookup the best matching ingress url for url field in log? in Splunk Search</title>
    <link>https://community.splunk.com/t5/Splunk-Search/How-to-lookup-the-best-matching-ingress-url-for-url-field-in-log/m-p/615504#M213906</link>
    <description>&lt;P&gt;Dear Splunk community,&lt;/P&gt;
&lt;P&gt;I'm new to Splunk, so excuse my incompetence...&lt;/P&gt;
&lt;P&gt;I'm trying to enrich my web access log with the app name and team name from a CSV lookup file.&lt;/P&gt;
&lt;P&gt;The CSV file "ingress_map.csv" looks like this:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;LI-CODE lang="markup"&gt;ingress,app,team
https://mycompany.com/abc,foo-bar,a-team
https://app.mycompany.com,good-app,b-team
https://app.mycompany.com/abc,better-app,c-team
https://app.mycompany.com/abc/xyz,best-app,d-team&lt;/LI-CODE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;The url field of my web access log will seldom match one of the ingresses exactly. Is it possible to have a lookup that finds the best-matching ingress and adds the app and team fields to the log line? Or is there a better way of solving this problem?&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Regards&lt;/P&gt;
&lt;P&gt;Terje Gravvold&lt;/P&gt;</description>
    <pubDate>Mon, 03 Oct 2022 08:28:45 GMT</pubDate>
    <dc:creator>tgravvold</dc:creator>
    <dc:date>2022-10-03T08:28:45Z</dc:date>
    <item>
      <title>How to lookup the best matching ingress url for url field in log?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/How-to-lookup-the-best-matching-ingress-url-for-url-field-in-log/m-p/615504#M213906</link>
      <description>&lt;P&gt;Dear Splunk community,&lt;/P&gt;
&lt;P&gt;I'm new to Splunk, so excuse my incompetence...&lt;/P&gt;
&lt;P&gt;I'm trying to enrich my web access log with the app name and team name from a CSV lookup file.&lt;/P&gt;
&lt;P&gt;The CSV file "ingress_map.csv" looks like this:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;LI-CODE lang="markup"&gt;ingress,app,team
https://mycompany.com/abc,foo-bar,a-team
https://app.mycompany.com,good-app,b-team
https://app.mycompany.com/abc,better-app,c-team
https://app.mycompany.com/abc/xyz,best-app,d-team&lt;/LI-CODE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;The url field of my web access log will seldom match one of the ingresses exactly. Is it possible to have a lookup that finds the best-matching ingress and adds the app and team fields to the log line? Or is there a better way of solving this problem?&lt;/P&gt;
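&lt;P&gt;To make "best matching" concrete, here is a minimal Python sketch, assuming it means the longest ingress that is a character-for-character prefix of the url. It runs outside Splunk and is not Splunk functionality; the names load_ingress_map and best_match and the sample url are hypothetical, chosen only for illustration.&lt;/P&gt;

```python
import csv
import io

# Sample rows copied from ingress_map.csv above.
INGRESS_CSV = """ingress,app,team
https://mycompany.com/abc,foo-bar,a-team
https://app.mycompany.com,good-app,b-team
https://app.mycompany.com/abc,better-app,c-team
https://app.mycompany.com/abc/xyz,best-app,d-team
"""


def load_ingress_map(text):
    """Parse the CSV text into a list of dict rows (hypothetical helper)."""
    return list(csv.DictReader(io.StringIO(text)))


def best_match(url, rows):
    """Return the row whose ingress is the longest prefix of url, or None."""
    # Assumption: "best" means a plain character prefix, not a path-aware match.
    candidates = [row for row in rows if url.startswith(row["ingress"])]
    if not candidates:
        return None
    return max(candidates, key=lambda row: len(row["ingress"]))


rows = load_ingress_map(INGRESS_CSV)
# Made-up url for illustration only.
hit = best_match("https://app.mycompany.com/abc/xyz/page?id=1", rows)
print(hit["app"], hit["team"])  # best-app d-team
```

&lt;P&gt;Because this compares raw characters, a url such as https://app.mycompany.com/abcd would also match the /abc ingress. A stricter variant would compare whole path segments instead.&lt;/P&gt;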
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Regards&lt;/P&gt;
&lt;P&gt;Terje Gravvold&lt;/P&gt;</description>
      <pubDate>Mon, 03 Oct 2022 08:28:45 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/How-to-lookup-the-best-matching-ingress-url-for-url-field-in-log/m-p/615504#M213906</guid>
      <dc:creator>tgravvold</dc:creator>
      <dc:date>2022-10-03T08:28:45Z</dc:date>
    </item>
    <item>
      <title>Re: Lookup the best matching ingress url for url field in log</title>
      <link>https://community.splunk.com/t5/Splunk-Search/How-to-lookup-the-best-matching-ingress-url-for-url-field-in-log/m-p/615517#M213909</link>
      <description>&lt;P&gt;You first need to define what a "best match" is in terms of the data. &amp;nbsp;Will a domain match suffice? &amp;nbsp;Domain and protocol? &amp;nbsp;Domain, protocol, plus a fixed number of path segments?&lt;/P&gt;&lt;P&gt;Another question is how much flexibility there is in the content of that lookup file.&lt;/P&gt;</description>
      <pubDate>Sun, 02 Oct 2022 03:25:02 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/How-to-lookup-the-best-matching-ingress-url-for-url-field-in-log/m-p/615517#M213909</guid>
      <dc:creator>yuanliu</dc:creator>
      <dc:date>2022-10-02T03:25:02Z</dc:date>
    </item>
    <item>
      <title>Re: Lookup the best matching ingress url for url field in log</title>
      <link>https://community.splunk.com/t5/Splunk-Search/How-to-lookup-the-best-matching-ingress-url-for-url-field-in-log/m-p/615527#M213912</link>
      <description>&lt;P&gt;Thanks for the answer. Unfortunately, I cannot assume that matching only the hostname or a fixed path depth is enough. If it helps, I can manipulate the ingress values in the CSV, since the CSV is exported via a script. For example, I can sort the CSV by index.&lt;/P&gt;&lt;P&gt;What I'm seeking is a function that matches each ingress from the CSV against the url in the log entry and picks the ingress that matches best (that is, matches the most characters from the start of the field).&lt;/P&gt;</description>
      <pubDate>Sun, 02 Oct 2022 11:34:07 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/How-to-lookup-the-best-matching-ingress-url-for-url-field-in-log/m-p/615527#M213912</guid>
      <dc:creator>tgravvold</dc:creator>
      <dc:date>2022-10-02T11:34:07Z</dc:date>
    </item>
    <item>
      <title>Re: Lookup the best matching ingress url for url field in log</title>
      <link>https://community.splunk.com/t5/Splunk-Search/How-to-lookup-the-best-matching-ingress-url-for-url-field-in-log/m-p/615539#M213918</link>
      <description>&lt;P&gt;I found this post that points out ways to compare strings the way I want:&lt;/P&gt;&lt;P&gt;&lt;A href="https://community.splunk.com/t5/Splunk-Search/Can-splunk-compare-two-strings-and-return-likeness-similarity/m-p/35485" target="_blank"&gt;https://community.splunk.com/t5/Splunk-Search/Can-splunk-compare-two-strings-and-return-likeness-similarity/m-p/35485&lt;/A&gt;&lt;/P&gt;&lt;P&gt;But I don't know if this kind of comparison is possible with lookup tables. Would it be better to feed the CSV data into an index?&lt;/P&gt;</description>
      <pubDate>Sun, 02 Oct 2022 16:23:25 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/How-to-lookup-the-best-matching-ingress-url-for-url-field-in-log/m-p/615539#M213918</guid>
      <dc:creator>tgravvold</dc:creator>
      <dc:date>2022-10-02T16:23:25Z</dc:date>
    </item>
    <item>
      <title>Re: Lookup the best matching ingress url for url field in log</title>
      <link>https://community.splunk.com/t5/Splunk-Search/How-to-lookup-the-best-matching-ingress-url-for-url-field-in-log/m-p/615548#M213923</link>
      <description>&lt;P&gt;I agree that a Python/external script is better suited for what you need. &amp;nbsp;Ingesting the CSV into an index has lots of drawbacks, however.&lt;/P&gt;&lt;P&gt;One possibility, maybe an easy solution, is to just read the CSV in the script. &amp;nbsp;You can return all needed values from the script directly. &amp;nbsp;No need to pull the CSV into the search at all. &amp;nbsp;If you really need the content in SPL after running the script, there are several possible approaches. &amp;nbsp;For example:&lt;/P&gt;&lt;UL&gt;&lt;LI&gt;Make the script return the best-matched string in the CSV, then use this string to perform the lookup. &amp;nbsp;This will return all the best-matched field values again.&amp;nbsp;&lt;/LI&gt;&lt;LI&gt;Make &lt;A href="https://community.splunk.com/t5/Splunk-Search/How-to-compare-events-with-quot-milestone-quot-lookup/m-p/611871#M212728" target="_blank" rel="noopener"&gt;a "dummy" field in CSV&lt;/A&gt; and use that dummy field to return every entry in the lookup file into every event. &amp;nbsp;That technique is only useful in very limited use cases.&lt;/LI&gt;&lt;LI&gt;Append &lt;FONT face="andale mono,times"&gt;| inputlookup&lt;/FONT&gt; to the search, then use stats or some other technique to utilize it.&lt;/LI&gt;&lt;/UL&gt;</description>
      <pubDate>Sun, 02 Oct 2022 21:31:07 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/How-to-lookup-the-best-matching-ingress-url-for-url-field-in-log/m-p/615548#M213923</guid>
      <dc:creator>yuanliu</dc:creator>
      <dc:date>2022-10-02T21:31:07Z</dc:date>
    </item>
    <item>
      <title>Re: Lookup the best matching ingress url for url field in log</title>
      <link>https://community.splunk.com/t5/Splunk-Search/How-to-lookup-the-best-matching-ingress-url-for-url-field-in-log/m-p/615570#M213929</link>
      <description>&lt;P&gt;What about using wildcard lookups, which will get you part of the way?&lt;/P&gt;&lt;TABLE&gt;&lt;TBODY&gt;&lt;TR&gt;&lt;TD&gt;&lt;A href="https://mycompany.com/*" target="_blank" rel="noopener"&gt;https://mycompany.com/*&lt;/A&gt;&lt;/TD&gt;&lt;TD&gt;foo-bar&lt;/TD&gt;&lt;TD&gt;a-team&lt;/TD&gt;&lt;TD&gt;1&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;&lt;A href="https://app.mycompany.com" target="_blank" rel="noopener"&gt;https://app.mycompany.com&lt;/A&gt;&lt;/TD&gt;&lt;TD&gt;good-app&lt;/TD&gt;&lt;TD&gt;b-team&lt;/TD&gt;&lt;TD&gt;0&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;&lt;A href="https://app.mycompany.com/*" target="_blank" rel="noopener"&gt;https://app.mycompany.com/*&lt;/A&gt;&lt;/TD&gt;&lt;TD&gt;better-app&lt;/TD&gt;&lt;TD&gt;c-team&lt;/TD&gt;&lt;TD&gt;1&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;&lt;A href="https://app.mycompany.com/*/*" target="_blank" rel="noopener"&gt;https://app.mycompany.com/*/*&lt;/A&gt;&lt;/TD&gt;&lt;TD&gt;best-app&lt;/TD&gt;&lt;TD&gt;d-team&lt;/TD&gt;&lt;TD&gt;2&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;&lt;A href="https://app.mycompany.com/*/*/*" target="_blank" rel="noopener"&gt;https://app.mycompany.com/*/*/*&lt;/A&gt;&lt;/TD&gt;&lt;TD&gt;gold-star-app&lt;/TD&gt;&lt;TD&gt;z-team&lt;/TD&gt;&lt;TD&gt;3&lt;/TD&gt;&lt;/TR&gt;&lt;/TBODY&gt;&lt;/TABLE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Made with this search:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;| makeresults
| eval _raw="ingress,app,team
https://mycompany.com/*,foo-bar,a-team
https://app.mycompany.com,good-app,b-team
https://app.mycompany.com/*,better-app,c-team
https://app.mycompany.com/*/*,best-app,d-team
https://app.mycompany.com/*/*/*,gold-star-app,z-team"
| multikv forceheader=1
| table ingress,app,team
| rex field=ingress max_match=0 "(?&amp;lt;p&amp;gt;/\*)"
| eval depth=mvcount(p)
| fillnull depth
| fields - p
| outputlookup ingress.csv&lt;/LI-CODE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;where the last column is the path depth (based on the number of /* elements in the path). Then, having made a wildcard lookup definition with match type WILDCARD(ingress), this example will&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;| makeresults
| eval _raw="ingress,app,team
https://mycompany.com/abc,foo-bar,a-team
https://app.mycompany.com,good-app,b-team
https://app.mycompany.com/abc,better-app,c-team
https://app.mycompany.com/abc/xyz,best-app,d-team
https://app.mycompany.com/zyx/123/abc,gold-star-app,z-team"
| multikv forceheader=1
| table ingress,app,team
| lookup ingress ingress OUTPUT team as f_team depth as f_depth
| eval actual_team=mvindex(f_team, max(f_depth, 1) - 1)&lt;/LI-CODE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;then return the exact match where there is one, or otherwise the match with the greatest path depth.&lt;/P&gt;&lt;P&gt;I'm not sure this would work in all cases, and your requirements may be more specific than this.&lt;/P&gt;&lt;P&gt;BTW, I have used the fuzzy lookup app, which works reasonably well, although I had to tweak the underlying Python to get it to work in my context:&lt;/P&gt;&lt;P&gt;&lt;A href="https://splunkbase.splunk.com/app/5237" target="_blank" rel="noopener"&gt;https://splunkbase.splunk.com/app/5237&lt;/A&gt;&lt;/P&gt;&lt;P&gt;Naturally, it can use quite a bit of compute.&lt;/P&gt;</description>
      <pubDate>Mon, 03 Oct 2022 03:12:39 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/How-to-lookup-the-best-matching-ingress-url-for-url-field-in-log/m-p/615570#M213929</guid>
      <dc:creator>bowesmana</dc:creator>
      <dc:date>2022-10-03T03:12:39Z</dc:date>
    </item>
    <item>
      <title>Re: How to lookup the best matching ingress url for url field in log?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/How-to-lookup-the-best-matching-ingress-url-for-url-field-in-log/m-p/616515#M214273</link>
      <description>&lt;P&gt;Thanks&amp;nbsp;&lt;a href="https://community.splunk.com/t5/user/viewprofilepage/user-id/6367"&gt;@bowesmana&lt;/a&gt;&amp;nbsp;! I'll give your rex uri depth match a try. It will be a fairly simple solution if it fits the requirements.&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Mon, 10 Oct 2022 11:16:16 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/How-to-lookup-the-best-matching-ingress-url-for-url-field-in-log/m-p/616515#M214273</guid>
      <dc:creator>tgravvold</dc:creator>
      <dc:date>2022-10-10T11:16:16Z</dc:date>
    </item>
  </channel>
</rss>

