<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Matching with a lookup in Splunk Search</title>
    <link>https://community.splunk.com/t5/Splunk-Search/Matching-with-a-lookup/m-p/756247#M243063</link>
    <pubDate>Sun, 07 Dec 2025 18:34:50 GMT</pubDate>
    <dc:creator>tscroggins</dc:creator>
    <dc:date>2025-12-07T18:34:50Z</dc:date>
    <item>
      <title>Matching with a lookup</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Matching-with-a-lookup/m-p/756228#M243058</link>
      <description>&lt;P&gt;I'm trying to set up a regular search to check all our GitHub packages against the latest Shai Hulud npm packages.&lt;/P&gt;&lt;P&gt;Within "SBOM.Packages{}" I'm trying to validate each of the field pairs &lt;STRONG&gt;SBOM.Packages{}.name&lt;/STRONG&gt; and &lt;STRONG&gt;SBOM.Packages{}.versionInfo&lt;/STRONG&gt; against a lookup table containing all the Shai Hulud compromised packages.&lt;/P&gt;&lt;P&gt;I started with&lt;BR /&gt;"index=github [|inputlookup shai-hulud.csv&amp;nbsp;| table&amp;nbsp;&lt;STRONG&gt;SBOM.Packages{}.name,&amp;nbsp;&lt;/STRONG&gt;&amp;nbsp;&lt;STRONG&gt;SBOM.Packages{}.versionInfo&lt;/STRONG&gt;]"&lt;/P&gt;&lt;P&gt;This search works for package names within the first few thousand characters of the event (probably 10,000 chars, knowing Splunk), but it does not reliably find package names a hundred or so packages in.&lt;/P&gt;&lt;P&gt;I've been trying to get an spath command running through a foreach loop but just can't get the loop to work.&lt;/P&gt;&lt;P&gt;So, the question:&lt;/P&gt;&lt;P&gt;Does anyone already have a piece of SPL that checks npm packages against a lookup list?&lt;/P&gt;&lt;P&gt;OR&lt;/P&gt;&lt;P&gt;Does anyone have an inkling how to iterate through a few hundred SBOM.Packages{} and compare them to the current list of 1500 compromised npm name / version variants?&lt;/P&gt;</description>
      <pubDate>Sat, 06 Dec 2025 07:38:35 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Matching-with-a-lookup/m-p/756228#M243058</guid>
      <dc:creator>DaveBunn</dc:creator>
      <dc:date>2025-12-06T07:38:35Z</dc:date>
    </item>
    <item>
      <title>Re: Shai Hulud</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Matching-with-a-lookup/m-p/756237#M243059</link>
      <description>&lt;P&gt;If you use "iterate" in a sentence you're probably not thinking about your problem in a splunky way. &lt;span class="lia-unicode-emoji" title=":winking_face:"&gt;😉&lt;/span&gt;&lt;/P&gt;&lt;P&gt;Paste a sample of your data (anonymized/sanitized if needed) to visualize your problem and the expected outcome.&lt;/P&gt;</description>
      <pubDate>Sat, 06 Dec 2025 07:37:29 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Matching-with-a-lookup/m-p/756237#M243059</guid>
      <dc:creator>PickleRick</dc:creator>
      <dc:date>2025-12-06T07:37:29Z</dc:date>
    </item>
    <item>
      <title>Re: Matching with a lookup</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Matching-with-a-lookup/m-p/756241#M243061</link>
      <description>&lt;P&gt;What exactly is in that file? &amp;nbsp;Who/what produces this lookup? &amp;nbsp;What is the goal (desired output) you are trying to achieve with this file? What exactly is in the source data? &amp;nbsp;Like&amp;nbsp;&lt;a href="https://community.splunk.com/t5/user/viewprofilepage/user-id/231884"&gt;@PickleRick&lt;/a&gt;&amp;nbsp;says, this is neither a GitHub forum nor a Shai Hulud forum. &amp;nbsp;Your question needs to focus on data and processing.&lt;/P&gt;&lt;P&gt;Based on the hint you dropped, I get the feeling that you are trying to find events containing certain field values that match a list of values in the lookup. &amp;nbsp;The fields of interest are the package's name and versionInfo as a pair.&lt;/P&gt;&lt;P&gt;There are several problems with the approach shown. &amp;nbsp;The biggest is the content of the file. &amp;nbsp;The root cause is Splunk's flattening of JSON arrays. &amp;nbsp;If you examine your raw data closely, you'll notice that&amp;nbsp;&lt;STRONG&gt;SBOM.Packages{}.name,&amp;nbsp;&lt;/STRONG&gt;&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;STRONG&gt;SBOM.Packages{}.versionInfo&lt;/STRONG&gt; are not independent keys. &amp;nbsp;They are keys in elements of an array (which Splunk denotes as &lt;STRONG&gt;SBOM.Packages{}&lt;/STRONG&gt;). &amp;nbsp;You cannot arbitrarily pair them together.&lt;/P&gt;&lt;P&gt;Now, I assume either you (or your employer's organization) have control over the format and content of the lookup. &amp;nbsp;So, I strongly recommend that you organize your lookup around two essential keys, &lt;STRONG&gt;name&lt;/STRONG&gt;&amp;nbsp;and &lt;STRONG&gt;versionInfo&lt;/STRONG&gt;. &amp;nbsp;Make sure that the two fields are not mismatched for your real purpose.&lt;/P&gt;&lt;P&gt;The second problem is also caused by Splunk's flattening of JSON arrays. 
&amp;nbsp;After flattening,&amp;nbsp;&lt;STRONG&gt;SBOM.Packages{}.name&lt;/STRONG&gt;&amp;nbsp;and&amp;nbsp;&lt;STRONG&gt;SBOM.Packages{}.versionInfo&lt;/STRONG&gt; become unrelated multivalue fields, i.e., independent arrays of their own. &amp;nbsp;Using a subsearch with such data is doomed to be inaccurate. &amp;nbsp;You have to go back to the actual JSON array&amp;nbsp;&lt;STRONG&gt;SBOM.Packages{}&lt;/STRONG&gt;.&lt;/P&gt;&lt;P&gt;Provided that your lookup now has the correct pairs of &lt;STRONG&gt;name&lt;/STRONG&gt; and &lt;STRONG&gt;versionInfo&lt;/STRONG&gt; (field names are case-sensitive, so the lookup columns must match the extracted event fields exactly), here is one traditional approach to seek out matches.&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;index=github
| fields - SBOM.Packages{}.* ``` optional but helps performance ```
| spath path=SBOM.Packages{}
| mvexpand SBOM.Packages{}
| spath input=SBOM.Packages{}
| fields - SBOM.Packages{} ``` again, optional ```
| lookup shai-hulud.csv name versionInfo output name as match_name
| where isnotnull(match_name)&lt;/LI-CODE&gt;&lt;P&gt;Again, the actual solution depends a lot on what you want to do with this match. &amp;nbsp;There can be more efficient code paths to get to your end game.&lt;/P&gt;</description>
      <pubDate>Sun, 07 Dec 2025 07:26:25 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Matching-with-a-lookup/m-p/756241#M243061</guid>
      <dc:creator>yuanliu</dc:creator>
      <dc:date>2025-12-07T07:26:25Z</dc:date>
    </item>
    <item>
      <title>Re: Matching with a lookup</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Matching-with-a-lookup/m-p/756247#M243063</link>
      <description>&lt;P&gt;Hi&amp;nbsp;&lt;a href="https://community.splunk.com/t5/user/viewprofilepage/user-id/240521"&gt;@DaveBunn&lt;/a&gt;,&lt;/P&gt;&lt;P&gt;Let's start with a publicly available list of compromised packages: &lt;A href="https://github.com/wiz-sec-public/wiz-research-iocs/blob/main/reports/shai-hulud-2-packages.csv" target="_self"&gt;https://github.com/wiz-sec-public/wiz-research-iocs/blob/main/reports/shai-hulud-2-packages.csv&lt;/A&gt;. The CSV file contains Package and Version fields that we'll correlate to SBOM.Packages objects:&lt;/P&gt;&lt;LI-CODE lang="javascript"&gt;Package,Version
02-echo,= 0.0.7
@accordproject/concerto-analysis,= 3.24.1
@accordproject/concerto-linter,= 3.24.1
@accordproject/concerto-linter-default-ruleset,= 3.24.1
@accordproject/concerto-metamodel,= 3.12.5
...&lt;/LI-CODE&gt;&lt;P&gt;Note: I'm not affiliated with Wiz, Inc. We're all about Splunk here, but I don't see anything on &lt;A href="https://research.splunk.com/" target="_self"&gt;https://research.splunk.com/&lt;/A&gt; except for an attack range dataset.&lt;/P&gt;&lt;P&gt;Let's also start with three small test cases, two positive and one negative:&lt;/P&gt;&lt;LI-CODE lang="javascript"&gt;{"SBOM":{"Packages":[{"name":"@accordproject/concerto-linter","versionInfo":"3.24.1"}]}}
{"SBOM":{"Packages":[{"name":"lodash","versionInfo":"4.17.21"},{"name":"@accordproject/concerto-linter","versionInfo":"3.24.1"}]}}
{"SBOM":{"Packages":[{"name":"lodash","versionInfo":"4.17.21"}]}}&lt;/LI-CODE&gt;&lt;P&gt;I'll assume by your question that you're starting with fields extracted with either KV_MODE = json or spath and not indexed extractions:&lt;/P&gt;&lt;TABLE border="1" width="100%"&gt;&lt;TBODY&gt;&lt;TR&gt;&lt;TD width="50%" height="25px"&gt;&lt;STRONG&gt;SBOM.Packages{}.name&lt;/STRONG&gt;&lt;/TD&gt;&lt;TD width="50%" height="25px"&gt;&lt;STRONG&gt;&amp;nbsp;SBOM.Packages{}.versionInfo&lt;/STRONG&gt;&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD width="50%" height="25px"&gt;@accordproject/concerto-linter&lt;/TD&gt;&lt;TD width="50%" height="25px"&gt;3.24.1&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD width="50%" height="25px"&gt;lodash&lt;BR /&gt;@accordproject/concerto-linter&lt;/TD&gt;&lt;TD width="50%" height="25px"&gt;4.17.21&lt;BR /&gt;3.24.1&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD width="50%" height="25px"&gt;lodash&lt;/TD&gt;&lt;TD width="50%" height="25px"&gt;4.17.21&lt;/TD&gt;&lt;/TR&gt;&lt;/TBODY&gt;&lt;/TABLE&gt;&lt;P&gt;&lt;BR /&gt;As you've found, KV_MODE = json scans only the first 10240 characters of _raw by default. 
See the limits.conf.spec [kv] stanza maxchars setting for more information.&lt;/P&gt;&lt;P&gt;The main challenge is correlating a value in SBOM.Packages{}.name at index &lt;EM&gt;i&lt;/EM&gt; with a value at the same index in SBOM.Packages{}.versionInfo.&lt;/P&gt;&lt;P&gt;We can extract and concatenate those values into a single multivalue field using JSON eval functions. The ",= " separator reproduces the "= " prefix that the CSV file carries in its Version field, so the values on both sides compare as identical strings:&lt;/P&gt;&lt;LI-CODE lang="javascript"&gt;| eval ioc=mvmap(json_array_to_mv(json_extract(_raw, "SBOM.Packages")), spath(_raw, "name").",= ".spath(_raw, "versionInfo"))&lt;/LI-CODE&gt;&lt;P&gt;We can do the same with shai-hulud-2-packages.csv and use the result as a search filter:&lt;/P&gt;&lt;LI-CODE lang="javascript"&gt;| search [| inputlookup shai-hulud-2-packages.csv | eval ioc=Package.",".Version | fields ioc ]&lt;/LI-CODE&gt;&lt;P&gt;Combining them in a complete example, only the positive test cases are returned:&lt;/P&gt;&lt;LI-CODE lang="javascript"&gt;| makeresults format=csv data="_raw
\"{\"\"SBOM\"\":{\"\"Packages\"\":[{\"\"name\"\":\"\"@accordproject/concerto-linter\"\",\"\"versionInfo\"\":\"\"3.24.1\"\"}]}}\"
\"{\"\"SBOM\"\":{\"\"Packages\"\":[{\"\"name\"\":\"\"lodash\"\",\"\"versionInfo\"\":\"\"4.17.21\"\"},{\"\"name\"\":\"\"@accordproject/concerto-linter\"\",\"\"versionInfo\"\":\"\"3.24.1\"\"}]}}\"
\"{\"\"SBOM\"\":{\"\"Packages\"\":[{\"\"name\"\":\"\"lodash\"\",\"\"versionInfo\"\":\"\"4.17.21\"\"}]}}\"
"
| eval ioc=mvmap(json_array_to_mv(json_extract(_raw, "SBOM.Packages")), spath(_raw, "name").",= ".spath(_raw, "versionInfo"))
| search [| inputlookup shai-hulud-2-packages.csv | eval ioc=Package.",".Version | fields ioc ]&lt;/LI-CODE&gt;&lt;TABLE border="1" width="100%"&gt;&lt;TBODY&gt;&lt;TR&gt;&lt;TD width="100%"&gt;&lt;STRONG&gt;_raw&lt;/STRONG&gt;&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD width="100%"&gt;{"SBOM":{"Packages":[{"name":"@accordproject/concerto-linter","versionInfo":"3.24.1"}]}}&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD width="100%"&gt;{"SBOM":{"Packages":[{"name":"lodash","versionInfo":"4.17.21"},{"name":"@accordproject/concerto-linter","versionInfo":"3.24.1"}]}}&lt;/TD&gt;&lt;/TR&gt;&lt;/TBODY&gt;&lt;/TABLE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Sun, 07 Dec 2025 18:34:50 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Matching-with-a-lookup/m-p/756247#M243063</guid>
      <dc:creator>tscroggins</dc:creator>
      <dc:date>2025-12-07T18:34:50Z</dc:date>
    </item>
  </channel>
</rss>

