Splunk Search

Need help with field extraction in search time

satyaallaparthi
Communicator

I have a raw Nessus file that I've split so that each host is its own event. However, I am having trouble extracting the data between <ReportItem> tags when the content spans multiple lines. (Each event contains multiple ReportItem elements under one hostname.)

 

Here is the regular expression I am using:

 

| rex field=_raw max_match=0 "\<ReportItem\s(?<pluginout>.*?)\<\/ReportItem\>"
OR
| rex field=_raw max_match=0 "\<ReportItem\s(?<pluginout>.*(\s+)?)\<\/ReportItem\>"

 

 

Unfortunately, it doesn't seem to capture anything that spans multiple lines, as shown in the example below:

 

"<ReportItem>

    ...

    (multiline content)

    ...

</ReportItem>"

 

Could you please help me adjust my regular expression to correctly capture multiline content within <ReportItem> tags?

 

Note: ReportItem elements without multiple lines are extracted fine.

 

Any help would be appreciated.

 


ITWhisperer
SplunkTrust
| rex field=_raw max_match=0 "(?s)\<ReportItem>(?<pluginout>.*?)\<\/ReportItem\>"

Having offered that, @yuanliu is correct: it is usually better to treat structured data with the correct tools, e.g. spath. However, without a complete representation of your event data and a fuller understanding of what you are actually trying to achieve, the rex above meets your minimal needs.
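To illustrate what the (?s) flag changes, here is a minimal sketch in Python rather than SPL. The event text is made up, and Python's `re` module is only a stand-in for the PCRE engine behind rex, so treat it as an illustration of the dot-matches-newline behaviour, not as a Splunk test. Note that Python spells named groups `(?P<name>...)` where rex uses `(?<name>...)`.

```python
import re

# Hypothetical event text: one single-line and one multiline ReportItem.
event = (
    "<ReportItem>one line</ReportItem>\n"
    "<ReportItem>\n  first line\n  second line\n</ReportItem>"
)

# Same shape as the rex above, without the flag.
pattern = r"<ReportItem>(?P<pluginout>.*?)</ReportItem>"

# Without (?s), "." stops at a newline, so only the single-line item matches.
without_s = re.findall(pattern, event)

# With (?s), "." also matches newlines, so both items match.
with_s = re.findall("(?s)" + pattern, event)

print(without_s)
print(with_s)
```

The lazy `.*?` is what keeps each match from running past the first closing tag once newlines are allowed.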


yuanliu
SplunkTrust

Your illustrated fragment suggests that your raw events are either XML or contain XML documents.  I strongly discourage treating structured data such as XML as plain text.  Please post a complete sample event. (Anonymize as needed.)


satyaallaparthi
Communicator

I have inserted the raw log using the XML code editor. Items without newlines are extracted fine, but the ones containing newlines or tabs are not, even though I am using (?s).


satyaallaparthi
Communicator

 

 


yuanliu
SplunkTrust

Thank you for sharing the complete event.  If this is the raw event, all you need is spath (or xmlkv, which has some interesting restrictions).  For example,

 

<your search>
| spath

 

These commands are QA tested by Splunk and are much more robust than anything you can develop yourself. (They also have the added benefit of extracting richer data.)

Here is a complete emulation.  Play with it and compare with real data.

 

| makeresults
| eval _raw = "</HostProperties><ReportItem severity=\"0\" port=\"0\" pluginFamily=\"Ubuntu Local Security Checks\" pluginName=\"Ubuntu 18.04 ESM / 20.04 LTS / 22.04 LTS : Vim vulnerabilities (USN-6420-1)\" pluginID=\"182769\" protocol=\"tcp\" <cvss_vector>AV:N/AC:L/Au:N/C:C/I:C/A:C</cvss_vector><description>The remote Ubuntu 18.04 ESM / 20.04 LTS / 22.04 LTS host has packages installed that are affected by multiple vulnerabilities as referenced in the USN-6420-1 advisory.

  - Heap-based Buffer Overflow in GitHub repository vim/vim prior to 9.0.0483. (CVE-2022-3234)

  - Use After Free in GitHub repository vim/vim prior to 9.0.0490. (CVE-2022-3235)

  - Use After Free in GitHub repository vim/vim prior to 9.0.0530. (CVE-2022-3256)

  - NULL Pointer Dereference in GitHub repository vim/vim prior to 9.0.0552. (CVE-2022-3278)

  - Use After Free in GitHub repository vim/vim prior to 9.0.0579. (CVE-2022-3297)

  - Stack-based Buffer Overflow in GitHub repository vim/vim prior to 9.0.0598. (CVE-2022-3324)

  - Use After Free in GitHub repository vim/vim prior to 9.0.0614. (CVE-2022-3352)

  - Heap-based Buffer Overflow in GitHub repository vim/vim prior to 9.0.0742. (CVE-2022-3491)

  - Heap-based Buffer Overflow in GitHub repository vim/vim prior to 9.0.0765. (CVE-2022-3520)

  - Use After Free in GitHub repository vim/vim prior to 9.0.0789. (CVE-2022-3591)

  - A vulnerability was found in vim and classified as problematic. Affected by this issue is the function     qf_update_buffer of the file quickfix.c of the component autocmd Handler. The manipulation leads to use     after free. The attack may be launched remotely. Upgrading to version 9.0.0805 is able to address this     issue. The name of the patch is. It is recommended to upgrade the affected component. The identifier of this vulnerability is VDB-212324. (CVE-2022-3705)

  - Use After Free in GitHub repository vim/vim prior to 9.0.0882. (CVE-2022-4292)

  - Floating Point Comparison with Incorrect Operator in GitHub repository vim/vim prior to 9.0.0804.
    (CVE-2022-4293)

Note that Nessus has not tested for these issues but has instead relied only on the application's self-reported version number.</description><synopsis>The remote Ubuntu host is missing one or more security updates.<plugin_output>
  - Installed package : vim_2:8.1.2269-1ubuntu5.17
  - Fixed package     : vim_2:8.1.2269-1ubuntu5.18

  - Installed package : vim-common_2:8.1.2269-1ubuntu5.17
  - Fixed package     : vim-common_2:8.1.2269-1ubuntu5.18

  - Installed package : vim-runtime_2:8.1.2269-1ubuntu5.17
  - Fixed package     : vim-runtime_2:8.1.2269-1ubuntu5.18

  - Installed package : vim-tiny_2:8.1.2269-1ubuntu5.17
  - Fixed package     : vim-tiny_2:8.1.2269-1ubuntu5.18

  - Installed package : xxd_2:8.1.2269-1ubuntu5.17
  - Fixed package     : xxd_2:8.1.2269-1ubuntu5.18

</plugin_output></ReportItem><ReportItem severity=\"0\" port=\"0\" pluginFamily=\"Ubuntu Local Security Checks\" pluginName=\"Ubuntu 16.04 ESM / 18.04 ESM / 20.04 LTS / 22.04 LTS / 23.04 : LibTIFF vulnerability (USN-6428-1)\" pluginID=\"182891\" protocol=\"tcp\" <description>The remote Ubuntu 16.04 ESM / 18.04 ESM / 20.04 LTS / 22.04 LTS / 23.04 host has packages installed that are affected by a vulnerability as referenced in the USN-6428-1 advisory.

  - A flaw was found in tiffcrop, a program distributed by the libtiff package. A specially crafted tiff file     can lead to an out-of-bounds read in the extractImageSection function in tools/tiffcrop.c, resulting in a     denial of service and limited information disclosure. This issue affects libtiff versions 4.x.
    (CVE-2023-1916)

Note that Nessus has not tested for this issue but has instead relied only on the application's self-reported version number.</description><synopsis>The remote Ubuntu host is missing a security update.</synopsis><cve>CVE-2023-1916</cve><xref>USN:6428-1</xref><see_also>https://ubuntu.com/security/notices/USN-6428-1</see_also><risk_factor>Medium</risk_factor><script_version>1.0</script_version><plugin_output>
  - Installed package : libtiff5_4.1.0+git191117-2ubuntu0.20.04.9
  - Fixed package     : libtiff5_4.1.0+git191117-2ubuntu0.20.04.10

</plugin_output></ReportItem><ReportItem severity=\"3\" port=\"0\" pluginFamily=\"Ubuntu Local Security Checks\" pluginName=\"Ubuntu 16.04 LTS / 18.04 LTS / 20.04 LTS / 22.04 LTS / 23.10 : GIFLIB vulnerabilities (USN-6824-1)\" pluginID=\"200257\" protocol=\"tcp\"<description>The remote Ubuntu 16.04 LTS / 18.04 LTS / 20.04 LTS / 22.04 LTS / 23.10 host has packages installed that are affected by multiple vulnerabilities as referenced in the USN-6824-1 advisory.</plugin_output></ReportItem>"
``` data emulation above ```
| spath
| fields plugin_output

 

This is the output (for brevity, I discarded all other nodes in XML):

plugin_output: - Installed package : libtiff5_4.1.0+git191117-2ubuntu0.20.04.9 - Fixed package : libtiff5_4.1.0+git191117-2ubuntu0.20.04.10
_raw: the full emulated event shown above (displayed with whitespace collapsed)
_time: 2024-07-16 14:52:11

satyaallaparthi
Communicator

Actually, I forgot to mention this in the main post.

I tried spath, but it does not extract as expected (it extracts other values for one field).


yuanliu
SplunkTrust

The fragment you illustrated is NOT a complete XML document.  Please post the full event.  My suspicion is that your raw event contains an XML document, but also contains something that is not XML.  You will need to first extract the XML into a field, then apply spath.
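The two-step idea (isolate the XML, then hand it to a real parser) can be sketched outside Splunk as follows. This is Python with `xml.etree.ElementTree` standing in for spath. The event, the `host=web01 scan=weekly` prefix, and the package values are all invented, and the synthetic `<root>` wrapper is an assumption needed only because several sibling elements do not form one document on their own.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical raw event: a non-XML prefix followed by well-formed ReportItem elements.
raw = (
    "host=web01 scan=weekly "
    '<ReportItem pluginID="1" severity="0"><plugin_output>\n  pkg-a 1.0\n</plugin_output></ReportItem>'
    '<ReportItem pluginID="2" severity="3"><plugin_output>\n  pkg-b 2.0\n</plugin_output></ReportItem>'
)

# Step 1: isolate the XML portion, from the first opening tag to the last closing tag.
match = re.search(r"(?s)<ReportItem\b.*</ReportItem>", raw)
xml_part = match.group(0)

# Step 2: wrap the siblings in a synthetic root and let a parser do the extraction.
root = ET.fromstring("<root>" + xml_part + "</root>")

# Map each pluginID to its trimmed plugin_output text.
items = {
    item.get("pluginID"): item.findtext("plugin_output").strip()
    for item in root.iter("ReportItem")
}
print(items)
```

A parser only helps when the isolated fragment is well-formed; an unclosed tag inside the event would make `ET.fromstring` raise instead of silently returning partial data.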


jotne
Builder

By default, .* does not match newlines, so here is a trick I found: replace .* with [\s\S]*

example:

\<ReportItem\s(?<pluginout>[\s\S]*?)\<\/ReportItem\>
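A small sketch of this trick, again in Python as a stand-in for rex and with an invented event. It also shows one thing that might explain why a flag alone can appear not to help: a pattern with a literal `<ReportItem>` cannot match an opening tag that carries attributes. Whether that applies to the real events is a guess, since the full raw data is not visible here.

```python
import re

# Hypothetical event: the opening tag carries attributes and the body spans lines.
event = '<ReportItem severity="3" port="0">\n  line one\n  line two\n</ReportItem>'

# [\s\S] matches any character, newlines included, with no flag needed.
class_trick = re.findall(r"<ReportItem\s(?P<pluginout>[\s\S]*?)</ReportItem>", event)

# The same pattern with a plain dot cannot cross the first newline.
plain_dot = re.findall(r"<ReportItem\s(?P<pluginout>.*?)</ReportItem>", event)

# A literal "<ReportItem>" never matches a tag that has attributes, even with (?s).
literal_tag = re.findall(r"(?s)<ReportItem>(?P<pluginout>.*?)</ReportItem>", event)

print(len(class_trick), len(plain_dot), len(literal_tag))
```

With this pattern the captured text starts inside the opening tag, so the attributes are part of `pluginout` along with the body.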

 
