How to take the output from one search and use dat...

qcjacobo2577 · ‎04-04-2022

I have what is hopefully a really straightforward issue. Essentially I want to take the output (data within a specific field from sourcetypeA) from one search and use that data to search again within the same index but a different sourcetype (sourcetypeB).

Initial Search

"index"="data_index" "sourcetype"="sourcetypeA" "field1"="static_value"
| table "field2"
| dedup "field2"

The above search result is a single field that contains a single value per row, but ultimately more than 1 row and different values each row...something like below.....

field2
AAAAAAAAA
BBBBBBBBB
CCCCCCCCC

I then need to take each of the rows above and plug that into another search:

"index"="data_index" "sourcetype"="sourcetypeB"
| table "field3", | table "field4"
| dedup "field3"

PickleRick · ‎04-04-2022

A rule of thumb with subsearches is "don't use them!" (unless you have really no other way).

The straightforward answer would be to do

index=index sourcetype=sourcetypeB
     [ index=index sourcetype=sourcetypeA field2=whatever
       | stats values(field2) as field3 ]

But while this is a logical solution it is most often a wrong one in terms of performance and possible wrong results.

qcjacobo2577 · ‎04-04-2022

If sub-search is not the best way to approach this, what is? I definitely am looking for the best approach and outcome(s) here.

PickleRick · ‎04-04-2022

Depends on the actual use case (sometimes it is indeed the only way). Usually it's a matter of clever field selection and statsing.

qcjacobo2577 · ‎04-05-2022

So far so good....so thank you.

One follow-up....

Lets assume that:

The current sub-search contains computer asset information.
The computer asset information has a field (currently un-related to the sub-search so far in this discussion) named "tag" that may in some cases contain a multi-value.
I need to:
- Split that multi-value into individual values (which I am assuming I can use the "mvexpand" command for).

and

- Subsequently filter some of those now non-multi-value values out (for example, if I want to see assets and more importantly vulnerabilities that are related to servers and not desktops).

I have tried several attempts to get the result (am sure I am doing something wrong here). I hope this makes sense.

PickleRick · ‎04-06-2022

It's always easiest to think if you provide some (anonymized if need be) samples of "input" events and expected output. As they say - a picture is worth a thousand words 😉

qcjacobo2577 · ‎05-25-2022

@PickleRick Just following up to see if you have any thoughts as to how to build out this search. Thank you.

PickleRick · ‎05-27-2022

Must have missed your event 🙂

Let's see...

The way to pick up the relevant events is to simply concatenate the conditions. So you do something like that:

("index"="tenable" "sourcetype"="tenable:io:vuln" "severity"!="informational" "state"!="fixed" earliest=-14d) OR (index=tenable sourcetype=tenable:io:assets earliest=-30d)

One caveat - separate timeranges in subconditions seem to work in my 8.2.5 installation. I'm not sure if it's OK with earlier series.

Then you have to adjust the field in one of the sourcetypes (in your case it would be the assets sourcetype). For example, something like this:

| eval asset_uuid=coalesce(asset_uuid,uuid)

Now to simply group it all you could do

| stats values(*) as * by asset_uuid

That's the easiest way to do it but it might mess with your relationships between fields if you have many events for each asset_uuid so you mith want to - for example, simply do

| stats values(_raw) as _raw by asset_uuid

in order to just match raw event values and then parse the raw values again by

| spath

You could differentiate your events of both sourcetypes before statsing so you'd get two different raw event sets for each asset_uuid or two different sets of fields. There are many possibilities.

But the general technique is as shown before - find a common field, normalize it (rename other field to it if needed) and stats over this field.

qcjacobo2577 · ‎05-27-2022

@PickleRick This seems to work --- any reason why I would not want to use this approach?

index=tenable sourcetype=tenable:io:vuln severity!=informational state!=fixed
[
search index=tenable sourcetype=tenable:io:assets deleted_at=null
| rename uuid AS asset_uuid
| stats count by asset_uuid
| fields asset_uuid
| format
]
| fields "plugin.cvss_base_score" "plugin.synopsis" "plugin.name" "plugin.cve{}" "plugin.id" "output" "severity" "plugin.see_also{}" "plugin.solution" "state" "asset_uuid" "plugin.has_patch" "plugin.exploit_available" "plugin.exploited_by_malware" "plugin.publication_date"
| table "plugin.cvss_base_score" "plugin.synopsis" "plugin.name" "plugin.cve{}" "plugin.id" "output" "severity" "plugin.see_also{}" "plugin.solution" "state" "asset_uuid" "plugin.has_patch" "plugin.exploit_available" "plugin.exploited_by_malware" "plugin.publication_date"

PickleRick · ‎05-27-2022

The subsearches have its limitations. If you exceed a predefined number of returned events (by default it's 50k I think) or the subsearch executes for loner than its time limit (60s? I'm not sure) the subsearch will get silently terminated and you'll get only partial results and not know about this. That's the main problem with subsearches. For now it might be working pretty OK but it doesn't have to always. That's why it's advisable to avoid subsearches if possible unless you're pretty sure they are quick.

qcjacobo2577 · ‎05-27-2022

@PickleRick

First of all thank you for your time. It is appreciated.

Looked over your suggestions.

What I am not understanding is --- if the goal here is to take each unique "uuid" value (the number of unique asset_uuid values varies anywhere from 1 to x depending on the search time/scope) from the list of assets and return only vulnerabilities that contain that uuid value (so essentially running this search as a "foreach" loop of sorts), how am I getting there?

This initial search returns all vulnerabilities and all assets within the defined period of time.

("index"="tenable" "sourcetype"="tenable:io:vuln" "severity"!="informational" "state"!="fixed" earliest=-14d) OR (index=tenable sourcetype=tenable:io:assets earliest=-30d)

This will effectively return (among other data) all uuid and asset_uuid values (which we later create a common field name for), but does not consider that I only want to return vulnerabilities where the "asset_uuid" value exists in the "uuid" value.

Maybe I am missing something.

PickleRick · ‎05-27-2022

Sure. That's the approach to select and group the data. If you want to only get those values that have their counterpart, you have to add additional condition like

| where (some_condition_fulfillable_only_by_events_selecting_uuid)

Unfortunately, that might mean that the overall search as a whole will indeed be big and relatively slow.

Therefore you could consider using some form of acceleration for your subsearch from the approach I showed as the initial solution. You could - for example - schedule your subsearch contents and generate a lookup from its results. This way when running the outer search you'd only use a inputlookup as a source for the data in your subsearch which is very fast. I think that's the easiest way to do it.

qcjacobo2577 · ‎04-06-2022

=====Asset Data Example=====

{"has_agent": true, "has_plugin_results": true, "created_at": "2021-10-28T14:59:21.964Z", "terminated_at": null, "terminated_by": null, "updated_at": "2022-04-06T00:43:11.746Z", "deleted_at": null, "deleted_by": null, "first_seen": "2021-10-28T14:59:21.583Z", "last_seen": "2022-04-06T00:43:11.416Z", "first_scan_time": "2021-10-28T14:59:21.583Z", "last_scan_time": "2022-04-06T00:43:11.416Z", "last_authenticated_scan_date": "2022-04-06T00:40:40.971Z", "last_licensed_scan_date": "2022-04-06T00:40:40.971Z", "last_scan_id": "6cb1cf79-fabb-4b4b-b81f-d96894adbe9e", "last_schedule_id": "template-306ed309-a776-3ef5-87d2-8eb812a47148f389d609ffeb46dc", "azure_vm_id": null, "azure_resource_id": null, "gcp_project_id": null, "gcp_zone": null, "gcp_instance_id": null, "aws_ec2_instance_ami_id": null, "aws_ec2_instance_id": null, "agent_uuid": "3ed6e2d25f4f4501a403a66bd020ae08", "bios_uuid": "3e34f54c-2220-11b2-a85c-92d15106ac6c", "network_id": "00000000-0000-0000-0000-000000000000", "network_name": "Default", "aws_owner_id": null, "aws_availability_zone": null, "aws_region": null, "aws_vpc_id": null, "aws_ec2_instance_group_name": null, "aws_ec2_instance_state_name": null, "aws_ec2_instance_type": null, "aws_subnet_id": null, "aws_ec2_product_code": null, "aws_ec2_name": null, "mcafee_epo_guid": null, "mcafee_epo_agent_guid": null, "servicenow_sysid": null, "bigfix_asset_id": null, "agent_names": ["laptop"], "installed_software": ["cpe:/a:microsoft:edge:99.0.1150.30", "cpe:/a:google:chrome:100.0.4896.75"], "ipv4s": ["10.10.10.10"], "ipv6s": ["fe80:0:0:0:fdb9:6512:39dd:89d4"], "fqdns": ["laptop.acme.corp"], "mac_addresses": ["02:50:41:00:00:01"], "netbios_names": ["laptop"], "operating_systems": ["Microsoft Windows"], "system_types": ["general-purpose"], "hostnames": ["laptop"], "ssh_fingerprints": [], "qualys_asset_ids": [], "qualys_host_ids": [], "manufacturer_tpm_ids": [], "symantec_ep_hardware_keys": [], "sources": [{"name": "NESSUS_AGENT", "first_seen": "2021-10-28T14:59:21.583Z", "last_seen": "2022-04-06T00:43:11.416Z"}], "tags": [{"uuid": "ebf9b183-2cb4-48a7-ad6d-fad4ac224f99", "key": "WINDOWS", "value": "WINDOWS_LAPTOPS", "added_by": "c637bc33-8454-4b4e-9e32-6cda9841933e", "added_at": "2021-11-02T17:50:24.722Z"}, {"uuid": "cdad26ea-159a-40ff-bf14-0f6d1c6fa1e5", "key": "DBLN_AGENT", "value": "MOBILE", "added_by": null, "added_at": "2022-03-29T17:00:14.712Z"}], "network_interfaces": [{"name": "Ethernet 5", "virtual": null, "aliased": null, "fqdns": ["laptop.acme.corp"], "mac_addresses": ["02:50:41:00:00:01"], "ipv4s": ["10.10.10.10"], "ipv6s": []}, {"name": "Wi-Fi", "virtual": null, "aliased": null, "fqdns": ["l-laptop.acme.corp"], "mac_addresses": ["cc:d9:ac:cd:6d:4c"], "ipv4s": ["10.10.10.10"], "ipv6s": []}, {"name": "UNKNOWN", "virtual": null, "aliased": null, "fqdns": ["laptop.acme.corp"], "mac_addresses": ["5c:ff:35:c3:70:8b", "02:50:41:00:00:01", "cc:d9:ac:cd:6d:4c", "00:2b:67:db:b0:c4", "cc:d9:ac:cd:6d:50", "ce:d9:ac:cd:6d:4c", "cc:d9:ac:cd:6d:4d"], "ipv4s": ["10.10.10.10"], "ipv6s": ["fe80:0:0:0:fdb9:6512:39dd:89d4"]}], "IO_address": "cloud.tenable.com", "state": "Active", "uuid": "028c406e-1c0c-466b-a703-0a5612bb8b07"}

=====Vulnerability Data Example=====

{"output": "output here", "plugin": {"checks_for_default_account": false, "checks_for_malware": false, "cpe": ["cpe:/a:microsoft:teams"], "cvss3_base_score": 0.0, "cvss3_temporal_score": 0.0, "cvss_base_score": 0.0, "cvss_temporal_score": 0.0, "description": "Description here.", "exploit_available": false, "exploit_framework_canvas": false, "exploit_framework_core": false, "exploit_framework_d2_elliot": false, "exploit_framework_exploithub": false, "exploit_framework_metasploit": false, "exploited_by_malware": false, "exploited_by_nessus": false, "family": "Windows", "family_id": 20, "has_patch": false, "id": 144792, "in_the_news": false, "name": "Name here", "modification_date": "2022-03-30T00:00:00Z", "publication_date": "2021-01-07T00:00:00Z", "risk_factor": "None", "see_also": ["http://www.nessus.org", "https://docs.microsoft.com"], "solution": "N/A", "synopsis": "Synopsis here.", "type": "local", "unsupported_by_vendor": false, "version": "1.42"}, "port": {"port": 0, "protocol": "TCP"}, "scan": {"completed_at": "2022-04-06T00:36:51.238Z", "schedule_uuid": "template-306ed309-a776-3ef5-87d2-8eb812a47148f389d609ffeb46dc", "started_at": "2022-04-06T00:00:38.528Z", "uuid": "6cb1cf79-fabb-4b4b-b81f-d96894adbe9e"}, "severity": "informational", "severity_id": 0, "severity_default_id": 0, "severity_modification_type": "NONE", "first_found": "2021-08-11T01:07:12.920Z", "last_found": "2022-04-06T00:36:51.238Z", "state": "open", "indexed": "2022-04-06T00:37:26.908Z", "IO_address": "cloud.tenable.com", "asset_uuid": "028c406e-1c0c-466b-a703-0a5612bb8b07", "asset_fqdn": "laptop.acme.corp", "agent_uuid": "971d45e109d84c33b04be05302e07419", "ipv4": "10.10.10.10", "ipv6": null, "vendor_severity": "info"}

====Current Splunk Search=====

"index"="tenable" "sourcetype"="tenable:io:vuln" "severity"!="informational" "state"!="fixed" earliest=-14d
[ search index=tenable sourcetype=tenable:io:assets earliest=-30d
| stats values(uuid) as asset_uuid ]
| eval output = replace (output, "[\n\r]"," ")
| rename "plugin.cvss_base_score" AS "CVSS" "plugin.synopsis" AS "Description" "plugin.name" AS "Name" "plugin.cve{}" AS "CVE" "plugin.id" AS "Plugin_ID" "output" AS "Output" "severity" AS "Risk" plugin.see_also{} AS "See_Also" "plugin.solution" AS "Solution" "state" AS "State" "asset_uuid" AS "Asset_ID" "plugin.has_patch" AS "Patchable" plugin.exploit_available AS "Exploitable" plugin.exploited_by_malware AS "Exploited_By_Malware" plugin.publication_date AS "Plugin_Publish_Date"
| strcat Asset_ID ":" Plugin_ID Custom_ID
| table "asset_fqdn", "CVSS", "Description", "Name", "CVE", "Plugin_ID", "Output", "Risk", "See_Also", "Solution" , "State", "Asset_ID", "Custom_ID", "Patchable", "Exploitable", "Exploited_By_Malware"
| dedup "Custom_ID"

===Comments====

This field is the one I want to additionally filter on (in the above example): "value": "WINDOWS_LAPTOPS"
This field could have multiple values (in the example it does not).
The value is not limited by name in the example --- could be "WINDOWS_DESKTOPS" for example.

How to take the output from one search and use data to search again within the same index but a different sourcetype?

subsearch

Index This | What is broken 80% of the time by February?

Unlock Faster Time-to-Value on Edge and Ingest Processor with New SPL2 Pipeline ...

Splunk MCP & Agentic AI: Machine Data Without Limits

Join the Conversation

How to take the output from one search and use data to search again within the same index but a different sourcetype?

subsearch

Index This | What is broken 80% of the time by February?

Unlock Faster Time-to-Value on Edge and Ingest Processor with New SPL2 Pipeline ...

Splunk MCP & Agentic AI: Machine Data Without Limits