Hi,
could you please help me in replacing the join in below query?
index=168347-np [ | `last_np_sourcetype("index=168347-np","hardware")`] (physicalType=*) | fields physicalElementId deviceId
| join deviceId [ search index=168347-np [| `last_np_sourcetype( "index=168347-np", "group_members")` ] groupId=290681 | fields deviceId ]
| stats dc(physicalElementId) as Devices
Thanks
The general idea behind removing a join from a search is to combine the two search fragments to a single search separated by OR, e.g.
(index=168347-np [ | `last_np_sourcetype("index=168347-np","hardware")`] (physicalType=*)) OR
(index=168347-np [| `last_np_sourcetype( "index=168347-np", "group_members")` ] groupId=290681 )
...
then do stuff with the two data sets using eval/stats/...
however, in your example, if you are just using the join to filter on groupId as @nickhills suggests, there are other options, but to achieve a similar outcome to the filtering join you could do
(index=168347-np [ | `last_np_sourcetype("index=168347-np","hardware")`] (physicalType=*)) OR
(index=168347-np [| `last_np_sourcetype( "index=168347-np", "group_members")` ] groupId=290681 )
| stats values(groupId) as groupIds values(physicalElementId) as physicalElementIds by deviceId
| where mvcount(groupIds)=1
| stats dc(physicalElementIds) as Devices
which would search both data sets, then aggregate the two data sets retaining the groupIds by deviceId and the subsequent where clause performs the filter, as all deviceIds that have no corresponding groupId will be removed (effective inner join).
Then your final stats will just get the unique physicalElementId count.
Remember in Splunk that subsearches often come with limitations and performance implications and it is often useful to consider how a subsearch can be replaced with an alternative construct, like others have suggested here. It will generally come down to your data.
Hope this helps
The general idea behind removing a join from a search is to combine the two search fragments to a single search separated by OR, e.g.
(index=168347-np [ | `last_np_sourcetype("index=168347-np","hardware")`] (physicalType=*)) OR
(index=168347-np [| `last_np_sourcetype( "index=168347-np", "group_members")` ] groupId=290681 )
...
then do stuff with the two data sets using eval/stats/...
however, in your example, if you are just using the join to filter on groupId as @nickhills suggests, there are other options, but to achieve a similar outcome to the filtering join you could do
(index=168347-np [ | `last_np_sourcetype("index=168347-np","hardware")`] (physicalType=*)) OR
(index=168347-np [| `last_np_sourcetype( "index=168347-np", "group_members")` ] groupId=290681 )
| stats values(groupId) as groupIds values(physicalElementId) as physicalElementIds by deviceId
| where mvcount(groupIds)=1
| stats dc(physicalElementIds) as Devices
which would search both data sets, then aggregate the two data sets retaining the groupIds by deviceId and the subsequent where clause performs the filter, as all deviceIds that have no corresponding groupId will be removed (effective inner join).
Then your final stats will just get the unique physicalElementId count.
Remember in Splunk that subsearches often come with limitations and performance implications and it is often useful to consider how a subsearch can be replaced with an alternative construct, like others have suggested here. It will generally come down to your data.
Hope this helps
What's the point of this query? It's joining a single field to events that already have that field so what value is added?
I think ( @priyastalin correct me if wrong) it looks like this is just so you can filter on the groupId?
I'm not sure if seeing the contents of the macro would help to understand if its doing a simple filter or if there is something more complex going on to get the set of results.
If this is something you run frequently, I think a lookup might be a far simpler (and faster) approach.
Content of first query
"index=168347-np [ | `last_np_sourcetype("index=168347-np","hardware")`] (physicalType=*) | fields physicalElementId deviceId "
This query has 39718 events similarly like below
"chassisName: 10.241.241.22
cpyKey: 168347
deviceId: 19206245
deviceName: 10.241.241.22
hwRev: null
installedFlash: null
installedMemory: null
pcb: null
pcbRev: null
physicalElementId: 388725541
physicalSubType: null
physicalType: Power Supply
productFamily: Catalyst 2K/3K Series Power Supplies
productId: PWR-C1-1100WAC
productType: Power Supplies
serialNumber: DTN2246V1S4
serialNumberStatus: UNKNOWN
slot: PWRSPPLY2
swVersion: null
tan: null
tanRev: null
timeStamp: 2020-12-14T13:35:46"
Content of second query:
"index=168347-np [| `last_np_sourcetype( "index=168347-np", "group_members")` ] groupId=290681 | fields deviceId"
This query has 25 events similarly like below
It might help if you can also post the contents of the macro last_np_sourcetype()