Splunk Search

How to replace join in below query?

priyastalin
Explorer

@bowesmana 

Hi,

could you please help me in replacing the join in below query?

index=168347-np [ | `last_np_sourcetype("index=168347-np","hardware")`] (physicalType=*) | fields physicalElementId deviceId
| join deviceId [ search index=168347-np [| `last_np_sourcetype( "index=168347-np", "group_members")` ] groupId=290681 | fields deviceId ]
| stats dc(physicalElementId) as Devices

 

Thanks

Labels (1)
Tags (1)
0 Karma
1 Solution

bowesmana
Champion

@priyastalin 

The general idea behind removing a join from a search is to combine the two search fragments to a single search separated by OR, e.g. 

(index=168347-np [ | `last_np_sourcetype("index=168347-np","hardware")`] (physicalType=*)) OR
(index=168347-np [| `last_np_sourcetype( "index=168347-np", "group_members")` ] groupId=290681 )
...
then do stuff with the two data sets using eval/stats/...

however, in your example, if you are just using the join to filter on groupId as @nickhills suggests, there are other options, but to achieve a similar outcome to the filtering join you could do

(index=168347-np [ | `last_np_sourcetype("index=168347-np","hardware")`] (physicalType=*)) OR
(index=168347-np [| `last_np_sourcetype( "index=168347-np", "group_members")` ] groupId=290681 )
| stats values(groupId) as groupIds values(physicalElementId) as physicalElementIds by deviceId
| where mvcount(groupIds)=1
| stats dc(physicalElementIds) as Devices

which would search both data sets, then aggregate the two data sets retaining the groupIds by deviceId and the subsequent where clause performs the filter, as all deviceIds that have no corresponding groupId will be removed (effective inner join).

Then your final stats will just get the unique physicalElementId count.

Remember in Splunk that subsearches often come with limitations and performance implications and it is often useful to consider how a subsearch can be replaced with an alternative construct, like others have suggested here. It will generally come down to your data.

Hope this helps

 

 

View solution in original post

bowesmana
Champion

@priyastalin 

The general idea behind removing a join from a search is to combine the two search fragments to a single search separated by OR, e.g. 

(index=168347-np [ | `last_np_sourcetype("index=168347-np","hardware")`] (physicalType=*)) OR
(index=168347-np [| `last_np_sourcetype( "index=168347-np", "group_members")` ] groupId=290681 )
...
then do stuff with the two data sets using eval/stats/...

however, in your example, if you are just using the join to filter on groupId as @nickhills suggests, there are other options, but to achieve a similar outcome to the filtering join you could do

(index=168347-np [ | `last_np_sourcetype("index=168347-np","hardware")`] (physicalType=*)) OR
(index=168347-np [| `last_np_sourcetype( "index=168347-np", "group_members")` ] groupId=290681 )
| stats values(groupId) as groupIds values(physicalElementId) as physicalElementIds by deviceId
| where mvcount(groupIds)=1
| stats dc(physicalElementIds) as Devices

which would search both data sets, then aggregate the two data sets retaining the groupIds by deviceId and the subsequent where clause performs the filter, as all deviceIds that have no corresponding groupId will be removed (effective inner join).

Then your final stats will just get the unique physicalElementId count.

Remember in Splunk that subsearches often come with limitations and performance implications and it is often useful to consider how a subsearch can be replaced with an alternative construct, like others have suggested here. It will generally come down to your data.

Hope this helps

 

 

View solution in original post

richgalloway
SplunkTrust
SplunkTrust

What's the point of this query?  It's joining a single field to events that already have that field so what value is added?

---
If this reply helps you, an upvote would be appreciated.
0 Karma

nickhills
Ultra Champion

I think ( @priyastalin correct me if wrong) it looks like this is just so you can filter on the groupId?

I'm not sure if seeing the contents of the macro would help to understand if its doing a simple filter or if there is something more complex going on to get the set of results.

If this is something you run frequently, I think a lookup might be a far simpler (and faster) approach.

If my comment helps, please give it a thumbs up!

priyastalin
Explorer

Content of first query

"index=168347-np [ | `last_np_sourcetype("index=168347-np","hardware")`] (physicalType=*) | fields physicalElementId deviceId "

This query has 39718 events  similarly like below

"chassisName: 10.241.241.22
   cpyKey: 168347
   deviceId: 19206245
   deviceName: 10.241.241.22
   hwRev: null
   installedFlash: null
   installedMemory: null
   pcb: null
   pcbRev: null
   physicalElementId: 388725541
   physicalSubType: null
   physicalType: Power Supply
   productFamily: Catalyst 2K/3K Series Power Supplies
   productId: PWR-C1-1100WAC
   productType: Power Supplies
   serialNumber: DTN2246V1S4
   serialNumberStatus: UNKNOWN
   slot: PWRSPPLY2
   swVersion: null
   tan: null
   tanRev: null
   timeStamp: 2020-12-14T13:35:46"

Content of second query:

"index=168347-np [| `last_np_sourcetype( "index=168347-np", "group_members")` ] groupId=290681 | fields deviceId"

This query has 25 events similarly like below

{ [-]
   deviceId: 16681845
   groupId: 290681
   groupName: ASR 1K Routers
   timeStamp: 2020-12-14T13:35:27

}
 
 
Final output is:
210
0 Karma

nickhills
Ultra Champion

It might help if you can also post the contents of the macro last_np_sourcetype()

If my comment helps, please give it a thumbs up!
0 Karma
Take the 2021 Splunk Career Survey

Help us learn about how Splunk has
impacted your career by taking the 2021 Splunk Career Survey.

Earn $50 in Amazon cash!