Splunk Search

How to execute these condition based Joins?

mnj1809
Path Finder

Hello,

I need your help to find a way to achieve the following use case:

in main search:
I've to categories: Windows and NIX. both the categories have ip and hostname fields.

category             ip                         hostname
windows         x.x.x.x                    a
nix                     y.y.y.y                      b

Now my requirement is to join the above result set with another result set based on the following business rules:

- for windows, I want to join based on the hostname only.
- for nix, I want to join  based on both ip and hostname.

Thanks in advance for the help.

Labels (1)
0 Karma

ITWhisperer
SplunkTrust
SplunkTrust

Firstly, you should try to avoid using join in splunk - splunk is not an SQL database.

Secondly, it depends what your other data source contains and where it is coming from, e.g. another index or lookup file.

Please can you expand on your usecase?

0 Karma

mnj1809
Path Finder

Thanks a lot for your prompt response. Yes we should avoid joins to improve the performance. but sometimes we've to use joins. 🙂
The main search has the different data source and the sub search has the different one. 

my requirement is to compare two inventories let's say CMDB and Qualys inventories based on IP and hostnames in the CMDB inventory.

index=cmdb
| stats values(*) as * by category, hostname, ip
| join type=left hostname
[search  index=qualys
| stats values(*) as * by hostname]
| join type=left ip                    
[search  index=qualys
| stats values(*) as * by ip]

So my requirement is to join the cmdb hostnames with qualys hostnames only for "windows category" (I don't want to join the cmdb ip with qualys ip). 

But for nix category I want to join the cmdb hostnames and ip with both qualys hostnames and ip.
So as this way I find the assets which are in cmdb and not in qualys.

Hope this will help you to understand my requirement.
@ITWhisperer @gcusello 

0 Karma

ITWhisperer
SplunkTrust
SplunkTrust

From the way you are trying to solve the problem, would it be fair to say

  • for windows, the ip address in cmdb may not match ip address in qualys for the same host name
  • for nix, there may be multiple ip addresses for the same host name in both cmdb and qualys
  • qualys doesn't record what the category is
0 Karma

ITWhisperer
SplunkTrust
SplunkTrust

Assuming that points above are true, you might need to change this around if you want to keep the cmdb ip address rather than the qualys ip address

``` Copy cmdb field(s) and category to qualys events for nix (host and ip matches) ```
| eventstats values(cmdb_field) as cmdb_field values(category) as category by hostname ip
``` Uncategorised qualys events must be windows ```
| fillnull value="windows" category
``` Create matching key dependent on category ```
| eval hostip=if(category="windows",hostname."|".category,hostname."|".ip)
``` Copy cmdb field(s) to qualys events ```
| eventstats values(cmdb_field) as cmdb_field by hostip
``` Only need to keep qualys events ```
| where index="qualys"

 

0 Karma

mnj1809
Path Finder

First of all thanks to @ITWhisperer for the prompt responses but unfortunately the solution that you shared will not work in my use case. I find the solution from my own. What I did is just to take the ip in cmdb as null for windows because we can've multiple ips for windows workstations for the same host so that's why I don't want to use cmdb ip address to join with qualys ip address for windows, only want to use cmdb hostname to join with qualys hostname. But for nix category I want to use both cmdb ip address and hostname to join with qualys ip and hostname.

```cmdb ip and hostnames```
| eval ip=if(category="windows",null(),ip)
| join max=0 type=LEFT host_name
[```Qualys hostnames```]
| join max=0 type=left ip
[```Qualys ip```]

@ITWhisperer 

0 Karma

gcusello
SplunkTrust
SplunkTrust

Hi @mnj1809,

I suppose that you want to join the results in one search, but what are the results after joining?

anuway. if in another index you have ip, hostname and other information (e.g. field1 and field2, you could run something like this:

<your_main_search>
| eval key=if(category="windows",hostname,ip."|".hostname
| append è[ search 
   <your_secondary_search> 
   | eval key=if(category="windows",hostname,ip."|".hostname
   | field key field1 field2 ]
| stats 
   values(hostname) AS hostname
   values(ip) AS ip
   values(field1) AS field1
   values(field2) AS field2
   BY key
| fields - key

Ciao.

Giuseppe

0 Karma

mnj1809
Path Finder

Hi Giuseppe,

Thanks for your quick response but the solution that you provide will not work in my case.

@gcusello 

0 Karma

gcusello
SplunkTrust
SplunkTrust

Hi @mnj1809,

You could also use the way I described to choose the correlation key between the archives, but, as @ITWhisperer said, you should try to avoid using join in splunk - splunk is not an SQL database because it's very slow and requires many resources.

So I hint to adapt my approach to your need, instead to use join.

Ciao.

Giuseppe

0 Karma
Get Updates on the Splunk Community!

Federated Search for Amazon S3 | Key Use Cases to Streamline Compliance Workflows

Modern business operations are supported by data compliance. As regulations evolve, organizations must ...

New Dates, New City: Save the Date for .conf25!

Wake up, babe! New .conf25 dates AND location just dropped!! That's right, this year, .conf25 is taking place ...

Introduction to Splunk Observability Cloud - Building a Resilient Hybrid Cloud

Introduction to Splunk Observability Cloud - Building a Resilient Hybrid Cloud  In today’s fast-paced digital ...