Splunk Search

How to join two indexes by non unique fields?

kb_vells
Path Finder

Index1 with fields (name, "team id", surName)
Index2 with fields (userId, correlationId, operation)

Questions1: I want to join two indexes which are having completely different sets of information. There is a joining field but the field names are not unique but the values are same

both name and userId's holding the same set of values. e.g name=John, userId=John
How do I combine these two indexes by name and userId fields to get results from both indexes

Question2:
we have created two form fields in splunk dashboard. one is "userId" and second one is "team id". "team id" is available only in Index1 and userId is available in Index2. As I mentioned in the question1, the only joining condition is using name and userId.
Please suggest best search query to combine the indexes filtered by "team id" and userId

Answering to Somesoni2, Ayn

Thanks a lot for your quick responses. Please find the sample logs from the index1 and index2. There are lot more fields in addition to the one I mentioned but I am not in a position to reveal them as they are sensitive.

Index1
2014-08-10 21:34:12,558 INFO TeamReportImpl - {name=John, "team id"=Team 1, surname=Wright}

Index 2
2014-08-10 22:24:11,668 INFO OperationReportImpl - {userId=John operation=Create,
correlationId=021C0E78-65D2-AF4F38A93D7E}

The requirement is, we have dashboard with three fields,
1. Date range
2. officer name
3. team drop down
I have to create several panel to display Total counts. e.g
4. Total count of Create operation
5. Total count of correlation id by team (even though the team is not provided in index2)
thanks again for prompt response

Tags (3)
1 Solution

somesoni2
Revered Legend

You can create field aliases for the common valued field (name and userId) as User and then you can join based on the field alias. (More here)

Ques 1 : Once the field alias is configured, you can use data from both the indexes like below

index=Index1 | table User "team id", surName | join User [search index=Index2 | table User, correlationId, operation] | table User, "temp id", surName, correlationId, operation

Ques 2: You can apply your teamId and UserId filters on the above query like this

 index=Index1 **User=$UserId$ 'team id'=$TeamId"** | table User "team id", surName | join User [search index=Index2 | table User, correlationId, operation] | table User, "temp id", surName, correlationId, operation 

These just sample query. If you could describe your exact requirement and provide sample data/output you need, these can be made more specific.

Update

Assuming that token for OfficerName is $OfficerName$ and token for Team is $TeamId$, then try these

Total count of Create operation

(index=Index1 name="$OfficeName$" "team id"="$TeamId$") OR (index=Index2 userId="$OfficeName$" operation="Create")
| eval User=coalesce(name,userId) | sort 0 User | eventstats first("team id") as TeamId first(surName) as surName by User | where isnotnull(TeamId) | stats count(operation) as CountCreateOperation

Total count of correlation id by team

(index=Index1 name="$OfficeName$" "team id"="$TeamId$") OR (index=Index2 userId="$OfficeName$" operation="Create")
| eval User=coalesce(name,userId) | sort 0 User | eventstats first("team id") as TeamId first(surName) as surName by User | where isnotnull(TeamId) | stats count(correlationId) as Count by TeamId

Update 2

Try this (token names were not matching in your sample)

<form>
  <label>Sample Form</label>
  <fieldset submitButton="true">
  <input type="text" token="OfficerName">    
    <default/>
  </input>
  <input type="dropdown" token="TeamId">     
    <prefix>TeamId="</prefix>
    <suffix>"</suffix>
    <populatingSearch fieldForValue="team id" fieldForLabel="team id">
      <![CDATA[search condition]>
    </populatingSearch>
    <default/>
  <choice value="*">All</choice>
</input>

  </fieldset>
  <row>    
      <table>
         <title>Total count of Create operation</title>
        <searchString>(index=Index1 name="$OfficerName$" "team id"="$TeamId$") OR (index=Index2 userId="$OfficerName$" operation="Create")
| eval User=coalesce(name,userId) | sort 0 User | eventstats first("team id") as TeamId first(surName) as surName by User | where isnotnull(TeamId) | stats count(operation) as CountCreateOperation</searchString>
        <earliestTime>-60m@m</earliestTime>
        <latestTime>now</latestTime>
      </table>   
  </row>
   <row>    
        <table>
           <title>Total count of correlation id by team </title>
          <searchString>(index=Index1 name="$OfficerName$" "team id"="$TeamId$") OR (index=Index2 userId="$OfficerName$" operation="Create")
| eval User=coalesce(name,userId) | sort 0 User | eventstats first("team id") as TeamId first(surName) as surName by User | where isnotnull(TeamId) | stats count(correlationId) as Count by TeamId</searchString>
          <earliestTime>-60m@m</earliestTime>
          <latestTime>now</latestTime>
        </table>   
  </row>
</form>

View solution in original post

kb_vells
Path Finder

Hi Somesoni2
I tried your approach and left you my comment. it didn't work

0 Karma

gkanapathy
Splunk Employee
Splunk Employee

I would not use a join in this case, and in fact I'm not sure that index1 should really be an index. It seems to me that it would be better as a lookup, and then you could just do:

index=index2 | lookup lookup_of_index1 name AS userID OUTPUT team_id

and work with that. To create the lookup, you could just run a job against index1

index=index1 | rename 'team id' AS team_id | fields name, team_id, surname | outputlookup lookup_of_index_1

(or there are other ways to create a lookup as a CSV file.)

0 Karma

kb_vells
Path Finder

Hi Gkanapathy
both indexes are actually created from several logs. I have already created four dashboard using these indexes but the new requirement is to combine these indexes to get some result out of it

0 Karma

somesoni2
Revered Legend

Try the updated answer.

0 Karma

somesoni2
Revered Legend

You can create field aliases for the common valued field (name and userId) as User and then you can join based on the field alias. (More here)

Ques 1 : Once the field alias is configured, you can use data from both the indexes like below

index=Index1 | table User "team id", surName | join User [search index=Index2 | table User, correlationId, operation] | table User, "temp id", surName, correlationId, operation

Ques 2: You can apply your teamId and UserId filters on the above query like this

 index=Index1 **User=$UserId$ 'team id'=$TeamId"** | table User "team id", surName | join User [search index=Index2 | table User, correlationId, operation] | table User, "temp id", surName, correlationId, operation 

These just sample query. If you could describe your exact requirement and provide sample data/output you need, these can be made more specific.

Update

Assuming that token for OfficerName is $OfficerName$ and token for Team is $TeamId$, then try these

Total count of Create operation

(index=Index1 name="$OfficeName$" "team id"="$TeamId$") OR (index=Index2 userId="$OfficeName$" operation="Create")
| eval User=coalesce(name,userId) | sort 0 User | eventstats first("team id") as TeamId first(surName) as surName by User | where isnotnull(TeamId) | stats count(operation) as CountCreateOperation

Total count of correlation id by team

(index=Index1 name="$OfficeName$" "team id"="$TeamId$") OR (index=Index2 userId="$OfficeName$" operation="Create")
| eval User=coalesce(name,userId) | sort 0 User | eventstats first("team id") as TeamId first(surName) as surName by User | where isnotnull(TeamId) | stats count(correlationId) as Count by TeamId

Update 2

Try this (token names were not matching in your sample)

<form>
  <label>Sample Form</label>
  <fieldset submitButton="true">
  <input type="text" token="OfficerName">    
    <default/>
  </input>
  <input type="dropdown" token="TeamId">     
    <prefix>TeamId="</prefix>
    <suffix>"</suffix>
    <populatingSearch fieldForValue="team id" fieldForLabel="team id">
      <![CDATA[search condition]>
    </populatingSearch>
    <default/>
  <choice value="*">All</choice>
</input>

  </fieldset>
  <row>    
      <table>
         <title>Total count of Create operation</title>
        <searchString>(index=Index1 name="$OfficerName$" "team id"="$TeamId$") OR (index=Index2 userId="$OfficerName$" operation="Create")
| eval User=coalesce(name,userId) | sort 0 User | eventstats first("team id") as TeamId first(surName) as surName by User | where isnotnull(TeamId) | stats count(operation) as CountCreateOperation</searchString>
        <earliestTime>-60m@m</earliestTime>
        <latestTime>now</latestTime>
      </table>   
  </row>
   <row>    
        <table>
           <title>Total count of correlation id by team </title>
          <searchString>(index=Index1 name="$OfficerName$" "team id"="$TeamId$") OR (index=Index2 userId="$OfficerName$" operation="Create")
| eval User=coalesce(name,userId) | sort 0 User | eventstats first("team id") as TeamId first(surName) as surName by User | where isnotnull(TeamId) | stats count(correlationId) as Count by TeamId</searchString>
          <earliestTime>-60m@m</earliestTime>
          <latestTime>now</latestTime>
        </table>   
  </row>
</form>

kb_vells
Path Finder

Hi Somesoni2
You are absolutely right about the form fields tokens.
Now it is working and details are below

I am getting the result now. I tried different approach

index=index2 "operation=Create" | fields operation userId | rename userId as officer| join officer [search index=index1 $officer$ $teamId$ | fields "team id" name | rename name as officer ] |stats count(operation) as CreateCount

the prefix for $officer$ is name
the prefix for $teamId$ is "team id"

At runtime, the value is replaced as name="John" and "team id"="Team 1"
Thanks a ton for your prompt response

somesoni2
Revered Legend

Try the update 2, I can see the token names were not same in the query and prefix property was not required for textbox.

0 Karma

kb_vells
Path Finder

Hi Somesoni2, Thanks for your help,both queries failing after including name="$OfficeName$" "team id"="$TeamId$" or userId="$OfficeName$"
I removed the few details due to comments max length restriction

<input type="text" token="officer">
  <prefix>officer="</prefix>
  <default/>
</input>
<input type="dropdown" token="TeamId">     
  <prefix>TeamId="</prefix>
  <suffix>"</suffix>
  <populatingSearch fieldForValue="team id" fieldForLabel="team id">
    <![CDATA[search condition]>
  </populatingSearch>
  <default/>
  <choice value="*">All</choice>
0 Karma

Ayn
Legend

Easiest is to just create a search that will match the events from both indexes.

(index=index1 name=John) OR (index=index2 userId=John)

You don't explain how the search in Question 1 or 2 is supposed to work, so it's hard to give specific advice. What input will you give? What output do you want?

0 Karma
Get Updates on the Splunk Community!

Infographic provides the TL;DR for the 2024 Splunk Career Impact Report

We’ve been buzzing with excitement about the recent validation of Splunk Education! The 2024 Splunk Career ...

Enterprise Security Content Update (ESCU) | New Releases

In December, the Splunk Threat Research Team had 1 release of new security content via the Enterprise Security ...

Why am I not seeing the finding in Splunk Enterprise Security Analyst Queue?

(This is the first of a series of 2 blogs). Splunk Enterprise Security is a fantastic tool that offers robust ...