Splunk Search

How to use a second index as a lookup for a value in the first index?

manhalmoussa
Explorer

Hello my fellow Splunkers,
I am trying to use a second index as a lookup for a field in the first index.

index=products contains the product serial numbers in the field serialNumbers1
index=inventory contains the fields serialNumbersAll and productsNames
serialNumbers1 is a subset of serialNumbersAll

I need to table serialNumbers1 and the equivalent productsNames.
Example:

(index=products OR index=inventory)
| table serialNumbers1 serialNumbersAll productsNames


we get

serialNumbers1   serialNumbersAll   productsNames
111
222
333
444
                 111                apple
                 222                orange
                 333                banana
                 444                kiwi
                 555
                 666
                 777
                 888

the desired output is

serialNumbers1   serialNumbersAll   productsNames
111                                 apple
222                                 orange
333                                 banana
444                                 kiwi
                 111                apple
                 222                orange
                 333                banana
                 444                kiwi
                 555                lemon
                 666                vege
                 777                potatoes
                 888                sweet potatoes


Notes: I have a huge data set (more than 200K events), so using eventstats is not an option as it hits the limit, and increasing the limit is not an option.
Using a lookup table is not an option for me either.
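For reference, the kind of eventstats-based search that hits this limit is roughly the following (a sketch only, not the exact search; the temporary field common is just an illustrative name):

(index=products OR index=inventory)
| eval common=coalesce(serialNumbers1, serialNumbersAll) ``` shared key: serialNumbers1 from products, serialNumbersAll from inventory ```
| eventstats values(productsNames) as productsNames by common
| table serialNumbers1 serialNumbersAll productsNames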

1 Solution

bowesmana
SplunkTrust

This example will copy the productsNames field to the events from the first index:

| makeresults 
| eval _raw="serialNumbers1,serialNumbersAll,productsNames
111,,
222,,
333,,
444,,
,111,apple
,222,orange
,333,banana
,444,kiwi
,555,lemon
,666,vege
,777,potatoes
,888,sweet potato"
| multikv forceheader=1
| table serialNumbers1 serialNumbersAll productsNames
| eval n=if(isnull(serialNumbersAll), 1, 0)
| eval serialNumbers1=coalesce(serialNumbers1, serialNumbersAll)
| eval serialNumbersAll=coalesce(serialNumbers1, serialNumbersAll)
| stats values(productsNames) as productsNames values(n) as n by serialNumbers1 serialNumbersAll
| mvexpand n
| eval serialNumbers1=if(n=1, serialNumbers1, null())
| eval serialNumbersAll=if(n=0, serialNumbersAll, null())
| sort serialNumbers1
| fields - n

I am not sure where the names for 555 and higher come from, as they are not in your first example output, so I assume you just left them out.
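Applied to your real indexes rather than the makeresults sample, the same logic would be something along these lines (a sketch, assuming the field names are exactly as in your question):

(index=products OR index=inventory)
| table serialNumbers1 serialNumbersAll productsNames
| eval n=if(isnull(serialNumbersAll), 1, 0) ``` n=1 marks events from index=products, which have no serialNumbersAll ```
| eval serialNumbers1=coalesce(serialNumbers1, serialNumbersAll)
| eval serialNumbersAll=coalesce(serialNumbers1, serialNumbersAll)
| stats values(productsNames) as productsNames values(n) as n by serialNumbers1 serialNumbersAll
| mvexpand n
| eval serialNumbers1=if(n=1, serialNumbers1, null())
| eval serialNumbersAll=if(n=0, serialNumbersAll, null())
| sort serialNumbers1
| fields - n

(See the sort 0 note further down the thread if you expect more than 10,000 results.)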

You say you are reaching a limit with eventstats... 

There is no real limit with eventstats, but it can be a slow command to run.

You could also use this instead: remove all lines following the 'table' command above and use the following.

| eval common=coalesce(serialNumbers1,serialNumbersAll)
| sort common serialNumbersAll
| filldown productsNames
| sort serialNumbers1
| fields - common
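Put together against the real indexes rather than the sample data, that alternative would be roughly (again assuming the field names from the question):

(index=products OR index=inventory)
| table serialNumbers1 serialNumbersAll productsNames
| eval common=coalesce(serialNumbers1,serialNumbersAll)
| sort common serialNumbersAll ``` group matching serials so filldown can copy productsNames across ```
| filldown productsNames
| sort serialNumbers1
| fields - common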

 

manhalmoussa
Explorer

Thank you bowesmana, your solution is great.
If someone is using this solution, just be mindful that the sort command will limit your table to 10,000 results.


bowesmana
SplunkTrust

@manhalmoussa 

sort will limit the results to 10,000 by default, but use

| sort 0 fieldnames

that will sort the lot
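For example, in the filldown variant above, removing the 10,000-result cap on both sorts looks like this:

| eval common=coalesce(serialNumbers1,serialNumbersAll)
| sort 0 common serialNumbersAll
| filldown productsNames
| sort 0 serialNumbers1
| fields - common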

 
