Hello my fellow Splunkers,
i am trying to use a second index as a lookup for a field in the first index
index=products contains the products serialNumbers1
index=inventory contains the products serialNumbersAll and productsNames
serialNumbers1 is a subset of serialNumbersAll
i need to table serialNumbers1 and the equvelant productsNames
example:
(index=products OR index=inventory)
|table serialNumbers1 serialNumbersAll productsNames
we get
serialNumbers1 serialNumbersAll productsNames
111
222
333
444
111 apple
222 orange
333 banana
444 kiwi
555
666
777
888
the desired output is
serialNumbers1 serialNumbersAll productsNames
111 apple
222 orange
333 banana
444 kiwi
111 apple
222 orange
333 banana
444 kiwi
555 lemon
666 vege
777 potatoes
888 sweet potatoes
notes: i have a huge set of data more than 200K so using eventstats is not an option as it hits the limit, increasing the limit is not an option
also using a lookup table is not an option for me as well
This example will copy the productsNames fields to the events from the first index
| makeresults
| eval _raw="serialNumbers1,serialNumbersAll,productsNames
111,,
222,,
333,,
444,,
,111,apple
,222,orange
,333,banana
,444,kiwi
,555,lemon
,666,vege
,777,potatoes
,888,sweet potato"
| multikv forceheader=1
| table serialNumbers1 serialNumbersAll productsNames
| eval n=if(isnull(serialNumbersAll), 1, 0)
| eval serialNumbers1=coalesce(serialNumbers1, serialNumbersAll)
| eval serialNumbersAll=coalesce(serialNumbers1, serialNumbersAll)
| stats values(productsNames) as productsNames values(n) as n by serialNumbers1 serialNumbersAll
| mvexpand n
| eval serialNumbers1=if(n=1, serialNumbers1, null())
| eval serialNumbersAll=if(n=0, serialNumbersAll, null())
| sort serialNumbers1
| fields - n
I am not sure in your example data where the names come from for 555 and higher as they are not in your first example data, so I assume you just left them out.
You say you are reaching a limit with eventstats...
There is no real limit with eventstats, but it can be a slow command to run.
You could also use this instead - remove all lines following the 'table' command above and use this.
| eval common=coalesce(serialNumbers1,serialNumbersAll)
| sort common serialNumbersAll
| filldown productsNames
| sort serialNumbers1
| fields - common
This example will copy the productsNames fields to the events from the first index
| makeresults
| eval _raw="serialNumbers1,serialNumbersAll,productsNames
111,,
222,,
333,,
444,,
,111,apple
,222,orange
,333,banana
,444,kiwi
,555,lemon
,666,vege
,777,potatoes
,888,sweet potato"
| multikv forceheader=1
| table serialNumbers1 serialNumbersAll productsNames
| eval n=if(isnull(serialNumbersAll), 1, 0)
| eval serialNumbers1=coalesce(serialNumbers1, serialNumbersAll)
| eval serialNumbersAll=coalesce(serialNumbers1, serialNumbersAll)
| stats values(productsNames) as productsNames values(n) as n by serialNumbers1 serialNumbersAll
| mvexpand n
| eval serialNumbers1=if(n=1, serialNumbers1, null())
| eval serialNumbersAll=if(n=0, serialNumbersAll, null())
| sort serialNumbers1
| fields - n
I am not sure in your example data where the names come from for 555 and higher as they are not in your first example data, so I assume you just left them out.
You say you are reaching a limit with eventstats...
There is no real limit with eventstats, but it can be a slow command to run.
You could also use this instead - remove all lines following the 'table' command above and use this.
| eval common=coalesce(serialNumbers1,serialNumbersAll)
| sort common serialNumbersAll
| filldown productsNames
| sort serialNumbers1
| fields - common
Thank you bowesmana. your solution is great.
if someone is using this solution, just be mindful that the sort command would limit your table to 10000
sort by default will limit to 10,000, but use
| sort 0 fieldnames
that will sort the lot