Splunk Search

How to combine two searches with common value into one table

gballanti
Explorer

I need help regarding a join from events based on different sourcetype (same index) that are related by the same value in different fields.
The logical flow starts from a bar char that group/count similar fields. With drill down I pass the 'description' by a token to the search that has to combine the search into a table.
Below a simple example:

sourcetype_A
s1_field1 = Purchase OK
s1_field2 = 9
s1_field3  = tax value
s1_field4 = Completed

sourcetype_B
s2_field1 = 9
s2_field2 = Rome
s2_field3 = Fontana di Trevi

I need to obtain a table with fields "s1_field2, s1_field1,s2_field2,s2_field3"

I'm working around this kind of search without obtaining my outcome.

sourcetype="Sourcetype_A" s1_field4="Completed" | search s1_field1=$from_token$
| join s1_field2
    [ search sourcetype="Sourcetype_B" 
    | rename s2_field1 as s1_field2 ] 
| table s1_field2, s1_field1,s2_field2,s2_field3

Thank in advance for your help.

0 Karma
1 Solution

dmarling
Builder

You'll see this answer a lot on here. You should avoid joins except for some very specific use cases. Joins are inefficient and can cause truncation of your data results, since it has a default limitation of only running for 30 seconds and then auto finalizing. I suggest doing this instead:

(sourcetype="Sourcetype_A" s1_field4="Completed") OR sourcetype="Sourcetype_B" 
| eval joiner=if(sourcetype="Sourcetype_A", s1_field2, s2_field1)
| stats values(*) as * by joiner
| where s1_field2=s2_field1

Season the above query to taste by only putting the fields you want in the third line. What this is doing is pulling in both data sets and joining them together with a stats command. This is won't run into those timing limitations that a join would and be more performant.

I edited the answer so it only returns results where those two fields match.

If this comment/answer was helpful, please up vote it. Thank you.

View solution in original post

gballanti
Explorer

thanks to everyone , the solution proposed works as expected.

0 Karma

to4kawa
SplunkTrust
SplunkTrust
(sourcetype="snow:task_sla" dv_has_breached=false dv_stage=Completed dv_sla="FW-Guasto non bloccante fuori sede no A-SLA21") OR sourcetype="snow:incident" 
| eval joiner=if(sourcetype="snow:task_sla", dv_task, dv_number) 
| stats values(dv_task) as dv_task values(dv_number) as dv_number values(dv_sla) as dv_sla by joiner
| where dv_task==dv_number

Ask, and it shall be given you.

0 Karma

gballanti
Explorer

probably my previous comment is gone ... I wanted to thank you for the answers

0 Karma

gballanti
Explorer

just an update ...
the following query

(sourcetype="snow:task_sla" dv_has_breached=false dv_stage=Completed dv_sla="FW-Guasto non bloccante fuori sede no A-SLA21") OR sourcetype="snow:incident" | eval joiner=if(sourcetype="snow:task_sla", dv_task, dv_number) | stats values(dv_task) as dv_task values(dv_number) as dv_number values(dv_sla) as dv_sla by joiner

return an unexpected behavior

alt text

I have to get only the highlighted rows, where dv_task and dv_number matches.

0 Karma

dmarling
Builder

Yep, this is an easy fix. Modify your query by adding a filter at the end to only include when they match:

(sourcetype="snow:task_sla" dv_has_breached=false dv_stage=Completed dv_sla="FW-Guasto non bloccante fuori sede no A-SLA21") OR sourcetype="snow:incident" 
| eval joiner=if(sourcetype="snow:task_sla", dv_task, dv_number) 
| stats values(dv_task) as dv_task values(dv_number) as dv_number values(dv_sla) as dv_sla by joiner
| where dv_task=dv_number
If this comment/answer was helpful, please up vote it. Thank you.
0 Karma

gballanti
Explorer

Thanks to both of you for the prompt reply.

I used the Dustin's suggestion because is more clear for my little knowledge, later will try the "selfjoin" also.
To filter the columns I used table command as last row

(sourcetype="Sourcetype_A" s1_field4="Completed") OR sourcetype="Sourcetype_B"
| eval joiner=if(sourcetype="Sourcetype_A", s1_field2, s2_field1)
| stats values(*) as * by joiner
| table s1_field2, s1_field1,s2_field2,s2_field3

is it a good way to shows the fields ?

0 Karma

dmarling
Builder

You'll see this answer a lot on here. You should avoid joins except for some very specific use cases. Joins are inefficient and can cause truncation of your data results, since it has a default limitation of only running for 30 seconds and then auto finalizing. I suggest doing this instead:

(sourcetype="Sourcetype_A" s1_field4="Completed") OR sourcetype="Sourcetype_B" 
| eval joiner=if(sourcetype="Sourcetype_A", s1_field2, s2_field1)
| stats values(*) as * by joiner
| where s1_field2=s2_field1

Season the above query to taste by only putting the fields you want in the third line. What this is doing is pulling in both data sets and joining them together with a stats command. This is won't run into those timing limitations that a join would and be more performant.

I edited the answer so it only returns results where those two fields match.

If this comment/answer was helpful, please up vote it. Thank you.

View solution in original post

Anji02
Observer

Hi,
I tried in same way, but not working

index=my_index type=outer OR type=inner
|rename msg.* AS *
| eval joiner=if(type=outer, _id, source_id)
| stats values(*) as * by joiner
| where ticket_num=TicketNum

|table event_id, serial_id

Can you please help me here @dmarling 

0 Karma

dmarling
Builder

The joiner eval doesn't have double quotes around "outer" so it's trying to say when the value in field type equals the value in field outer, then use the value in field _id.  What you want to do is wrap outer with double quotes so the eval command knows you are looking for the field type to have a value of "outer."

index=my_index (type=outer OR type=inner)
| rename msg.* AS *
| eval joiner=if(type="outer", _id, source_id)
| stats values(*) as * by joiner
| where ticket_num=TicketNum
If this comment/answer was helpful, please up vote it. Thank you.
0 Karma

woodcock
Esteemed Legend

You need selfjoin, like this:

|makeresults
| eval raw="sourcetype=sourcetype_A,s1_field1=PurchaseOK,s1_field2=9,s1_field3=taxvalue,s1_field4=Completed sourcetype=sourcetype_B,s2_field1=9,s2_field2=Rome,s2_field3=FontanadiTrevi"
| makemv raw
| mvexpand raw
| rename raw AS _raw
| kv

| rename COMMENT AS "Everything above generates sample event data; everything below is your solution"

| eval joiner=coalesce(s1_field1, s2_field1)
| selfjoin joiner
0 Karma
Register for .conf21 Now! Go Vegas or Go Virtual!

How will you .conf21? You decide! Go in-person in Las Vegas, 10/18-10/21, or go online with .conf21 Virtual, 10/19-10/20.