indexA
field1 | field2 | field3 |
A | 1 | 1 |
A | 1 | 2 |
A | 1 | 3 |
A | 2 | 5 |
B | 1 | 4 |
B | 2 | 3 |
B | 3 | 2 |
C | 1 | 6 |
C | 2 | 7 |
indexB
field4 | field5 | field6 |
A | 1 | 3 |
B | 2 | 4 |
C | 1 | 5 |
C | 1 | 6 |
I want to join these 2 indexes by 2 fields (field1=field4 AND field2=field5)
Result :
field1 | field2 | field3 | field6 |
A | 1 | 1 | 3 |
A | 1 | 2 | |
A | 1 | 3 | |
A | 2 | 5 | |
B | 1 | 4 | |
B | 2 | 3 | 4 |
B | 3 | 2 | |
C | 1 | 6 | 5 |
6 | |||
C | 2 | 7 |
You can do this using stats - example with your data
| makeresults
| eval _raw="field1 field2 field3
A 1 1
A 1 2
A 1 3
A 2 5
B 1 4
B 2 3
B 3 2
C 1 6
C 2 7"
| multikv forceheader=1
| table field1 field2 field3
| eval index="A"
| append [
| makeresults
| eval _raw="field4 field5 field6
A 1 3
B 2 4
C 1 5
C 1 6"
| multikv forceheader=1
| table field4 field5 field6
| eval index="B"
]
| eval f1=coalesce(field1, field4)
| eval f2=coalesce(field2, field5)
| stats values(field3) as f3 values(field6) as f6 by f1 f2
It's best to avoid 'join' construct in Splunk - so by creating the common fields f1 and f2 which represent the fields you want to join by, you can then do stats.
Your search would do a search like this (using 'if' rather than coalesce)
search index_a_dataset OR index_b_dataset
| eval f1=if(index="A", field1, field4)
| eval f2=if(index="B", field2, field5)
| stats values(field3) as f3 values(field6) as f6 by f1 f2
You can do this using stats - example with your data
| makeresults
| eval _raw="field1 field2 field3
A 1 1
A 1 2
A 1 3
A 2 5
B 1 4
B 2 3
B 3 2
C 1 6
C 2 7"
| multikv forceheader=1
| table field1 field2 field3
| eval index="A"
| append [
| makeresults
| eval _raw="field4 field5 field6
A 1 3
B 2 4
C 1 5
C 1 6"
| multikv forceheader=1
| table field4 field5 field6
| eval index="B"
]
| eval f1=coalesce(field1, field4)
| eval f2=coalesce(field2, field5)
| stats values(field3) as f3 values(field6) as f6 by f1 f2
It's best to avoid 'join' construct in Splunk - so by creating the common fields f1 and f2 which represent the fields you want to join by, you can then do stats.
Your search would do a search like this (using 'if' rather than coalesce)
search index_a_dataset OR index_b_dataset
| eval f1=if(index="A", field1, field4)
| eval f2=if(index="B", field2, field5)
| stats values(field3) as f3 values(field6) as f6 by f1 f2
>Your search would do a search like this (using 'if' rather than coalesce)
search index_a_dataset OR index_b_dataset | eval f1=if(index="A", field1, field4) | eval f2=if(index="B", field2, field5) | stats values(field3) as f3 values(field6) as f6 by f1 f2
can I "join" to use this search sentence without "join" command?
| eval f1=if(index="A", field1, field4) | eval f2=if(index="B", field2, field5)
Can I rewrite
| eval f2=if(index="B", field2, field5)
to
| eval f2=if(index="A", field2,field5)
?
I tried the way you taught me.
index="indexA"
| table field1, field2, field3
| append [
| search index="indexB"
| table field4, field5, field6
]
| eval f1=coalesce(field1,field4)
| eval f2=coalesce(field2,field5)
| stats values(field3) as f3 values(field6) as f6 by f1 f2
But if f6 is not NULL, f3 is NULL and id f3 is not NULL, f6 is NULL 😞
( can only take one field's figure)
Without seeing f1 and f2 in this context it's not easy to see what the problem is, however, it would seem like that f1 and f2 fields are not common between the two data sets.
Note that using
search A
append search B
is bot as good as doing search A OR search B - given this simple example.
Can you provide a data sample to make it easier to provide the solution.
This is a sample.
(Both of indexes have other fields. I took out only 3 fields what I needed.)
indexA.PROCESS_NO = indexB.KOUTEI_NO
indexA.TICKET_ID = indexB.WORK_NO
indexA.csv
KOUTEI_NO | WORK_NO | INTERVAL_DIFF |
1 | F00380006 | 24 |
5 | F00280002 | 21 |
2 | F00380005 | 37 |
5 | F00390001 | 92 |
6 | F00430009 | 23 |
1 | F00380006 | 33 |
3 | F00120006 | 32 |
5 | F00280002 | 47 |
3 | F00120005 | 32 |
2 | F00380005 | 38 |
6 | F00430009 | 35 |
3 | F00120006 | 25 |
1 | F00380006 | 16 |
4 | F00380002 | 14 |
1 | F00380006 | 3 |
2 | F00380005 | 35 |
4 | F00380002 | 6 |
4 | F00380002 | 5 |
1 | F00120010 | 23 |
4 | F00380002 | 21 |
3 | F00380003 | 3 |
2 | F00380005 | 31 |
1 | F00120010 | 9 |
3 | F00380003 | 21 |
4 | F00380002 | 14 |
5 | F00280001 | 64 |
1 | F00120010 | 12 |
4 | F00380002 | 7 |
6 | F00430009 | 20 |
1 | F00120009 | 34 |
3 | F00120003 | 5 |
2 | F00380004 | 2 |
3 | F00120012 | 12 |
1 | F00380009 | 18 |
2 | F00380008 | 29 |
3 | F00120012 | 5 |
4 | F00120008 | 15 |
3 | F00120012 | 136 |
5 | F00280006 | 34 |
1 | F00120016 | 7 |
6 | F00280003 | 17 |
1 | F00120016 | 57 |
indexB.csv
PROCESS_NO | TICKET_ID | REASON_NO |
4 | F00120008 | 63 |
1 | F00120010 | 14 |
Here is the search based on that data
| makeresults
| eval _raw="KOUTEI_NO WORK_NO INTERVAL_DIFF
1 F00380006 24
5 F00280002 21
2 F00380005 37
5 F00390001 92
6 F00430009 23
1 F00380006 33
3 F00120006 32
5 F00280002 47
3 F00120005 32
2 F00380005 38
6 F00430009 35
3 F00120006 25
1 F00380006 16
4 F00380002 14
1 F00380006 3
2 F00380005 35
4 F00380002 6
4 F00380002 5
1 F00120010 23
4 F00380002 21
3 F00380003 3
2 F00380005 31
1 F00120010 9
3 F00380003 21
4 F00380002 14
5 F00280001 64
1 F00120010 12
4 F00380002 7
6 F00430009 20
1 F00120009 34
3 F00120003 5
2 F00380004 2
3 F00120012 12
1 F00380009 18
2 F00380008 29
3 F00120012 5
4 F00120008 15
3 F00120012 136
5 F00280006 34
1 F00120016 7
6 F00280003 17
1 F00120016 57"
| multikv forceheader=1
| table KOUTEI_NO WORK_NO INTERVAL_DIFF
| append [
| makeresults
| eval _raw="PROCESS_NO TICKET_ID REASON_NO
4 F00120008 63
1 F00120010 14"
| multikv forceheader=1
| table PROCESS_NO TICKET_ID REASON_NO
]
| eval f1=coalesce(KOUTEI_NO,PROCESS_NO)
| eval f2=coalesce(WORK_NO,TICKET_ID)
| stats values(INTERVAL_DIFF) as INTERVAL_DIFF values(REASON_NO) as REASON_NO by f1 f2
I have first set up the data with the 2 makeresults commands and then used the two coalesce statements to create f1 and f2 and then the stats command does the final aggregation
I could do what I wanted to do!
I owe it to you. Thank you very much.