I have two tables.
I need to compare the values of two columns in each table.
As a result, I want to receive only those rows from the first table whose field values also appear in the second table.
In this example, I need to get only rows №2 and №3 from the first table.
I can't use a subsearch, because the second table contains more than 2 million rows... And I need to run this search every minute.
Do you have any ideas?
Thanks in advance.
Assuming there are no duplicates in Table 1, you could do something like this run-anywhere search:
| makeresults | eval col1="v11", col2="v12", col3="v13", col4="v14"
| append [ | makeresults | eval col1="v21", col2="v22", col3="v23", col4="v24" ]
| append [ | makeresults | eval col1="v31", col2="v32", col3="v33", col4="v34" ]
| append [ | makeresults | eval col1="v41", col2="v42", col3="v43", col4="v44" ]
| append [ | makeresults | eval col1="v21", col2="v22" ]
| append [ | makeresults | eval col1="v31", col2="v32" ]
| stats values(col3) as col3 values(col4) as col4 count by col1 col2
| where count > 1
| mvexpand col3
| mvexpand col4
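For reference, this works because the two appended key-only rows double the event count for the (col1, col2) pairs that exist in both tables, so `where count > 1` keeps only the intersection. Running the search above should return exactly the two overlapping rows (№2 and №3 from the question):

```
col1  col2  col3  col4  count
v21   v22   v23   v24   2
v31   v32   v33   v34   2
```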
Rockin' cool, @jacobevans! Thanks for the answer.
But when we adapted it to our data, we ran into a new issue: the search takes a long time, and appending the >2M events from Table 2 accounts for most of it.
Is there any solution (maybe an architectural one) for this?
What does your base search and append look like? Ideally you would not be appending anything. It should look something like this:
(index=Table1 OR index=Table2)
It sounds like you may come from a SQL background like myself. In Splunk, you don't want to join/intersect/append/subsearch. It is always preferable to make the base search as broad (yet still as small) as possible and then only further refine the results from there. Don't think of two separate queries as different tables; think of them as different subsets of a single parent search.
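A minimal sketch of that single-parent-search pattern, assuming Table1 and Table2 are index names and col1/col2 are the shared join keys (these names are placeholders for your real sources, not something from your environment):

```
(index=Table1 OR index=Table2)
| stats values(col3) as col3 values(col4) as col4 dc(index) as sources by col1 col2
| where sources = 2
```

Using `dc(index)` (distinct count of the source index) rather than a raw event count is slightly more robust than `count > 1` if either table can contain duplicate rows for the same col1/col2 pair.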