I am trying parse data from three tables. In one table I have MAC_ADDR and HOST_NAME info, the second table has MAC_ADDR IP_ADDR NEIGHBOR_ADDR PORT and the third has IF_MAC DEVICE_NAME.
The field names are as above. I use join for the first two table the following way:
search router_table | join mac_addr [ search dhcp_table ] | table mac_addr host_name neighbor_mac ip_addr port
Now I want to search table 3 having fields IF_MAC and DEVICE_NAME where I want to search (if_mac=neighbor_mac) and append device_name. I tried appendcols but I can't pass the neighbor_mac as an argument to the third subsearch. Can anyone help me figure out a way to add the result of the third search?
if you use join command, every time you run your search, you can go to take a coffe!
Splunk isn't a DB and join command must be used only if there isn't any othe solution because it's very very slow and takes many resources!
This is an usual errorof all the people (as me!) coming from relational DBs experience.
You could use stats BY key to group values.
In your case you have to use this approach two times because you have to do two joins:
index=tableA OR index=tableB | stats values(HOST_NAME) AS HOST_NAME values(IP_ADDR) AS IP_ADDR values(NEIGHBOR_ADDR) AS NEIGHBOR_ADDR values(PORT) AS PORT dc(index) AS dc_index BY MAC_ADDR | where dc_index=2 | append [ search index=indexC | rename IF_MAC AS NEIGHBOR_ADDR | eval dc_index=1 | fields NEIGHBOR_ADDR DEVICE_NAME dc_index ] | stats values(HOST_NAME) AS HOST_NAME values(IP_ADDR) AS IP_ADDR values(MAC_ADDR) AS MAC_ADDR values(PORT) AS PORT sum(dc_index) AS dc_index BY NEIGHBOR_ADDR | where dc_index=3
In this way, with the first stats you have the same result of your first join (with very less time of execution!).
Then you take only the results from both the tables (the first where condition).
Then you add the third table.
Then you make the second join (always using stats).
and use the last where condition to take only the ones present in all tables.
If NEIGHBOR_ADDR from the first stats has more than one value, you have to add
| mvexpand NEIGHBOR_ADDR
after the first stats.
This solution has only one limit: the last search (index=indexC | rename IF_MAC AS NEIGHBOR_ADDR | eval dc_index=1 | fields NEIGHBOR_ADDR DEVICE_NAME dc_index) must have less than 50,000 results, otherwise it doesn't give complete results.
Can you help me understand the use of dc_index? A bit confused on that.
in the first stats "dc_index=2" assures that the key (MAC_ADDR) is present in both the indexes.
In the second assures that it's presnt also in the third index.
In this way, you can have the first two searches in the main search (so you don't have the limitation of 50,000 results for the subsearch and you have more performaces) and check the presence of the key in both of them.
what if I use the a lookup? I thought of creating a dataset and using it. Will that be a faster search instead?
you could add this:
| join type=outer neighbor_mac [ search table3 | eval neighbor_mac=if_mac]