Thanks for the clarification!
join is the most efficient method I know of for combining the two data sets in a "real-time" manner.
If you were working with a fixed time period of data, a lookup table would be more resource-efficient. For example, in a forensic investigation where you have a timeline of processes that ran, with their PID and PPID, you could run one query to build the lookup table and a second query to get your results.
It would look like this:
CREATE LOOKUP TABLE
| makeresults
| eval sourcetype="foo",ComputerName = "homepc", FileName="example.exe",PID="3333",PPID="2222"
| append
[| makeresults
| eval sourcetype="foo",ComputerName = "homepc", FileName="parent.exe",PID="2222",PPID="1111"]
| append [| makeresults
| eval sourcetype="foo",ComputerName = "homepc", FileName="grandparent.exe",PID="1111",PPID="0"]
| stats values(FileName) AS FileName by PID
| outputlookup foo_bar_data.csv
SEARCH FOR DATA
| makeresults
| eval sourcetype="foo",ComputerName = "homepc", FileName="example.exe",PID="3333",PPID="2222"
| append
[| makeresults
| eval sourcetype="foo",ComputerName = "homepc", FileName="parent.exe",PID="2222",PPID="1111"]
| append [| makeresults
| eval sourcetype="foo",ComputerName = "homepc", FileName="grandparent.exe",PID="1111",PPID="0"]
| lookup foo_bar_data.csv PID AS PPID OUTPUTNEW FileName AS Parent_FileName
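If you also needed to walk further up the process tree (e.g. to the grandparent), one option is to keep PPID in the lookup table and chain a second lookup. This is just a sketch of the idea, not something I've tested against your data; the field names Grandparent_PID and Grandparent_FileName are made up for illustration:
CREATE LOOKUP TABLE (keeping PPID)
| stats values(FileName) AS FileName, values(PPID) AS PPID by PID
| outputlookup foo_bar_data.csv
SEARCH FOR DATA
| lookup foo_bar_data.csv PID AS PPID OUTPUTNEW FileName AS Parent_FileName, PPID AS Grandparent_PID
| lookup foo_bar_data.csv PID AS Grandparent_PID OUTPUTNEW FileName AS Grandparent_FileName
Each additional lookup resolves one more level of ancestry, so you only chain as many as your use case needs.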
This might be a better approach depending on your exact use case.