I'm trying to create a query for a phishing mail tracking dashboard. The problem that i'm facing is, that a correlation has to be done trough the different index's and sources to get all the relevant information for one event.
One step is to search for a specific sender in the "filterlog" sourcetype, take the corresponding "message_session_id" from the results and run a new search on the same index and sourcetype to get all relevant data for a specific email. Unfortunately the infos like attachment file name, file type and action are distributed trough different events, which are identifiable trough the unique "message_session_id".
Therefore, this query gives me the results that i need:
The next step would be, to search on a second index B in the sourcetype "sentrion" for additional data that is not available on index A. The common filed on both indexes would be the "message_id" field. At the end, i want to get one event out off the data of the different indexes and data sources for one email. But i did not figured it out, how to keep the results from the subsearches. I tried join, union and append without success. Every time when i use join or append, the query gives me back way more results that expected.
I tried something like this, but without success, because i'm losing the results from the subsearches and i didn't figured it out, how to integrate the return function properly, to search for all the related infos for one specific email in the "filterlog" sourcetype based on the "message_session_id":
[ search index=msgbackbone sourcetype=filterlog:pps:extmail
[ search index=B sourcetype=sentrion sender="A@gmail.com" | fields + message_id ] | fields + message_session_id ]
| stats values(_time) as _time, values(recipient) as recipient, values(subject) as subject, values(qid) as qid, values(action) as action, values(file_name) as file_name, values(file_type) as file_type by message_id sender
What would be the right way to address this? Should i use join or append to keep the results? Or should i use the transaction function?
Please excuse, my knowledge is quite limited in Splunk, some hints would be very appreciated.
I cannot test the solution because I haven't you data, but i think that you should try something like this:
(index=A sourcetype=filterlog) OR (index=B sourcetype=sentrion) [search index=A sourcetype=filterlog src_user="A@gmail.com" | fields message_session_id]
| stats earliest(_time) AS _time, values(recipient) as recipient, values(subject) as subject, values(qid) as qid, values(action) as action, values(file_name) as file_name, values(file_type) as file_type by message_id sender
I suggest to use earliest (or latest) for _time to have only one value for timestamp.
Thank you very much for your fast reply! Looks good so far, what is missing now, are the results from the sourcetype "sentrion" in Index B. In the end results that i get, i can only see events from the sourcetype "filterlog" in Index A, but not the results from Index B. Is there a way to make that possible? To keep the results from Index B?
at first, check if the fieldnames are correct and message_id has the exact fieldname in both the indexes (fieldnames are case sensitive!): I see sometimes message_session_id and sometimes message_id !
in this case, you have to rename the wrong one, having the same fieldname (probably this is the problem!).
Then verify if someone of the message_ids of indexA are in indexB to be sure to have some result.
You can check this using something like this: