Splunk Search

Subsearch event correlation trough multiple index's and sourcetypes and keep results.

Meloknight
New Member

Hi guys,

I'm trying to create a query for a phishing mail tracking dashboard. The problem that i'm facing is, that a correlation has to be done trough the different index's and sources to get all the relevant information for one event.

One step is to search for a specific sender in the "filterlog" sourcetype, take the corresponding "message_session_id" from the results and run a new search on the same index and sourcetype to get all relevant data for a specific email. Unfortunately the infos like attachment file name, file type and action are distributed trough different events, which are identifiable trough the unique "message_session_id".

Therefore, this query gives me the results that i need:

 index=A sourcetype=filterlog
    [search index=A sourcetype=filterlog src_user="A@gmail.com" | return $message_session_id]

The next step would be, to search on a second index B in the sourcetype "sentrion" for additional data that is not available on index A. The common filed on both indexes would be the "message_id" field. At the end, i want to get one event out off the data of the different indexes and data sources for one email. But i did not figured it out, how to keep the results from the subsearches. I tried join, union and append without success. Every time when i use join or append, the query gives me back way more results that expected.

I tried something like this, but without success, because i'm losing the results from the subsearches and i didn't figured it out, how to integrate the return function properly, to search for all the related infos for one specific email in the "filterlog" sourcetype based on the "message_session_id":

     index=A sourcetype=filterlog
        [ search index=msgbackbone sourcetype=filterlog:pps:extmail 
        [ search index=B sourcetype=sentrion sender="A@gmail.com" | fields + message_id ] | fields + message_session_id ]
| stats values(_time) as _time, values(recipient) as recipient, values(subject) as subject, values(qid) as qid, values(action) as action, values(file_name) as file_name, values(file_type) as file_type by message_id sender

What would be the right way to address this? Should i use join or append to keep the results? Or should i use the transaction function?

Please excuse, my knowledge is quite limited in Splunk, some hints would be very appreciated.

Thanks in advance and Cheers,
Joel

0 Karma

gcusello
SplunkTrust
SplunkTrust

Hi Meloknight,
I cannot test the solution because I haven't you data, but i think that you should try something like this:

(index=A sourcetype=filterlog) OR (index=B sourcetype=sentrion) [search index=A sourcetype=filterlog src_user="A@gmail.com" | fields message_session_id]
| stats earliest(_time) AS _time, values(recipient) as recipient, values(subject) as subject, values(qid) as qid, values(action) as action, values(file_name) as file_name, values(file_type) as file_type by message_id sender

I suggest to use earliest (or latest) for _time to have only one value for timestamp.

Bye.
Giuseppe

0 Karma

Meloknight
New Member

Hi Giuseppe,

Thank you very much for your fast reply! Looks good so far, what is missing now, are the results from the sourcetype "sentrion" in Index B. In the end results that i get, i can only see events from the sourcetype "filterlog" in Index A, but not the results from Index B. Is there a way to make that possible? To keep the results from Index B?

Regards,
Joel

0 Karma

gcusello
SplunkTrust
SplunkTrust

Hi Meloknight,
at first, check if the fieldnames are correct and message_id has the exact fieldname in both the indexes (fieldnames are case sensitive!): I see sometimes message_session_id and sometimes message_id !
in this case, you have to rename the wrong one, having the same fieldname (probably this is the problem!).

Then verify if someone of the message_ids of indexA are in indexB to be sure to have some result.
You can check this using something like this:

(index=B sourcetype=sentrion) [search index=A sourcetype=filterlog src_user="A@gmail.com" | fields message_session_id]

In addition check the number of results of the subsearch because there's a limit of 50,000 values in subsearch results:

 index=A sourcetype=filterlog src_user="A@gmail.com"
| stata count BY message_id

Bye.
Giuseppe

0 Karma
Get Updates on the Splunk Community!

Developer Spotlight with Paul Stout

Welcome to our very first developer spotlight release series where we'll feature some awesome Splunk ...

State of Splunk Careers 2024: Maximizing Career Outcomes and the Continued Value of ...

For the past four years, Splunk has partnered with Enterprise Strategy Group to conduct a survey that gauges ...

Data-Driven Success: Splunk & Financial Services

Splunk streamlines the process of extracting insights from large volumes of data. In this fast-paced world, ...