Splunk Search

Query to create a table to include fields from 2 different sourcetypes where values of field1 from sourcetype 1 is equal to field2 from sourcetype 2

dmenon84
Path Finder

I have 2 sourcetype sourcetype="pan:traffic" and sourcetype="pan:threat"

I want to write a splunk query to find events from sourcetype="pan:traffic" where session_end_reason="threat" and get the src_ip from there and then match it with dest_ip in sourcetype="pan:threat"

here are my logs -

sourcetype="pan:threat" - my dest_ip is 1.1.2.2 and my src_ip value is 46.30.46.196 and GrmBL2Lnhwx.dll is the filename of the executable that user requested for download

<13>Apr 12 08:17:02 UDC-5060.gadm : 1,2016/04/12 08:17:01,0008C100946,THREAT,file,1,2016/04/12 08:17:01,46.30.46.196,1.1.2.2,0.0.0.0,0.0.0.0,URL_Global_Unknown_Continue-EXE,,,flash,vsys1,Untrust,Trust,ethernet1/24,ethernet1/23,Panorama-Logging,2016/04/12 08:17:01,34250303,1,80,65077,0,0,0x0,tcp,forward,"GrmBL2Lnhwx.dll",Microsoft PE File(52060),any,low,server-to-client,18489594034,0x0,RU,US,0,,0,,,3,,,,,,,,0

Now our firewall is configured such that when a user is initiating an executable download it will bring up a splash screen asking user if they are sure file is safe and want to continue. If they hit continue then session_end_reason has value of "threat" in sourcetype="pan:traffic" logs.

Log from sourcetype="pan:traffic" and now my src_ip is 1.1.2.2 and dest_ip is 46.30.46.196 so the values got swapped

<14>Apr 12 08:18:29 UDC-5060.gadm : 1,2016/04/12 08:18:29,0008C100946,TRAFFIC,end,1,2016/04/12 08:18:29,1.1.2.2,46.30.46.196,0.0.0.0,0.0.0.0,URL_Global_Unknown_Continue-EXE,,,flash,vsys1,Trust,Untrust,ethernet1/23,ethernet1/24,Panorama-Logging,2016/04/12 08:18:29,34250303,1,65077,80,0,0,0xf,tcp,allow,85026,2496,82530,84,2016/04/12 08:12:11,96,business-and-economy,0,128310938069,0x0,US,RU,0,24,60,threat

i want to find out the IP address of users who were prompted with a splash screen and decided to click continue . Here is the query I have written it is returning results but they are not exactly accurate.

sourcetype="pan:traffic" session_end_reason="threat" | join src_ip [search sourcetype="pan:threat" | rename dest_ip as src_ip ] | table _time,dest_ip,dest_location,src_ip,src_location,filename

Appreciate your help !

0 Karma
1 Solution

martin_mueller
SplunkTrust
SplunkTrust

The search should look something like this:

  (sourcetype="pan:traffic" session_end_reason="threat") OR sourcetype="pan:threat"
| eval match_ip = if(sourcetype=="pan:traffic",src_ip,dest_ip)
| eval match_location = if(sourcetype=="pan:traffic",src_location,dest_location)
| eval other_ip = if(sourcetype=="pan:traffic",dest_ip,src_ip)
| eval other_location = if(sourcetype=="pan:traffic",dest_location,src_location)
| stats min(_time) as _time list(match_location) as match_location list(other_ip) as other_ip list(other_location) as other_location list(filename) list(sourcetype) as sourcetype by match_ip
| search sourcetype="pan:traffic" sourcetype="pan:threat"

That last bit is key, throw out things from only one sourcetype - non-matches.

View solution in original post

martin_mueller
SplunkTrust
SplunkTrust

The search should look something like this:

  (sourcetype="pan:traffic" session_end_reason="threat") OR sourcetype="pan:threat"
| eval match_ip = if(sourcetype=="pan:traffic",src_ip,dest_ip)
| eval match_location = if(sourcetype=="pan:traffic",src_location,dest_location)
| eval other_ip = if(sourcetype=="pan:traffic",dest_ip,src_ip)
| eval other_location = if(sourcetype=="pan:traffic",dest_location,src_location)
| stats min(_time) as _time list(match_location) as match_location list(other_ip) as other_ip list(other_location) as other_location list(filename) list(sourcetype) as sourcetype by match_ip
| search sourcetype="pan:traffic" sourcetype="pan:threat"

That last bit is key, throw out things from only one sourcetype - non-matches.

lguinn2
Legend

This might be better. It will certainly be faster

(sourcetype="pan:traffic" session_end_reason="threat") OR sourcetype="pan:threat"
| eval match_ip = if(sourcetype=="pan:traffic",src_ip,dest_ip)
| stats list(dest_ip) list(dest_location) list(src_ip) list(src_location) list(filename) by match_ip _time sourcetype
0 Karma

martin_mueller
SplunkTrust
SplunkTrust

It feels to me as if you should drop _time sourcetype from the list of group by fields, else no matches are ever going to be grouped together.

0 Karma

lguinn2
Legend

excellent point, because the chances of multiple events occurring at exactly the same is very low

0 Karma

dmenon84
Path Finder

Hi,

This search giving me lots of false positives, for example sourcetype="pan:traffic" session_end_reason="threat" returns only 15 events from say 8:00 AM - 9:00 AM today but the entire search returns 1000+ events including the case where session_end_reason!="threat"

0 Karma
Get Updates on the Splunk Community!

Extending Observability Content to Splunk Cloud

Watch Now!   In this Extending Observability Content to Splunk Cloud Tech Talk, you'll see how to leverage ...

More Control Over Your Monitoring Costs with Archived Metrics GA in US-AWS!

What if there was a way you could keep all the metrics data you need while saving on storage costs?This is now ...

New in Observability Cloud - Explicit Bucket Histograms

Splunk introduces native support for histograms as a metric data type within Observability Cloud with Explicit ...