Splunk Search

Query to create a table to include fields from 2 different sourcetypes where values of field1 from sourcetype 1 is equal to field2 from sourcetype 2

dmenon84
Path Finder

I have 2 sourcetype sourcetype="pan:traffic" and sourcetype="pan:threat"

I want to write a splunk query to find events from sourcetype="pan:traffic" where session_end_reason="threat" and get the src_ip from there and then match it with dest_ip in sourcetype="pan:threat"

here are my logs -

sourcetype="pan:threat" - my dest_ip is 1.1.2.2 and my src_ip value is 46.30.46.196 and GrmBL2Lnhwx.dll is the filename of the executable that user requested for download

<13>Apr 12 08:17:02 UDC-5060.gadm : 1,2016/04/12 08:17:01,0008C100946,THREAT,file,1,2016/04/12 08:17:01,46.30.46.196,1.1.2.2,0.0.0.0,0.0.0.0,URL_Global_Unknown_Continue-EXE,,,flash,vsys1,Untrust,Trust,ethernet1/24,ethernet1/23,Panorama-Logging,2016/04/12 08:17:01,34250303,1,80,65077,0,0,0x0,tcp,forward,"GrmBL2Lnhwx.dll",Microsoft PE File(52060),any,low,server-to-client,18489594034,0x0,RU,US,0,,0,,,3,,,,,,,,0

Now our firewall is configured such that when a user is initiating an executable download it will bring up a splash screen asking user if they are sure file is safe and want to continue. If they hit continue then session_end_reason has value of "threat" in sourcetype="pan:traffic" logs.

Log from sourcetype="pan:traffic" and now my src_ip is 1.1.2.2 and dest_ip is 46.30.46.196 so the values got swapped

<14>Apr 12 08:18:29 UDC-5060.gadm : 1,2016/04/12 08:18:29,0008C100946,TRAFFIC,end,1,2016/04/12 08:18:29,1.1.2.2,46.30.46.196,0.0.0.0,0.0.0.0,URL_Global_Unknown_Continue-EXE,,,flash,vsys1,Trust,Untrust,ethernet1/23,ethernet1/24,Panorama-Logging,2016/04/12 08:18:29,34250303,1,65077,80,0,0,0xf,tcp,allow,85026,2496,82530,84,2016/04/12 08:12:11,96,business-and-economy,0,128310938069,0x0,US,RU,0,24,60,threat

i want to find out the IP address of users who were prompted with a splash screen and decided to click continue . Here is the query I have written it is returning results but they are not exactly accurate.

sourcetype="pan:traffic" session_end_reason="threat" | join src_ip [search sourcetype="pan:threat" | rename dest_ip as src_ip ] | table _time,dest_ip,dest_location,src_ip,src_location,filename

Appreciate your help !

0 Karma
1 Solution

martin_mueller
SplunkTrust
SplunkTrust

The search should look something like this:

  (sourcetype="pan:traffic" session_end_reason="threat") OR sourcetype="pan:threat"
| eval match_ip = if(sourcetype=="pan:traffic",src_ip,dest_ip)
| eval match_location = if(sourcetype=="pan:traffic",src_location,dest_location)
| eval other_ip = if(sourcetype=="pan:traffic",dest_ip,src_ip)
| eval other_location = if(sourcetype=="pan:traffic",dest_location,src_location)
| stats min(_time) as _time list(match_location) as match_location list(other_ip) as other_ip list(other_location) as other_location list(filename) list(sourcetype) as sourcetype by match_ip
| search sourcetype="pan:traffic" sourcetype="pan:threat"

That last bit is key, throw out things from only one sourcetype - non-matches.

View solution in original post

martin_mueller
SplunkTrust
SplunkTrust

The search should look something like this:

  (sourcetype="pan:traffic" session_end_reason="threat") OR sourcetype="pan:threat"
| eval match_ip = if(sourcetype=="pan:traffic",src_ip,dest_ip)
| eval match_location = if(sourcetype=="pan:traffic",src_location,dest_location)
| eval other_ip = if(sourcetype=="pan:traffic",dest_ip,src_ip)
| eval other_location = if(sourcetype=="pan:traffic",dest_location,src_location)
| stats min(_time) as _time list(match_location) as match_location list(other_ip) as other_ip list(other_location) as other_location list(filename) list(sourcetype) as sourcetype by match_ip
| search sourcetype="pan:traffic" sourcetype="pan:threat"

That last bit is key, throw out things from only one sourcetype - non-matches.

lguinn2
Legend

This might be better. It will certainly be faster

(sourcetype="pan:traffic" session_end_reason="threat") OR sourcetype="pan:threat"
| eval match_ip = if(sourcetype=="pan:traffic",src_ip,dest_ip)
| stats list(dest_ip) list(dest_location) list(src_ip) list(src_location) list(filename) by match_ip _time sourcetype
0 Karma

martin_mueller
SplunkTrust
SplunkTrust

It feels to me as if you should drop _time sourcetype from the list of group by fields, else no matches are ever going to be grouped together.

0 Karma

lguinn2
Legend

excellent point, because the chances of multiple events occurring at exactly the same is very low

0 Karma

dmenon84
Path Finder

Hi,

This search giving me lots of false positives, for example sourcetype="pan:traffic" session_end_reason="threat" returns only 15 events from say 8:00 AM - 9:00 AM today but the entire search returns 1000+ events including the case where session_end_reason!="threat"

0 Karma
Get Updates on the Splunk Community!

Monitoring Postgres with OpenTelemetry

Behind every business-critical application, you’ll find databases. These behind-the-scenes stores power ...

Mastering Synthetic Browser Testing: Pro Tips to Keep Your Web App Running Smoothly

To start, if you're new to synthetic monitoring, I recommend exploring this synthetic monitoring overview. In ...

Splunk Edge Processor | Popular Use Cases to Get Started with Edge Processor

Splunk Edge Processor offers more efficient, flexible data transformation – helping you reduce noise, control ...