Splunk Search

Using the results of one search to perform another (dynamic search results)?

rdunn
Engager

I'm relatively new to Splunk, so I'm pretty sure I'm going about this the wrong way but I have to think it's possible. Here goes...

Goal:
I'm working with two log sources: dhcpsrvlog and web_threat. dhcpsrvlog contains two pieces of data I'm interested in; dest_ip and dest_host. The web_threat logs however do not contain a hostname, just an IP. I want to take the dest_host information for specific IPs and cross reference it with the src_ip field within the web_threat logs. Essentially, I want to leverage the DHCP data to link current IPs for hosts of interest to specific web_threat logs.

Part 1:
I have successfully created the search below which displays the most recent DHCP lease for host(s) of interest using a predefined lookup table. I know it's probably not optimal, but I'm working with what I know:

sourcetype=dhcpsrvlog description=Assign OR description=Renew [|inputlookup hostwatchlist] | dedup dest_host sortby -_time | table time, dest_host, dest_ip, description | rename time AS "Time of Lease" dest_host AS "Hostname" dest_ip AS "Assigned IP" description AS "Type of Lease"

This properly displays the information I need for situational awareness -- when did the host(s) I'm interested last receive/renew an IP lease and what is the IP?

Part 2:
Now, what I'd like to do is take the IP from the Assigned IP column for each host in the resulting table and use that data to perform a secondary search for specific types of web traffic within my web_threat logs.

For example, lets say I'm monitoring a host that was involved in a recent malware investigation, it's hostname is host.network.local which I manually input into my lookup table. I have the most recent DHCP log from my search above and it says the host's IP is 1.2.3.4. Based on the observed malware let's assume I'm interested in outbound web activity to badsite[d]com from the affected host. In order to find any such traffic from this host I'd need to perform this search:

sourcetype=web_threat src_ip=1.2.3.4 dst_hostname=badsite.com

Question:
How can I take the results from my existing dhcpsrvlog search providing me with the Assigned IP and dynamically search the web_threat logs for this traffic as seen in the above search? I'd like the results to display the dest_host field from the DHCP logs if possible, but it's not a requirement. The primary goal is for the search to dynamically change if/when the host of interest is assigned a new IP (i.e. insert the new Assigned IP into the src_ip condition of the search).

I have considered using join in an attempt to relate the two logs using internal searches, but I haven't had any success (admittedly, join is still a bit confusing). I'm confident this can be done but I'm either missing something very simple, or what I am attempting requires tokens and/or data models, neither of which I've had time to play with.

Any assistance or recommendations are greatly appreciated!

1 Solution

musskopf
Builder

Hi Rdunn,

I believe you're in the right way... join would be the command here. Let's see if I can build a search command using the information you provide in your question. First let's see if I'm taking the right assumptions:

The search below will list the DHCP events, containing thedest_ip(s)** used by the Host(s) you're investigating:**

sourcetype=dhcpsrvlog description=Assign OR description=Renew [|inputlookup hostwatchlist] | dedup dest_host sortby -_time

Now to join it with the web_threat events you need first to have the a field name in common. Apparently the joining key here is the Host's IP, in this case dest_ip from DHCP log and src_ip from Webthreat logs.

Here how you could join both:

sourcetype=dhcpsrvlog description=Assign OR description=Renew [|inputlookup hostwatchlist] | dedup dest_host sortby -_time
| join type=inner max=0 dest_ip
[
search sourcetype=web_threat dst_hostname=badsite.com | rename src_ip AS dest_ip
]
| table _time, dest_host, dest_ip, dst_hostname

*Note that I renamed the field src_ip.

Just to give you a bit of explanation, the join command needs two things to match events between the searches: same field name and same field content - both are CASE-SENSITIVE! The type=right tells to list only values that have matches... you could use type=left if you wish to list events from DHCP logs independent if they have or don't have matching events from the join operation. The max=0 tells to match all events, the default is 1. Also the join command by default overwrite the fields, in other word, the search inside the join will return, for example _time field and this will be used at the output, overwriting the _time from the first search.

As a suggestion you might want have the dst_hostname as another lookup to make you like easier instead to add it directly to the search.

Anyway, hope I was able to help.

Cheers

View solution in original post

gkanapathy
Splunk Employee
Splunk Employee

join kind of works, but is much less efficient than using a subsearch with a return in most cases. You are already using a subsearch in one of your searches. It's okay to have a subsearch within a subsearch. Something that should work for you would be:

sourcetype=web_threat dst_hostname="badsite.com" [ sourcetype=dhcpsrvlog description=Assign OR description=Renew [ inputlookup hostwatchlist] | dedup dest_host | return 10000 src_ip=dest_ip ]

return 10000 says to query for up to 10000 results from the subsearch (this is the max), and src_ip=dest_ip says to use the value of dest_ip as src_ip in the outer search.

This does lose you the dest_host, but you could recover that by adding a | lookup hostwatchlist dest_ip AS src_ip OUTPUT dest_host AS src_host (which also renames dest_host to src_host, which makes more sense in this context. So:

sourcetype=web_threat 
dst_hostname="badsite.com"
[ sourcetype=dhcpsrvlog
  description=Assign OR description=Renew 
  [ inputlookup hostwatchlist]
  | dedup dest_host
  | return 10000 src_ip=dest_ip ]
| lookup hostwatchlist dest_ip AS src_ip OUTPUT dest_host AS src_host

However, this will lose you the "type of lease", and you can't get that back as easily. If you need it, then a join will do it, or you should do it in two steps by writing the first results to a lookup.

musskopf
Builder

Hi Rdunn,

I believe you're in the right way... join would be the command here. Let's see if I can build a search command using the information you provide in your question. First let's see if I'm taking the right assumptions:

The search below will list the DHCP events, containing thedest_ip(s)** used by the Host(s) you're investigating:**

sourcetype=dhcpsrvlog description=Assign OR description=Renew [|inputlookup hostwatchlist] | dedup dest_host sortby -_time

Now to join it with the web_threat events you need first to have the a field name in common. Apparently the joining key here is the Host's IP, in this case dest_ip from DHCP log and src_ip from Webthreat logs.

Here how you could join both:

sourcetype=dhcpsrvlog description=Assign OR description=Renew [|inputlookup hostwatchlist] | dedup dest_host sortby -_time
| join type=inner max=0 dest_ip
[
search sourcetype=web_threat dst_hostname=badsite.com | rename src_ip AS dest_ip
]
| table _time, dest_host, dest_ip, dst_hostname

*Note that I renamed the field src_ip.

Just to give you a bit of explanation, the join command needs two things to match events between the searches: same field name and same field content - both are CASE-SENSITIVE! The type=right tells to list only values that have matches... you could use type=left if you wish to list events from DHCP logs independent if they have or don't have matching events from the join operation. The max=0 tells to match all events, the default is 1. Also the join command by default overwrite the fields, in other word, the search inside the join will return, for example _time field and this will be used at the output, overwriting the _time from the first search.

As a suggestion you might want have the dst_hostname as another lookup to make you like easier instead to add it directly to the search.

Anyway, hope I was able to help.

Cheers

View solution in original post

rdunn
Engager

Couldn't comment on the accepted answer (apparently I need more points) so I'm commenting here.

Thanks for the assistance, musskopf! With a small amount of tweaking I was able to get exactly what I wanted. Here is the search I ended up with:

 sourcetype=dhcpsrvlog description=Assign OR description=Renew [|inputlookup HostWatchList] | dedup dest_host sortby -_time | join type=inner max=0 dest_ip [search sourcetype=pan_threat dst_hostname=badsite.com | rename src_ip AS dest_ip] | dedup dst_hostname sortby -_time | table _time, dest_host, dest_ip, dst_hostname

I added the dedup dst_hostname sortby -_time portion to the search you provided. This simply provides the most recent hit per destination as opposed to the entire list. I may decide it is necessary to see them all later, but for now I just want to see unique domains contacted. join worked like a champ for what I needed. Thanks again!

Register for .conf21 Now! Go Vegas or Go Virtual!

How will you .conf21? You decide! Go in-person in Las Vegas, 10/18-10/21, or go online with .conf21 Virtual, 10/19-10/20.