I created the search below to detect only the local IP intel that the user has added manually:
| tstats min(_time) as firstSeen max(_time) as lastSeen count from datamodel="Threat_Intelligence"."Threat_Activity" where Threat_Activity.threat_key=local_ip_intel by Threat_Activity.weight Threat_Activity.threat_match_value Threat_Activity.threat_match_field Threat_Activity.src Threat_Activity.dest Threat_Activity.orig_sourcetype Threat_Activity.threat_collection Threat_Activity.threat_collection_key
| rename Threat_Activity.* as *
| join type=left threat_match_value
    [| inputlookup local_ip_intel.csv
     | rename ip as threat_match_value description as desc
     | fields threat_match_value desc]
My goal is to show a description next to each local IP threat match, to make analysis easier and to record why the intel was added in the first place. The search runs, and results do show up where the sourcetypes match the local IP intel. However, it returns more than what is in the local IP list: it matches IPs that are not in the list at all, so no description can be joined to those rows.
PS: The subsearch does return all the correct IPs I added manually.
I would appreciate it if anyone could show me where I am going wrong.
Don't use joins in Splunk unless there is no alternative (there almost always is).
Your search is searching all the data. Your join only enriches that data with desc from the lookup; because it is a left join, rows with no match in the lookup are kept, just without a desc.
So, if you want only those IPs where threat_match_value is one from the lookup, you can either filter in the tstats with a where clause plus a subsearch:
| tstats... where [ | inputlookup local_ip_intel.csv | fields ip | rename ip as Threat_Activity.threat_match_value ]
or do the lookup and filter after the tstats:
| tstats ... | rename Threat_Activity.* as * | lookup local_ip_intel.csv ip as threat_match_value OUTPUT description as desc | where isnotnull(desc)
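To illustrate why the left join leaves the extra rows in place, here is a minimal self-contained sketch. It uses makeresults to fake two rows standing in for the tstats output and a one-row subsearch standing in for the lookup; the IP addresses and the desc text are made up purely for illustration and are not from your data.

```
| makeresults count=2
| streamstats count as n
| eval threat_match_value=if(n==1, "10.1.2.3", "10.2.3.4")
| join type=left threat_match_value
    [| makeresults
     | eval threat_match_value="10.1.2.3", desc="example entry"
     | fields threat_match_value desc]
| table threat_match_value desc
```

Both rows should come back, with desc filled in only for 10.1.2.3. Appending | where isnotnull(desc) to the end drops the unmatched row, which is the same filtering the lookup version above performs.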
Yes, that makes sense, and it had occurred to me while I was modifying the query; I ended up changing it to the last method you mentioned. However, I was wondering why, even after specifying the threat key, I have incidents triggering notables on IPs that are not in the specified list or threat key.
So, you are saying that there is a row of data in the results, where the threat_match_value field is an IP address that does NOT exist in the lookup, but there IS a desc field?
The join at the end of the search cannot add a description to the threat match because the matched IP does NOT exist in the lookup (threat key). Yet even though the IP is not in the local_ip_intel threat key, those results still show up in my query.
Hopefully it is clearer this time.
Can you clarify what fields are in your lookup? You refer to threat_key in your comment, but it seems like you have two fields, ip and description.
Please also post an example of the data.
Yes, the lookup actually has 3 fields: the ip, a short description, and the weight, just like a regular Splunk threat intel lookup list.
I'm sorry, but I cannot post sensitive data.
I am a little confused, as you say you are doing a join, but earlier you said you changed to use the second form of my previous post.
Anyway, let me give an example.
Say your lookup contains the 3 fields from your reply (I have assumed that the field you call "short description" is "description" from your original post),
and your first query, PRIOR to any lookup, gives you a table with two rows, where the field "threat_match_value" is 10.1.2.3 in the first row and 10.2.3.4 in the second. If you then do the following
| lookup local_ip_intel.csv ip as threat_match_value OUTPUT description as desc | where isnotnull(desc)
then you are saying that you still have two rows: the first contains a desc field corresponding to the lookup ip of 10.1.2.3, and the second row, 10.2.3.4, still remains even though 10.2.3.4 is NOT in the lookup.
Hello, the join and the subsearch filter are both valid ways to solve this issue. My main question is why IPs that are NOT specified in my threat_key (lookup) are being matched in the first place.
I feel this problem is more related to the Threat Activity data model POSSIBLY still holding old IP intel that is no longer in the lookup. I'm looking for ways to troubleshoot that specifically.
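One possible way to test that hypothesis, as a rough sketch only: list the matched values that have no entry in the lookup, together with the first and last time each one matched. It reuses only the fields and lookup names from the original search and assumes the lookup still has the ip and description columns. If lastSeen for those values is older than the date the IP was removed from the list, the rows are probably historical matches that are still inside the search time range rather than new matches.

```
| tstats min(_time) as firstSeen max(_time) as lastSeen count from datamodel="Threat_Intelligence"."Threat_Activity" where Threat_Activity.threat_key=local_ip_intel by Threat_Activity.threat_match_value Threat_Activity.threat_collection Threat_Activity.threat_collection_key
| rename Threat_Activity.* as *
| lookup local_ip_intel.csv ip as threat_match_value OUTPUT description as desc
| where isnull(desc)
| sort - lastSeen
| convert ctime(firstSeen) ctime(lastSeen)
```

Narrowing the time range to a window after the list was last edited should show whether any of these values are still being matched.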