Against my events, I am trying to match a long list (2000 records) of malicious URL strings (e.g., hereisavirus.com) stored in a CSV file. One caveat - I do not have a "field" for URL in my events, so I am not able to use inputlookup and cross directly with a generated field.
Is there a simple way to search the whole event in Splunk using a CSV file?
Thank you.
You could extract the URL into a field and then use lookup (or inputlookup) to compare. Here is a very generic way to extract the URL into a field:
your base search | rex field=_raw "(?<URL>https?:\/\/(?:www\.|(?!www))[^\s\.]+\.[^\s]{2,}|www\.[^\s]+\.[^\s]{2,})" | lookup viruslist.csv URL AS URL OUTPUT someotherfield
This is not guaranteed to catch all URL patterns. I would need to see sample events to improve the probability of a match.
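If extracting a field turns out to be unreliable, another option is to let a subsearch hand the CSV values back as raw search terms, so they are matched against the whole event. The sketch below assumes the lookup file is viruslist.csv with a column named URL, as in the search above. It relies on Splunk's special handling of a subsearch field named query (or search), whose values are inserted into the outer search without a field name and ORed together. Values containing spaces or special characters may need quoting, and subsearch result limits apply, so treat this as a starting point rather than a tested solution.

```
your base search
    [| inputlookup viruslist.csv
     | rename URL AS query
     | fields query ]
```

Because the terms are matched against _raw, this does not depend on any URL field existing in the events.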
Thank you, sundareshr.
I had created a custom field extraction using the wizard:
^[^/\n]*/\d+\s+\d+\s+\w+\s+(?P<destination_url>[^ ]+)
When I run my base search, the field shows up.
I can also list my lookup table with the following command:
| inputlookup CCIC_URL.csv | rename Bad_URLs as destination_url | fields + destination_url
However, when I put them together using this search string:
base search | [| inputlookup CCIC_URL.csv | rename Bad_URLs as destination_url | fields + destination_url] | table _time, destination_url
I get the following error:
Regex: invalid UTF-8 string
The search job has failed due to an error.
Any thoughts on this issue?
Never mind, I figured it out. My lookup data had characters that weren't translating correctly when inputlookup looks for them as literals.
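For reference, here is a sketch of how the combined search is usually written, with the subsearch placed directly after the base search terms rather than behind a pipe. It assumes CCIC_URL.csv has been cleaned so that it contains only valid UTF-8, and that destination_url is the field extracted from the events as described above.

```
base search
    [| inputlookup CCIC_URL.csv
     | rename Bad_URLs AS destination_url
     | fields + destination_url ]
| table _time, destination_url
```

The subsearch expands to a series of destination_url="..." conditions joined by OR, so only events whose extracted field exactly matches a lookup value are returned.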