Splunk Search

How to match a list of URL strings from a CSV file against indexed data if there is no extracted URL field in my events?

ejwade
Contributor

I am trying to match a long list (2,000 records) of malicious URL strings (e.g., hereisavirus.com), stored in a CSV file, against my events. One caveat: I do not have a URL field in my events, so I cannot use inputlookup and match directly against an extracted field.

Is there a simple way to search the whole event in Splunk using a CSV file?

Thank you.

1 Solution

sundareshr
Legend

You could extract the URL into a field and then use lookup (or inputlookup) to compare. Here is a very generic way to extract the URL into a field:

your base search | rex field=_raw "(?<URL>https?:\/\/(?:www\.|(?!www))[^\s\.]+\.[^\s]{2,}|www\.[^\s]+\.[^\s]{2,})" | lookup viruslist.csv URL AS URL OUTPUT someotherfield

This is not guaranteed to catch all URL patterns. I would need to see sample events to improve the probability of a match.
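
To get a feel for what this pattern does and does not catch before running it in Splunk, it can be tried against a few strings outside Splunk. The following is a minimal Python sketch, not part of the original answer. It assumes Python's `re` module behaves like Splunk's PCRE for this pattern, uses the Python spelling `(?P<URL>...)` for the named group, and runs on made-up sample events; `extract_url` and `SAMPLES` are hypothetical names.

```python
import re

# Same pattern as the rex above; Python spells the named group (?P<URL>...).
URL_PATTERN = re.compile(
    r"(?P<URL>https?:\/\/(?:www\.|(?!www))[^\s\.]+\.[^\s]{2,}"
    r"|www\.[^\s]+\.[^\s]{2,})"
)


def extract_url(raw_event):
    """Return the first URL-like substring in raw_event, or None."""
    match = URL_PATTERN.search(raw_event)
    return match.group("URL") if match else None


# Hypothetical sample events, not taken from the original poster's data.
SAMPLES = [
    "GET http://hereisavirus.com/index.html 200",
    "visited www.example.org today",
    "dns query hereisavirus.com",
]

if __name__ == "__main__":
    for event in SAMPLES:
        print(repr(event), "->", extract_url(event))
```

The third sample yields no match because the pattern requires either an http(s):// scheme or a www. prefix. If the CSV holds bare domains such as hereisavirus.com, the extracted value (scheme, host, and path) would likely need trimming to the hostname before an exact lookup match can succeed.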

ejwade
Contributor

Thank you, sundareshr.

So, I created a custom field extraction using the wizard:

^[^/\n]*/\d+\s+\d+\s+\w+\s+(?P<destination_url>[^ ]+)

When I run my base search, the field shows up.

I can also list my lookup table with the following command:

| inputlookup CCIC_URL.csv | rename Bad_URLs as destination_url | fields + destination_url

However, when I put them together using this search string:

base search | [| inputlookup CCIC_URL.csv | rename Bad_URLs as destination_url | fields + destination_url] | table _time, destination_url

I get the following error:

Regex: invalid UTF-8 string

The search job has failed due to an error.

Any thoughts on this issue?

ejwade
Contributor

Never mind, I figured it out. My data had characters that weren't translating correctly when inputlookup looked for literals.
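
For anyone who hits the same "invalid UTF-8 string" error, one way to locate the offending rows is to scan the lookup file outside Splunk. The following is a minimal Python sketch under the assumption that the problem characters live in the CSV itself (the post does not say whether they were in the lookup or in the events). The function name `find_suspect_lines` and the script name `check_lookup.py` are hypothetical.

```python
import sys


def find_suspect_lines(raw):
    """Return (line_number, reason, line_bytes) for lines that are not plain ASCII.

    raw is the whole file as bytes, so invalid UTF-8 can be detected
    instead of being silently replaced during decoding.
    """
    suspects = []
    for lineno, line in enumerate(raw.splitlines(), start=1):
        try:
            text = line.decode("utf-8")
        except UnicodeDecodeError:
            # Bytes that cannot be decoded as UTF-8 at all.
            suspects.append((lineno, "invalid UTF-8", line))
            continue
        if not text.isascii():
            # Valid UTF-8, but contains characters outside plain ASCII.
            suspects.append((lineno, "non-ASCII", line))
    return suspects


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage (hypothetical script name): python check_lookup.py CCIC_URL.csv
    with open(sys.argv[1], "rb") as handle:
        for lineno, reason, line in find_suspect_lines(handle.read()):
            print(lineno, reason, line)
```

Line numbers count physical lines, with the header row as line 1. Rows flagged as "invalid UTF-8" are the likeliest cause of the error; rows flagged as "non-ASCII" are valid but may still be worth reviewing in a list of URL strings.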
