Does Enterprise Security Threat Intelligence download feeds support normal web page input

splunkmachine — Wed, 30 Sep 2020 00:59:51 GMT

We have a Tor threat intelligence feed that we need to add to Splunk Enterprise.

The intelligence feed is from https://www.dan.me.uk/tornodes

The page is HTML, and the node list follows a starting tag, _BEGIN_TOR_NODE_LIST.

Do the Splunk Enterprise Security Threat Intelligence download feeds support HTML input?

TOR Node List

This page contains a full TOR nodelist (updated at Mon Jun 17 19:31:39 BST 2019) in the format below.
There are tags of BEGIN_TOR_NODE_LIST and END_TOR_NODE_LIST for easy scripting use of this page.

You can also fetch https://www.dan.me.uk/torlist/ (FULL) or https://www.dan.me.uk/torlist/?exit (EXIT only) for a list of ips only, one per line - updated every 30 minutes. Ideal for constructing your own tor banlists.

NOTE: This is a FULL list including more than just exit nodes. If you only wish to block exit nodes you NEED to process the list to include only flags E and/or X!
You WILL upset people if you block the full list as many nodes do not permit exit.

<ip>|<name>|<directoryPort>|<routerPort>|<flags>|<uptime>|<version>|<contactInfo>
Total number of nodes is: 7680

192.2.1.200|hidden|9001|0|RV|64883|Tor 0.3.5.8|
192.0.1.168|hidden2|80|0|EFRDV|195782|Tor 0.3.5.7|description goes here

Internal IPs are shown for privacy reasons.
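
The page's note about keeping only exit nodes (flags E and/or X) can be sketched in a few lines of Python. This is a minimal illustration and not part of the feed or of Splunk: the function name `exit_nodes` is hypothetical, and it assumes the pipe-delimited layout of the two sample rows above, with the flags in the fifth column.

```python
def exit_nodes(lines):
    """Return the IPs of node rows whose flags column contains E or X."""
    ips = []
    for line in lines:
        parts = line.strip().split("|")
        # Skip anything that is not a full node row (HTML, blank lines, counters).
        if len(parts) < 7:
            continue
        ip, flags = parts[0], parts[4]
        if "E" in flags or "X" in flags:
            ips.append(ip)
    return ips


sample = [
    "Total number of nodes is: 7680",
    "192.2.1.200|hidden|9001|0|RV|64883|Tor 0.3.5.8|",
    "192.0.1.168|hidden2|80|0|EFRDV|195782|Tor 0.3.5.7|description goes here",
]
print(exit_nodes(sample))
```

With the two sample rows, only the second one carries the E flag, so only its IP is kept.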

Trying to extract the fields with a regular expression fails:

(?<ip>^\d{1,3}.\d{1,3}\.\d{1,3}.\d{1,3})\|(?<name>\w+)\|(?<directoryPort>\d+)\|(?<routerPort>\d+)\|(?<flags>\w+)\|(?<uptime>\d+)\|(?<version>\w+\s+\S+)\|(?<contactInfo>[a-zA-Z&]\w+.*)?\<br\s+\/\>

I've tried specifying the number of lines to skip on the page and changing the delimiter, but the threat management log still reports a parsing failure.
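
One way to sanity-check a pattern like this outside Splunk is to run it against the sample rows in Python. The sketch below is an approximation, not what ES does internally: Python's `re` module needs `(?P<name>...)` instead of `(?<name>...)`, every dot in the IP is escaped here, and the sample rows are assumed to end in `<br />` because the pattern expects that tag.

```python
import re

# Python translation of the extraction pattern (assumption: rows end in "<br />").
NODE_RE = re.compile(
    r"^(?P<ip>\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})\|"
    r"(?P<name>\w+)\|"
    r"(?P<directoryPort>\d+)\|"
    r"(?P<routerPort>\d+)\|"
    r"(?P<flags>\w+)\|"
    r"(?P<uptime>\d+)\|"
    r"(?P<version>\w+\s+\S+)\|"
    r"(?P<contactInfo>[a-zA-Z&]\w+.*)?<br\s+/>"
)

rows = [
    "192.2.1.200|hidden|9001|0|RV|64883|Tor 0.3.5.8|<br />",
    "192.0.1.168|hidden2|80|0|EFRDV|195782|Tor 0.3.5.7|description goes here<br />",
    "192.2.1.200|hidden|9001|0|RV|64883|Tor 0.3.5.8|",  # no trailing tag
]

for row in rows:
    m = NODE_RE.match(row)
    # Print the named groups for matching rows, None otherwise.
    print(m.groupdict() if m else None)
```

The first two rows match, with `contactInfo` empty for the first. The third row does not match because the pattern requires the trailing `<br />`.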

Re: Does Enterprise Security Threat Intelligence download feeds support normal web page input

DavidHourani — Tue, 18 Jun 2019 08:43:21 GMT

Hi @splunkmachine,

Yes, you can!

Here's the guide on how to add a web page as a threat intel source for ES:
https://docs.splunk.com/Documentation/ES/5.3.0/Admin/Downloadthreatfeed#Add_a_URL-based_threat_source

Let me know if you're stuck somewhere when walking through it.

Cheers,
David

Re: Does Enterprise Security Threat Intelligence download feeds support normal web page input

splunkmachine — Tue, 18 Jun 2019 19:02:30 GMT

Hi David

I've added quite a few URL-based intelligence feeds, which are typically a web page of IPs. However, as in my original post, I'm stuck because I get parsing errors.

I've followed the instructions in the guide you linked:
https://docs.splunk.com/Documentation/ES/5.3.0/Admin/Downloadthreatfeed#Add_a_URL-based_threat_source

I've tried the following to extract the fields:
(?<ip>^\d{1,3}.\d{1,3}\.\d{1,3}.\d{1,3})\|(?<name>\w+)\|(?<directoryPort>\d+)\|(?<routerPort>\d+)\|(?<flags>\w+)\|(?<uptime>\d+)\|(?<version>\w+\s+\S+)\|(?<contactInfo>[a-zA-Z&]\w+.*)?\<br\s+\/\>

And listed the fields

I've tried using regular expressions to extract the fields, and I've also tried using a separator.
The download feed consists of eight fields separated by the '|' symbol, starting at line 155 of the web page.
The web page is HTML, and each line containing the fields ends with the following HTML:
'<br />'

The fields are:
<ip>|<name>|<directoryPort>|<routerPort>|<flags>|<uptime>|<version>|<contactInfo>

The eighth field is optional.

I've tested listing the fields in the documented notation:
<fieldname>:$<number>,<field name>.$<number>
ip:$1,description:domain_blocklist

When I check the threat management log, I see a parsing failure.

Re: Does Enterprise Security Threat Intelligence download feeds support normal web page input

DavidHourani — Tue, 18 Jun 2019 19:23:00 GMT

What about this as a regex:

  (?<ip>^\d{1,3}.\d{1,3}\.\d{1,3}.\d{1,3})\|(?<name>\w+)\|(?<directoryPort>\d+)\|(?<routerPort>\d+)\|(?<flags>\w+)\|(?<uptime>\d+)\|(?<version>\w+\s+\S+)\|(?<description>[a-zA-Z&]\w.+)

Does it give you anything?

Re: Does Enterprise Security Threat Intelligence download feeds support normal web page input

splunkmachine — Wed, 19 Jun 2019 20:48:12 GMT

Hi David

I tried your suggestion above, which I had also tried originally, and I still get parsing errors.

I went back to my original regex:
(?<ip>^\d{1,3}.\d{1,3}\.\d{1,3}.\d{1,3})\|(?<name>\w+)\|(?<directoryPort>\d+)\|(?<routerPort>\d+)\|(?<flags>\w+)\|(?<uptime>\d+)\|(?<version>\w+\s+\S+)\|(?<contactInfo>[a-zA-Z&]\w+.*)?\<br\s+\/\>

and removed the delimiter, this time setting the fields to:
ip:$1, description:"DAN_TOR-$3-$4-$5"

This worked!
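
For readers unfamiliar with the `$N` notation, the field setting above can be read as "take capture group N of the extraction regex". The Python sketch below only illustrates that reading and is not Splunk's parser: the helper `expand` is hypothetical, the pattern is a shortened, unnamed-group version of the regex, and the sample row is assumed to end in `<br />`.

```python
import re

# Shortened pattern with positional groups 1..7 (ip, name, two ports, flags, uptime, version).
ROW_RE = re.compile(
    r"^(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})\|(\w+)\|(\d+)\|(\d+)\|(\w+)\|(\d+)\|(\w+\s+\S+)\|"
)


def expand(template, match):
    """Replace each $N in template with capture group N of match."""
    return re.sub(r"\$(\d+)", lambda m: match.group(int(m.group(1))), template)


row = "192.2.1.200|hidden|9001|0|RV|64883|Tor 0.3.5.8|<br />"
m = ROW_RE.match(row)
record = {
    "ip": expand("$1", m),
    "description": expand("DAN_TOR-$3-$4-$5", m),
}
print(record)
```

For this sample row the description becomes `DAN_TOR-9001-0-RV`, that is, the literal prefix followed by the third, fourth, and fifth captured values.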

Re: Does Enterprise Security Threat Intelligence download feeds support normal web page input

gbeatty — Tue, 03 Sep 2019 20:40:15 GMT

Is there any chance you could post the full configurations for this?