We have a Tor threat intelligence feed that we need to add to Splunk Enterprise.
The intelligence feed is from dan.me.uk/tornodes
The page is HTML, with the node list starting after a _BEGIN_TOR_NODE_LIST tag.
Do Splunk Enterprise threat intelligence downloads support HTML input?
TOR Node List
This page contains a full TOR nodelist (updated at Mon Jun 17 19:31:39 BST 2019) in the format below.
There are tags of BEGIN_TOR_NODE_LIST and END_TOR_NODE_LIST for easy scripting use of this page.
You can also fetch https://www.dan.me.uk/torlist/ (FULL) or https://www.dan.me.uk/torlist/?exit (EXIT only) for a list of ips only, one per line - updated every 30 minutes. Ideal for constructing your own tor banlists.
NOTE: This is a FULL list including more than just exit nodes. If you only wish to block exit nodes you NEED to process the list to include only flags E and/or X!
You WILL upset people if you block the full list as many nodes do not permit exit.
|||||||
Total number of nodes is: 7680
192.2.1.200|hidden|9001|0|RV|64883|Tor 0.3.5.8|
192.0.1.168|hidden2|80|0|EFRDV|195782|Tor 0.3.5.7|description goes here
Internal IPs given for privacy reasons.
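For anyone who wants to check the data outside Splunk first, here is a minimal Python sketch of the processing the page's note describes: take the records after the BEGIN_TOR_NODE_LIST tag, split them on '|', and keep only nodes flagged E or X. The field names are borrowed from the regex in this post, and the exact marker layout and the trailing line-break tag in the sample are assumptions, so treat it as a rough illustration rather than a finished parser.

```python
import ipaddress
import re

# Field names are assumptions taken from the regex in this post.
FIELD_NAMES = ["ip", "name", "directoryPort", "routerPort",
               "flags", "uptime", "version", "contactInfo"]


def is_ip(text):
    """True if text parses as an IP address."""
    try:
        ipaddress.ip_address(text)
        return True
    except ValueError:
        return False


def parse_node_list(page_text):
    """Return one dict per node record found after the BEGIN tag."""
    nodes = []
    inside = False
    for raw in page_text.splitlines():
        if not inside:
            if "BEGIN_TOR_NODE_LIST" in raw:
                inside = True
            continue
        if "END_TOR_NODE_LIST" in raw:
            break
        # Drop a trailing HTML line break such as <br>, <br/> or <br />.
        line = re.sub(r"<br\s*/?>\s*$", "", raw.strip())
        parts = line.split("|")
        # Skip anything that is not an 8-field record starting with an IP.
        if len(parts) != len(FIELD_NAMES) or not is_ip(parts[0]):
            continue
        nodes.append(dict(zip(FIELD_NAMES, parts)))
    return nodes


def exit_nodes(nodes):
    """Keep only nodes whose flags contain E or X, as the page's note advises."""
    return [n for n in nodes if set(n["flags"]) & {"E", "X"}]


# Hypothetical page excerpt; the comment-style markers are an assumption.
sample = """<html><body>
<!-- __BEGIN_TOR_NODE_LIST__ //-->
192.2.1.200|hidden|9001|0|RV|64883|Tor 0.3.5.8|<br />
192.0.1.168|hidden2|80|0|EFRDV|195782|Tor 0.3.5.7|description goes here<br />
<!-- __END_TOR_NODE_LIST__ //-->
</body></html>"""

nodes = parse_node_list(sample)
print([n["ip"] for n in exit_nodes(nodes)])  # ['192.0.1.168']
```

Only the second sample record carries the E flag, so it is the only one kept.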
Trying to use a regular expression to extract the fields fails:
(?<ip>^\d{1,3}.\d{1,3}\.\d{1,3}.\d{1,3})\|(?<name>\w+)\|(?<directoryPort>\d+)\|(?<routerPort>\d+)\|(?<flags>\w+)\|(?<uptime>\d+)\|(?<version>\w+\s+\S+)\|(?<contactInfo>[a-zA-Z&]\w+.*)?\<br\s+\/\>
I've also tried specifying the number of lines to skip on the page and changing the delimiter, but the threat management log still reports a parsing failure.
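One way to narrow this down is to test the pattern offline against a sample record. The sketch below is the regex from this post with only the group syntax changed to Python's (?P<name>...) form. Python's re module is not the engine Splunk uses, and the three line endings are hypothetical variants of how a record might reach the parser, so this only approximates what happens inside the threat download.

```python
import re

# The pattern from this post, changed only to Python's (?P<name>...) syntax.
PATTERN = re.compile(
    r"(?P<ip>^\d{1,3}.\d{1,3}\.\d{1,3}.\d{1,3})\|(?P<name>\w+)\|"
    r"(?P<directoryPort>\d+)\|(?P<routerPort>\d+)\|(?P<flags>\w+)\|"
    r"(?P<uptime>\d+)\|(?P<version>\w+\s+\S+)\|"
    r"(?P<contactInfo>[a-zA-Z&]\w+.*)?\<br\s+\/\>"
)


def check(line):
    """Return the named groups if the line matches, otherwise None."""
    match = PATTERN.search(line)
    return match.groupdict() if match else None


record = "192.0.1.168|hidden2|80|0|EFRDV|195782|Tor 0.3.5.7|description goes here"

# Hypothetical variants of how the same record might reach the parser.
variants = {
    "with <br />": record + "<br />",
    "with <br>": record + "<br>",
    "no HTML suffix": record,
}

for label, line in variants.items():
    print(label, "->", "match" if check(line) else "no match")
```

In this test the pattern only matches when the record ends in a line-break tag containing whitespace, because the pattern requires `<br\s+/>` at the end. If the lines are split or stripped before the regex is applied, that alone would explain a failure to match.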
Hi @splunkmachine,
Yes, you can!
Here's the guide on how to add a webpage as a threat intel source for ES:
https://docs.splunk.com/Documentation/ES/5.3.0/Admin/Downloadthreatfeed#Add_a_URL-based_threat_sourc...
Let me know if you get stuck anywhere while walking through it.
Cheers,
David
Hi David
I've added quite a few URL-based intelligence feeds, which are typically web pages of IPs. However, as in my original post, I'm stuck because I get parsing errors.
I've followed the instructions.
I've tried the following to extract the fields.
(?<ip>^\d{1,3}.\d{1,3}\.\d{1,3}.\d{1,3})\|(?<name>\w+)\|(?<directoryPort>\d+)\|(?<routerPort>\d+)\|(?<flags>\w+)\|(?<uptime>\d+)\|(?<version>\w+\s+\S+)\|(?<contactInfo>[a-zA-Z&]\w+.*)?\<br\s+\/\>
And listed the fields
I've tried using regular expressions to extract the fields, and I've also tried using a separator.
The download feed consists of 8 fields separated by the '|' symbol, starting at line 155 of the web page.
The web page consists of HTML, and each line of eight fields ends with the following HTML:
'<br />'
The fields are:
ip|name|directoryPort|routerPort|flags|uptime|version|contactInfo
The eighth field is optional.
I've tested listing the fields in the documented notation:
<fieldname>:$<number>,<field name>.$<number>
ip:$1,description:domain_blocklist
Checking the threat management log, I see a parsing failure.
What about this as a regex:
(?<ip>^\d{1,3}.\d{1,3}\.\d{1,3}.\d{1,3})\|(?<name>\w+)\|(?<directoryPort>\d+)\|(?<routerPort>\d+)\|(?<flags>\w+)\|(?<uptime>\d+)\|(?<version>\w+\s+\S+)\|(?<description>[a-zA-Z&]\w.+)
Is it giving you anything?
Hi David
I tried your suggestion above, which I had also tried originally, and still got parsing errors.
I went back to my original regex:
(?<ip>^\d{1,3}.\d{1,3}\.\d{1,3}.\d{1,3})\|(?<name>\w+)\|(?<directoryPort>\d+)\|(?<routerPort>\d+)\|(?<flags>\w+)\|(?<uptime>\d+)\|(?<version>\w+\s+\S+)\|(?<contactInfo>[a-zA-Z&]\w+.*)?\<br\s+\/\>
and this time removed the delimiter, setting the fields to:
ip:$1, description:"DAN_TOR-$3-$4-$5"
This worked!
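To show what that fields setting would presumably produce, here is a rough Python emulation. It assumes each $N placeholder refers to the Nth extracted field, counting from 1. The apply_fields helper is hypothetical and is not Splunk's implementation. It splits the setting naively on commas, so it would not cope with commas inside a quoted template.

```python
import re


def apply_fields(spec, values):
    """Expand $N placeholders (1-based) in a 'name:template' field list."""
    result = {}
    for item in spec.split(","):
        name, _, template = item.strip().partition(":")
        template = template.strip().strip('"')
        result[name.strip()] = re.sub(
            r"\$(\d+)",
            lambda m: values[int(m.group(1)) - 1],
            template,
        )
    return result


# First sample record from this thread, split into its eight fields.
line = "192.2.1.200|hidden|9001|0|RV|64883|Tor 0.3.5.8|"
values = line.split("|")
spec = 'ip:$1, description:"DAN_TOR-$3-$4-$5"'

print(apply_fields(spec, values))
# {'ip': '192.2.1.200', 'description': 'DAN_TOR-9001-0-RV'}
```

Under that assumption, the description ends up carrying the two port fields and the flags, which keeps the exit flag visible for later filtering.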
Is there any chance you could post the full configurations for this?