Splunk Enterprise Security: Does anyone have an extraction regex example for the threat intelligence download manager?

panovattack
Communicator

Does anyone have an example of how to use the extraction regex in the threat intelligence download manager?

0 Karma

gijsvdwoerd
New Member

Hi Kchamplin/splunk community,

Great stuff explaining how to onboard third-party threat intelligence. I have a question, however: we set it up using the steps above, and the threat list is now visible in the threat artifact list, but it doesn't read the CSV file. When I search the _internal logs I see messages with status="no_checkpoint_data". Any ideas why I get this message?

0 Karma

gijsvdwoerd
New Member

Hi Kchamplin,

Thanks for your fast response. In the threat artifact list the count is 0 for the stanza that I added, and together with status="no_checkpoint_data" that led me to conclude that the CSV is not being read (correctly). Below is the stanza from inputs.conf:

[threatlist://mythreatlistname]
delim_regex = ,
description = mythreatlistname
disabled = 0
fields = discription:$7,ip:$24
ignore_regex = (^#|^\s*$)
interval = 300
retries = 3
retry_interval = 60
skip_header_lines = 0
timeout = 30
type = threatlist
url = lookup://ts_lookup_mythreatlistname
weight = 1
0 Karma

robert_miller
Path Finder

Hey @gijsvdwoerd, did your error go away after making the change suggested by kchamplin? I am seeing the same error, but only for threatlists that are local.

0 Karma

kchamplin_splun
Splunk Employee

Ah there it is - you need to set your "skip_header_lines" to 1, not 0 - if you're using a lookup.
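
To illustrate why the header matters, here is a rough Python sketch (not Splunk's actual code). It assumes the lookup reaches the parser as CSV text whose first row holds the column names; that is my reading of the advice above, not something confirmed in this thread. The function name and sample rows are made up for the example.

```python
def data_lines(lines, skip_header_lines=0):
    """Drop the first N lines, as a skip_header_lines setting would."""
    return lines[skip_header_lines:]


# Hypothetical CSV export of a lookup; the first row holds the column names.
lookup_export = [
    "description,ip",
    "known scanner,203.0.113.9",
    "botnet node,198.51.100.23",
]

with_header = data_lines(lookup_export, skip_header_lines=0)
without_header = data_lines(lookup_export, skip_header_lines=1)

print(with_header[0])     # the header row would be parsed as if it were data
print(without_header[0])  # the first real indicator row
```

With skip_header_lines = 0 the row of column names is handed to the field mapping like any other line, which is presumably not what you want for a lookup-backed input.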

0 Karma

kchamplin_splun
Splunk Employee

Hey @gijsvdwoerd
Can you explain a bit more when you say "it doesn't read the csv file"? If you share your inputs.conf stanza that contains your threat intel input, (the stanza looks like [threatlist://yourthreatlistname] ) I can see what might be happening. The checkpointing reference may be an artifact of the threat download mod-input, and could possibly be ignored if the actual IOCs are showing up in the proper threatlist.

0 Karma

kchamplin_splun
Splunk Employee

The Threat Intelligence Framework provides a modular input (Threat Intelligence Downloads) that handles most of the configuration typically needed for downloading intelligence files and data. To use this modular input, create a stanza in your inputs.conf file whose name starts with “threatlist://”.

A very basic example is as follows:

[threatlist://myfirstintel]
delim_regex =
description = My First Threat Tofu
disabled = true
extract_regex = ^(\S+)\t+(\S+)\t+\S+\t+\S+\t*(\S*)
fields = ip:"$1-$2",description:"$3"
ignore_regex = (^#|^\s*$|^Start)
type = threatlist
url = http://home.dishwishy.com/threatofu.txt

This is an example of “ip_intel” (IP address and domain) threat intelligence. Keep in mind that the source is not CSV, so we have to write an extraction regex. Each stanza key provides the following functionality:

url: the download location for the intelligence data. Note that both GET and POST are supported (see appendix).
type: this can be a custom name and is typically an arbitrary value. There are additional types that will be discussed later.
ignore_regex: a regular expression that matches lines to skip, for example commented lines that you do not want the Threat Framework to look at. In this case it ignores lines that start with the “#” character, blank lines, and lines that start with “Start” (the alternatives in the regex are separated by pipes).
fields: this maps the threat intelligence fields (specific to the area of intelligence) to the extracted values. The values prefixed with “$” reference the capture groups from the “extract_regex” key.
extract_regex: the regular expression used to extract the values that map to the fields the Threat Framework expects. Use it when the source file has no consistent delimiter. NOTE: In the sample above, the data is tab separated, but not consistently enough to use “delim_regex”.
disabled: enables or disables the input; this can also be set via the Splunk GUI. It’s best practice to ship it as disabled and let the end user enable it. Note: enabling it creates a new “local” directory with its own inputs.conf, in which the input shows as enabled.
description: a description of what the intel is and what it does (shown in the mod-input).
delim_regex: can be used in lieu of the “extract_regex” key if your intel is well formed and already delimited, such as a CSV file. The delimited columns are then extracted automatically to the “$”-prefixed values.
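
As a rough illustration of the delim_regex case, here is a small Python sketch of how delimited columns could line up with the “$” tokens. It only simulates the idea and is not how Splunk implements it. It assumes the default delimiter "," and the default mapping description:$1,ip:$2 from the defaults note that follows; the function name and sample row are hypothetical.

```python
import re

DELIM_REGEX = re.compile(r",")               # the documented default delimiter
FIELDS = {"description": "$1", "ip": "$2"}   # the documented default mapping


def split_line(line):
    """Split one line on the delimiter and expose the columns as $1, $2, ..."""
    columns = DELIM_REGEX.split(line)
    tokens = {f"${i}": value for i, value in enumerate(columns, start=1)}
    # A token with no matching column maps to an empty string in this sketch.
    return {name: tokens.get(token, "") for name, token in FIELDS.items()}


# Hypothetical CSV line, for illustration only.
row = split_line("known scanner,203.0.113.9")
print(row)
```

The point is only that column N of the split line becomes $N, which is why no capture groups are needed for cleanly delimited data.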

NOTE: the following keys have defaults in the “threatlist” stanza:
delim_regex - (defaults to ",") – NOTE: you need to
extract_regex - (no default)
fields - (defaults to "description:$1,ip:$2")
ignore_regex - (defaults to "(^#|^\s*$)")
skip_header_lines - (defaults to 0)

To understand how this is then parsed and ingested into Splunk ES, please see the following diagram. NOTE: This is not comma-separated (CSV) source data, so we rely on the “extract_regex” to create the capture groups, which are highlighted in red, blue and green.
[Diagram: sample intel lines with the capture groups highlighted in red, blue and green]

The capture groups denoted in red, blue and green are mapped in left to right order to the field mappings. In other words, $1 = red, $2 = blue, $3 = green. Those tokens are then used to map to the fields that exist in the Threat Intelligence Framework for both KVStore and CSV backed lookups.
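
The mapping from capture groups to “$” tokens can be tried out with a short Python sketch. This only simulates the behaviour described above using Python's re module; it is not the framework's own code. The regexes and the fields string are copied from the myfirstintel stanza, while the helper names and the sample lines are made up (they are not the real contents of threatofu.txt).

```python
import re

# Values copied from the [threatlist://myfirstintel] stanza above.
IGNORE_REGEX = re.compile(r"(^#|^\s*$|^Start)")
EXTRACT_REGEX = re.compile(r"^(\S+)\t+(\S+)\t+\S+\t+\S+\t*(\S*)")
FIELDS = 'ip:"$1-$2",description:"$3"'


def parse_fields(spec):
    """Turn 'ip:"$1-$2",description:"$3"' into {'ip': '$1-$2', ...}."""
    return dict(re.findall(r'(\w+):"([^"]*)"', spec))


def parse_line(line):
    """Return a field dict for one line, or None if it is ignored or unmatched."""
    if IGNORE_REGEX.search(line):
        return None
    match = EXTRACT_REGEX.search(line)
    if match is None:
        return None
    record = {}
    for name, template in parse_fields(FIELDS).items():
        # Replace each $N token with the text of capture group N.
        record[name] = re.sub(
            r"\$(\d+)", lambda m: match.group(int(m.group(1))), template
        )
    return record


# Hypothetical sample data: a comment, a header, a blank line, two data lines.
sample = [
    "# comment line",
    "Start\tEnd\tCount\tCountry\tNote",
    "",
    "192.0.2.1\t192.0.2.255\t42\tZZ\tscanner",
    "198.51.100.0\t198.51.100.7\t7\tZZ",
]
records = [r for r in map(parse_line, sample) if r is not None]
for r in records:
    print(r)
```

The first three sample lines are dropped by ignore_regex. In the last line the optional fifth column is missing, so $3 is empty and the description comes out as an empty string.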

shakedunay
New Member

Regarding the "fields" input:

  1. Where can I view the parsed description (in the above example "myfirstintel.todu") for a specific IP? I want to view it in a created "notable event" or some other place.
  2. Can I add more fields besides "ip" and "description"? Can I add custom fields? Where are those fields defined?
0 Karma

panovattack
Communicator

Thanks! If the intelligence I am trying to parse is not split across multiple lines, is there a way to insert new lines with a regex?

Each record comes in as {record},{record} in one long string... it seems that the Splunk threat intelligence capability wants a document with new lines.

0 Karma

kchamplin_splun
Splunk Employee

That's right, it expects line-oriented data. This might require you to take an intermediary step of writing the data to a lookup (KVStore or flat-file) first and then listing the input for the framework as a lookup.
Try having a read of this: https://splunk.box.com/s/vuyojuqsxra1c9s9g5uqa7cesymx8viz
That should maybe shed some light on options to deal with that case.
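
As one possible shape for that intermediary step, here is a Python sketch that turns a single long {record},{record} string into one line per record. It assumes each record is a JSON object, which the thread does not actually say, and the field names "ip" and "description" and the sample feed are made up. How you then load the result into a lookup is up to you.

```python
import json


def split_records(blob):
    """Yield each JSON object from a string shaped like '{...},{...}'."""
    decoder = json.JSONDecoder()
    pos = 0
    while pos < len(blob):
        # Skip the commas and whitespace that sit between records.
        if blob[pos] in ", \t\r\n":
            pos += 1
            continue
        # raw_decode parses one value and returns it with its end position.
        record, pos = decoder.raw_decode(blob, pos)
        yield record


# Hypothetical feed content; the field names are assumptions.
blob = (
    '{"ip": "203.0.113.9", "description": "scanner"},'
    '{"ip": "198.51.100.23", "description": "botnet"}'
)
lines = [f'{r["description"]},{r["ip"]}' for r in split_records(blob)]
print("\n".join(lines))
```

If a description can itself contain commas, writing the rows with Python's csv module would be the safer choice than joining with a plain comma.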

0 Karma