
Field extraction help - gnmap and troubleshooting

kore
Explorer

Hi there,
Hoping someone can point me in the right direction.

I'm trying to parse greppable nmap (*.gnmap) outputs for the repeated ports fields.
I've seen a few attempts at this around; the best so far is for a live search:
http://splunk-base.splunk.com/answers/22979/line_breaker-for-nmap-output
So far, my attempts to convert the live search to a transform have been unsuccessful.

Sample gnmap output:

Host: 10.0.0.1 (host) Ports: 21/open|filtered/tcp//ftp///, 22/open/tcp//ssh//OpenSSH 5.9p1 Debian 5ubuntu1 (protocol 2.0)/, 23/closed/tcp//telnet///, 80/open/tcp//http//Apache httpd 2.2.22 ((Ubuntu))/, 10000/closed/tcp//snet-sensor-mgmt/// OS: Linux 2.6.32 - 3.2 Seq Index: 257 IP ID Seq: All zeros

From my transforms.conf:

[ports]
REGEX = [^ ]* Ports:\s\([0-9]\{1,5\}\/[^/]*\/[^/]*\/\/[^/]*\/\/[^/]*\([\/]\)\)
DELIMS = ","
REPEAT_MATCH = true
FORMAT = port::$1 status::$2 proto::$3 daemon::$4 desc::$5

(The regex works using sed in separating the ports fields)

After indexing, however, I see none of the fields, and I can't work out how to troubleshoot this further. Other fields, such as hostname and IP address, extract successfully with other transforms.

btool is not very informative in this context, and I see that, as of ~2 years ago, troubleshooting field extractions was a requested feature: http://splunk-base.splunk.com/answers/157/feature-request-troubleshootingdebugging-for-field-extract....

Is anyone able to give me a nudge or pointer towards troubleshooting?

Thanks for any help!


kore
Explorer

Okay, that was a headache, but satisfying nonetheless - in hindsight (as it always is), it was actually much more straightforward than the numerous avenues I looked into.

I was able to extract all services, ports, daemons and banners using the setup below.
I also found it useful to separate out subdomains. Unfortunately, the regexes will not work for all domains/subdomains, so YMMV.

/path/to/greppable/nmap/output.gnmap

inputs.conf

[monitor:///path/to/greppable/nmap/*.gnmap]
index = nmap
sourcetype = nmap
queue = parsingQueue
disabled = 0

props.conf

[nmap]
SHOULD_LINEMERGE = false
LINE_BREAKER = ([\r\n]+)
TRANSFORMS-nmap = NMAPsetnull,NMAPsetparsing
EXTRACT-ip = (?i)Host: (?P<ip>[^ ]+)
EXTRACT-hostname = (?i)^[^\(]*\((?P<hostname>[^\)]+)
EXTRACT-subdomain = (?i)\(.*?\.(?P<subdomain>\w+\.\w+\.\w+\.\w+)(?=\))
EXTRACT-domain = (?i)\..*?\.(?P<domain>\w+\.\w+\.\w+)(?=\))
REPORT-ports = ports
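
To sanity-check the search-time extractions outside Splunk, the EXTRACT patterns can be tried against the start of the sample line from the question. This is only a sketch in Python, assuming its `re` module behaves like Splunk's regex engine for these particular patterns. The `extract` helper and the second, five-label hostname are made up for illustration; the made-up host is there to exercise the subdomain and domain regexes.

```python
import re

# Patterns copied from the EXTRACT-* settings in props.conf above.
IP = re.compile(r"(?i)Host: (?P<ip>[^ ]+)")
HOSTNAME = re.compile(r"(?i)^[^\(]*\((?P<hostname>[^\)]+)")
SUBDOMAIN = re.compile(r"(?i)\(.*?\.(?P<subdomain>\w+\.\w+\.\w+\.\w+)(?=\))")
DOMAIN = re.compile(r"(?i)\..*?\.(?P<domain>\w+\.\w+\.\w+)(?=\))")


def extract(line):
    """Return the fields whose pattern matches; skip patterns that do not."""
    fields = {}
    for pattern in (IP, HOSTNAME, SUBDOMAIN, DOMAIN):
        match = pattern.search(line)
        if match:
            fields.update(match.groupdict())
    return fields


# Start of the sample line from the question.
sample = "Host: 10.0.0.1 (host) Ports: 21/open|filtered/tcp//ftp///"
# Hypothetical host with a five-label name, not taken from real scan data.
made_up = "Host: 10.0.0.2 (mail.corp.example.co.uk) Ports: 25/open/tcp//smtp///"

print(extract(sample))
print(extract(made_up))
```

With the short hostname only `ip` and `hostname` come back, which illustrates the caveat above: the subdomain and domain patterns expect a fixed number of dot-separated labels.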

transforms.conf

[NMAPsetnull]
REGEX = .
DEST_KEY = queue
FORMAT = nullQueue

[NMAPsetparsing]
REGEX = Ports:
DEST_KEY = queue
FORMAT = indexQueue

[ports]
REGEX = \s(?<port>\d+)/(?<state>[^/]+)/(?<proto>[^/]+)//(?<daemon>[^/]*)//(?<banner>[^/]*)/
DEFAULT_VALUE = null
MV_ADD = TRUE
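
The queue routing and the repeated port extraction can also be approximated outside Splunk. The Python below is a rough sketch under a few assumptions: `re` stands in for Splunk's regex engine (so the named groups are written `(?P<name>...)`), the `keep_event` and `extract_ports` helpers are hypothetical names, and the two non-port lines are invented to show the filtering. `finditer` plays the role of the repeated matching that `MV_ADD` provides.

```python
import re

# Same pattern as the [ports] stanza, with Python-style named groups.
PORTS = re.compile(
    r"\s(?P<port>\d+)/(?P<state>[^/]+)/(?P<proto>[^/]+)"
    r"//(?P<daemon>[^/]*)//(?P<banner>[^/]*)/"
)


def keep_event(line):
    """Mimic NMAPsetnull then NMAPsetparsing: only 'Ports:' lines are kept."""
    return re.search(r"Ports:", line) is not None


def extract_ports(line):
    """Return one dict per port entry, similar to a multivalue extraction."""
    return [m.groupdict() for m in PORTS.finditer(line)]


# Sample line from the question.
sample = (
    "Host: 10.0.0.1 (host) Ports: 21/open|filtered/tcp//ftp///, "
    "22/open/tcp//ssh//OpenSSH 5.9p1 Debian 5ubuntu1 (protocol 2.0)/, "
    "23/closed/tcp//telnet///, "
    "80/open/tcp//http//Apache httpd 2.2.22 ((Ubuntu))/, "
    "10000/closed/tcp//snet-sensor-mgmt/// "
    "OS: Linux 2.6.32 - 3.2 Seq Index: 257 IP ID Seq: All zeros"
)
# Illustrative non-port lines (assumed shape, not from the original post).
other_lines = ["# Nmap scan header placeholder", "Host: 10.0.0.1 (host) Status: Up"]

events = [line for line in [sample] + other_lines if keep_event(line)]
for entry in extract_ports(events[0]):
    print(entry["port"], entry["state"], entry["daemon"], entry["banner"] or "-")
```

The version numbers in the trailing `OS:` text are not picked up, because the pattern requires a slash directly after the digits.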

One specific point to take care with in greppable NMAP output: a very small number of the services that NMAP discovers are written to the greppable output with incorrect formatting, e.g.:

2049/open/tcp//nfs (nfs V2-4)/(nfs:100003*2-4)/2-4 (rpc #100003)/
111/open/tcp//rpcbind/N//

It's enough to skew your results.
The slashes are in the wrong place compared with all the other NMAP-discovered services.
The format should be:

2049/open/tcp//nfs (nfs V2-4)//(nfs:100003*2-4)/2-4 (rpc #100003)/
111/open/tcp//rpcbind//N/
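
A quick way to see the effect is to run both forms through the port regex. This is a sketch assuming Python's `re` behaves like Splunk's engine for this pattern; the `parses` helper is a hypothetical name, and a leading space is added to each entry because the pattern starts with `\s`.

```python
import re

# Port pattern from the [ports] stanza, with Python-style named groups.
PORTS = re.compile(
    r"\s(?P<port>\d+)/(?P<state>[^/]+)/(?P<proto>[^/]+)"
    r"//(?P<daemon>[^/]*)//(?P<banner>[^/]*)/"
)

# Entries as NMAP wrote them (from the examples above).
malformed = [
    "2049/open/tcp//nfs (nfs V2-4)/(nfs:100003*2-4)/2-4 (rpc #100003)/",
    "111/open/tcp//rpcbind/N//",
]
# The same entries in the expected layout (from the examples above).
corrected = [
    "2049/open/tcp//nfs (nfs V2-4)//(nfs:100003*2-4)/2-4 (rpc #100003)/",
    "111/open/tcp//rpcbind//N/",
]


def parses(entry):
    """True if the port regex finds a match in the entry."""
    return PORTS.search(" " + entry) is not None


for entry in malformed + corrected:
    print("match   " if parses(entry) else "no match", entry)
```

The malformed entries produce no match at all, so those ports would simply be missing from the extracted fields.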

If your network does have services that NMAP formats like this, you may wish to run a find/replace over your gnmap files, something like the following:

sed -i 's"2049\/open\/tcp\/\/nfs "2049\/open\/tcp\/\/nfs\//"g' *.gnmap
sed -i 's"111\/open\/tcp\/\/rpcbind\/N\/\/"111\/open\/tcp\/\/rpcbind\/\/N\/"g' *.gnmap
sed -i 's"111\/open\/tcp\/\/rpcbind\s"111\/open\/tcp\/\/rpcbind\/\/"g' *.gnmap

This corrects the three anomalous services I discovered on the networks I look at. You may find more (drop a note here!), or you may need to adjust those sed regexes to parse them properly.
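
Before rewriting files in place, it may help to dry-run the same substitutions on a few strings. The Python below is a minimal sketch that mirrors the three sed expressions above with `re.sub`; the `FIXES` table and `fix_line` helper are hypothetical names, and nothing here touches the gnmap files themselves.

```python
import re

# The three sed substitutions, as (pattern, replacement) pairs.
FIXES = [
    (r"2049/open/tcp//nfs ", "2049/open/tcp//nfs//"),
    (r"111/open/tcp//rpcbind/N//", "111/open/tcp//rpcbind//N/"),
    (r"111/open/tcp//rpcbind\s", "111/open/tcp//rpcbind//"),
]


def fix_line(line):
    """Apply every substitution in order and return the rewritten line."""
    for pattern, replacement in FIXES:
        line = re.sub(pattern, replacement, line)
    return line


# The two malformed entries quoted earlier in this post.
for entry in (
    "2049/open/tcp//nfs (nfs V2-4)/(nfs:100003*2-4)/2-4 (rpc #100003)/",
    "111/open/tcp//rpcbind/N//",
):
    print(fix_line(entry))
```

Note that the first substitution yields `nfs//(nfs V2-4)/...`, which is not identical to the corrected form shown earlier but still fits the port/state/proto//daemon//banner/ layout that the transform expects.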

