My employer recently stood up the Tenable connector to Splunk and are looking to take full advantage of it. My experience in Splunk is very limited but here are the main points.
Here is the search I've been tinkering with:
index=nessus plugin_family_type="active" severity="High" OR severity="Critical" state="Open" OR state="Reopened" | rex field=netbiosName "\\\(?[^.]+)" | eval unique_id=if(like(plugin_family,"%Windows%"),CombinedName+plugin_id, ip+plugin_id) | eval hostvalue=if(like(CombinedName, "*"),CombinedName, ip) | stats dc(unique_id) as totalvulns, dc(hostvalue) as totalhosts | eval totalvulns=totalvulns-1, vuln_per_host=totalvulns/totalhosts | fields totalvulns totalhosts vuln_per_host
Question 1 for you to think about; I am Teaching to Fish
1. How is your scans setup, frequency (active, passive, agent).?
2. What is the scan window you are looking to search and roll-up numbers?
3. What connector is used to get scan results in?
(a) Splunk Add-on for Tenable (per scan is brought in lacks some fields, scan name, 18 months since I used/tested)
(b) Tenable Add-on for Splunk (analysis data is brought in lacks some fields, 18 months since I used/tested)
(c) A variant of the first, some US Gov customers is using; also al fields would need to rollup clean depending on example.
If there is a chance you are to have an asset scanned more than once within the time window of the search to be run; you need to be able to find the latest scan per asset within the time window and only compute based on the latest scan.
I am going to show you how I do this with my connector; notice the field alias will be different than your example:
# Here is my code to extract variables for hostname and domainName, I used assetName to allow me to see the new names versus the existing fields so I can validate effectiveness of my grouping
I also make sure lowercase as stats need exact match in the by clause, I take nothing to chance.
I did use your example as the base search:
If you run STIG/Complaince scan in addition to Vuln scanning ; you need to consider them a separate bucket and base search should be further refined to exclude them as a separate scoring methodology search should be done to track them (If not an issue; forget I said it, don't need to overwhelm you
index=nessus plugin_family_type="active" severity="High" OR severity="Critical" state="Open" OR state="Reopened"
| rex field=dnsName "^(?<dns_short>[a-z\-0-9A-Z_]+)|\.(?<domain_short>[a-z\-0-9A-Z_]+)\."
| rex field=netbiosName "(?<netbdomain_short>[a-z\-0-9A-Z_]+)[\\\](?<netbname_short>[a-z\-0-9A-Z_]+)"
| eval assetName = lower( coalesce(netbname_short, dns_short))
| eval domainName = lower( coalesce(netbdomain_short, domain_short))
You mentioned fall back to IP; unless you have statics avoid all possible. Get all the IoT crap in DNS, use dhcp reservations, static ip's for that stuff; (it is going to skew your numbers. Consider ignoring it until you can figure out how best to track it. Also if you have some cloud stuff they are going to have ip dash notation named. ip-10-2-4-2 (10.2.4.2) example. to deal with IP if you must convert them to dash notation for a name versus putting an ip in the assetName field at least cloud names and bad scans dont results in potential for two entries for the same asset (if by fortunate to be static ip or very long lease times and scanned more than once)
you would have set assetName with fall-back to ip last resort
| eval ipHostname = "ip-" . (replace(ip, "\.", "-"))
| eval assetName = lower( coalesce(netbname_short, dns_short, ipHostname))
The the question of more than one scan comes in: Need a timestamp lastSeen, last_seen this is per asset/ per scan as unique
| eval dtVULN=strftime(lastSeen,"%Y-%m-%d")
| eventstats latest(dtVULN) as dtVULN by assetName
I then push through Stats to score (totals per host) I do this to table out a report to say get the top 100 vulnerable hosts?
What good is a total number of Vulns without some context and something actionable (place to start, What is your High Count, Low Count, How is your average compared to some threadhold?
| stats dedup_splitvals=true values(dnsName) as hostFQDN, values(ip) as ip, latest(ScanName) as ScanName, latest(lastSeen) as lastSeen sum(cvssV3BaseScore) as scoreCVSS, count(plugin_id) as TotalVulns, count(eval(LIKE(severityName,"Critical"))) as iCritical, count(eval(LIKE(severityName,"High"))) as iHigh, by assetName
| table hostFQDN, assetName, ip, ScanName, lastSeen, scoreCVSS, TotalVulns, iCritical, iHigh
| sort -scoreCVSS
As you requested; Totals for latest scan per asset totaled. If you don't have overlap scans then last seen and eventstats assetName, dtVULN (last_seen) is not necessary. Otherwise counting devices twice can occur.
| stats dedup_splitvals=true count(plugin_id) as totalVuln, dc(plugin_id) as distinctVulnTotal, dc(assetName) as hostTotal, sum(eval(LIKE(severityName,"Critical"))) as totalCritical, sum(eval(LIKE(severityName,"High"))) as totalHigh
| eval avgPerHost = ( totalVuln / hostTotal )
| table totalVulns, hostTotal, avgPerHost