Splunk Search

Need help with a search/subsearch not providing the expected results

pablobarquin
Explorer

Hello there! I need help with a search that is not providing the expected results. Let me share the details and background information:

This search provides the list of the Windows server's IPs found by a network discovery scan:

 

index=tenable sourcetype="tenable:sc:vuln" repository=DISCOVERY pluginID=11936 
| rex "(?i)Remote operating system : (?P<os>[\D\d]+(?=Confidence level))" 
| rex "(?i)Confidence level : (?P<os_confidencial_level>[\d]+)" 
| makemv delim="\n" os 
| search os=*windows*server* 
| table ip dnsName os os_confidencial_level 
| dedup ip dnsName os

 

It delivers a total of 28806 IPs.

This another search provides the list of the Windows server's IPs located in the CMDB:

 

index=snow_ci sourcetype=cmdb_ci_server SYS_CLASS_NAME="Windows Server" OPERATIONAL_STATUS!=Retired NOT IP_ADDRESS IN ("0.0.0.0", "255.255.255.255", "127.0.0.1", "169.254.*") earliest=-24h 
| regex IP_ADDRESS="^(([0-9]|[1-9][0-9]|1[0-9]{2}|2[0-4][0-9]|25[0-5])\.){3}([0-9]|[1-9][0-9]|1[0-9]{2}|2[0-4][0-9]|25[0-5])$" 
| dedup IP_ADDRESS 
| rename IP_ADDRESS as ip 
| table ip

 

I get a total of 22845 IPs.

This means that ideally the number of Windows servers in the shadow should be 28806 - 22845 = 5961

So I'm trying to get a similar value with this final search:

 

index=tenable repository=DISCOVERY sourcetype="tenable:sc:vuln" pluginID=11936 
| rex "(?i)Remote operating system : (?P<os>[\D\d]+(?=Confidence level))" 
| rex "(?i)Confidence level : (?P<os_confidencial_level>[\d]+)" 
| makemv delim="\n" os 
| search os=*windows*server* 
| search NOT 
    [ search index=snow_ci sourcetype=cmdb_ci_server SYS_CLASS_NAME="Windows Server" OPERATIONAL_STATUS!=Retired NOT IP_ADDRESS IN ("0.0.0.0", "255.255.255.255", "127.0.0.1", "169.254.*") earliest=-24h 
    | regex IP_ADDRESS="^(([0-9]|[1-9][0-9]|1[0-9]{2}|2[0-4][0-9]|25[0-5])\.){3}([0-9]|[1-9][0-9]|1[0-9]{2}|2[0-4][0-9]|25[0-5])$" 
    | dedup IP_ADDRESS 
    | rename IP_ADDRESS as ip 
    | fields ip ] 
| table ip dnsName os os_confidencial_level 
| dedup ip dnsName os

 

But unfortunately I'm not getting the expected results. I should get the IPs included in the first search but NOT in the second one, not sure why but I'm getting many results (21025) with IPs from the subsearch too.

While troubleshooting I have tried this: if at the end of the whole search we look for the IPs that are removed with the subsearch, if the subsearch is working fine, we should get 0 results, which is exactly what I get!

 

index=tenable repository=DISCOVERY sourcetype="tenable:sc:vuln" pluginID=11936 
| rex "(?i)Remote operating system : (?P<os>[\D\d]+(?=Confidence level))" 
| rex "(?i)Confidence level : (?P<os_confidencial_level>[\d]+)" 
| makemv delim="\n" os 
| search os=*windows*server* 
| search NOT 
    [ search index=snow_ci sourcetype=cmdb_ci_server SYS_CLASS_NAME="Windows Server" OPERATIONAL_STATUS!=Retired NOT IP_ADDRESS IN ("0.0.0.0", "255.255.255.255", "127.0.0.1", "169.254.*") earliest=-24h 
    | regex IP_ADDRESS="^(([0-9]|[1-9][0-9]|1[0-9]{2}|2[0-4][0-9]|25[0-5])\.){3}([0-9]|[1-9][0-9]|1[0-9]{2}|2[0-4][0-9]|25[0-5])$" 
    | dedup IP_ADDRESS 
    | rename IP_ADDRESS as ip 
    | fields ip ] 
| table ip dnsName os os_confidencial_level 
| dedup ip dnsName os 
| search 
    [ search index=snow_ci sourcetype=cmdb_ci_server SYS_CLASS_NAME="Windows Server" OPERATIONAL_STATUS!=Retired NOT IP_ADDRESS IN ("0.0.0.0", "255.255.255.255", "127.0.0.1", "169.254.*") earliest=-24h 
    | regex IP_ADDRESS="^(([0-9]|[1-9][0-9]|1[0-9]{2}|2[0-4][0-9]|25[0-5])\.){3}([0-9]|[1-9][0-9]|1[0-9]{2}|2[0-4][0-9]|25[0-5])$" 
    | dedup IP_ADDRESS 
    | rename IP_ADDRESS as ip 
    | fields ip ]

 

So what is the issue here?

This is driving me crazy so any help will be really appreciated.

Thanks!

Labels (1)
0 Karma
1 Solution

bowesmana
SplunkTrust
SplunkTrust

@pablobarquin 

Often a way to manage this type of dual data source search is to search both data sets, perform some kind of test on the data to set an indicator and then aggregate the two data sets on the common data and then filter on the indicator, so this could be

(index=tenable repository=DISCOVERY sourcetype="tenable:sc:vuln" pluginID=11936) OR (index=snow_ci sourcetype=cmdb_ci_server SYS_CLASS_NAME="Windows Server" OPERATIONAL_STATUS!=Retired NOT IP_ADDRESS IN ("0.0.0.0", "255.255.255.255", "127.0.0.1", "169.254.*") earliest=-24h)
| rex "(?i)Remote operating system : (?P<os>[\D\d]+(?=Confidence level))" 
| rex "(?i)Confidence level : (?P<os_confidencial_level>[\d]+)" 
| makemv delim="\n" os 
| search (index=tenable AND os=*windows*server*) OR index=snow_ci
| eval IP_ADDRESS=if(index="tenable", "__N/A__", IP_ADDRESS)
| regex IP_ADDRESS="^(([0-9]|[1-9][0-9]|1[0-9]{2}|2[0-4][0-9]|25[0-5])\.){3}([0-9]|[1-9][0-9]|1[0-9]{2}|2[0-4][0-9]|25[0-5])$|__N/A__" 
| eval ip=if(index="tenable", ip, IP_ADDRESS)
| stats dc(index) as indexCount values(index) as indexes by ip dnsName os os_confidencial_level 
| where indexCount=1

This would handle the regex case on IP_ADDRESS by allowing tenable data through with the __N/A__ tag and then use stats for the dedup and aggregation. What you are looking for is the count of unique indexCount=1, i.e. it is ONLY in tenable OR only in snow_ci.

You have the indexes values which will tell you which one it is - you can filter as needed.

Hope this helps.

View solution in original post

0 Karma

lekanneer
Loves-to-Learn Lots

This is why I developed a solution to overcome the problem of lookups and/or indexes and/or complex searches. My solution is using a Neo4j graph database in between the source of the information and Splunk. The good thing of the graph database is that it is capable of searching for relationships or the lack of relationships. So you can even combine several sources to give you the context that you need.

My solution is based on automatically getting in ALL CMDB CI's and relationships into the graph database which then can be search from Splunk. And for that I developed new search commands.

I wrote a post about that recently: SOCs: why they struggle with context 

0 Karma

pablobarquin
Explorer

@bowesmana, many thanks for your prompt reply 🙂

I have tried with your suggestion and it only works (it delivers the expected 11137 results) if I just leave the ip field in the stats command:

(index=tenable repository=DISCOVERY sourcetype="tenable:sc:vuln" pluginID=11936) OR (index=snow_ci sourcetype=cmdb_ci_server SYS_CLASS_NAME="Windows Server" OPERATIONAL_STATUS!=Retired NOT IP_ADDRESS IN ("0.0.0.0", "255.255.255.255", "127.0.0.1", "169.254.*") earliest=-24h) 
| rex "(?i)Remote operating system : (?P<os>[\D\d]+(?=Confidence level))" 
| rex "(?i)Confidence level : (?P<os_confidencial_level>[\d]+)" 
| makemv delim="\n" os 
| search (index=tenable AND os=*windows*server*) OR index=snow_ci 
| eval IP_ADDRESS=if(index="tenable", "__N/A__", IP_ADDRESS) 
| regex IP_ADDRESS="^(([0-9]|[1-9][0-9]|1[0-9]{2}|2[0-4][0-9]|25[0-5])\.){3}([0-9]|[1-9][0-9]|1[0-9]{2}|2[0-4][0-9]|25[0-5])$|__N/A__" 
| eval ip=if(index="tenable", ip, IP_ADDRESS) 
| stats dc(index) as indexCount values(index) as indexes by ip 
| where indexCount=1 and indexes="tenable"

In the moment I add the remaining fields (or even one more like the dnsName), it delivers many results (29498). Those fields exist in the tenable index but not in the snow_ci. 

Any idea why this is happening?

Thanks again for your help!

0 Karma

bowesmana
SplunkTrust
SplunkTrust

@pablobarquin 

Your original search was just producing a table, however, in mine I did 'stats' BY and there are two data types, so this is where you have to understand your data to see how to solve it.

You can just add values(X) as X into the stats command and you'll get a list of the seen values of whatever field X you specify rather than splitting by those fields.

 

0 Karma

pablobarquin
Explorer

@bowesmana 

Many thanks for your help here! Finally, I'm getting the expected results with the following query:

(index=tenable repository=DISCOVERY sourcetype="tenable:sc:vuln" pluginID=11936) OR (index=snow_ci sourcetype=cmdb_ci_server SYS_CLASS_NAME="Windows Server" OPERATIONAL_STATUS!=Retired NOT IP_ADDRESS IN ("0.0.0.0", "255.255.255.255", "127.0.0.1", "169.254.*") earliest=-24h) 
| rex "(?i)Remote operating system : (?P<os>[\D\d]+(?=Confidence level))" 
| rex "(?i)Confidence level : (?P<os_confidencial_level>[\d]+)" 
| makemv delim="\n" os 
| search (index=tenable AND os=*windows*server*) OR index=snow_ci 
| eval IP_ADDRESS=if(index="tenable", "__N/A__", IP_ADDRESS) 
| regex IP_ADDRESS="^(([0-9]|[1-9][0-9]|1[0-9]{2}|2[0-4][0-9]|25[0-5])\.){3}([0-9]|[1-9][0-9]|1[0-9]{2}|2[0-4][0-9]|25[0-5])$|__N/A__" 
| eval ip=if(index="tenable", ip, IP_ADDRESS) 
| stats dc(index) as indexCount values(index) as indexes values(dnsName) as dnsName values(os) as os values(os_confidencial_level) as os_confidencial_level by ip 
| where indexCount=1 and indexes="tenable" 
| fields - indexCount indexes 
| sort 0 - os_confidencial_level

I have accepted your first answer as the solution to the problem 🙂

bowesmana
SplunkTrust
SplunkTrust

@pablobarquin 

Often a way to manage this type of dual data source search is to search both data sets, perform some kind of test on the data to set an indicator and then aggregate the two data sets on the common data and then filter on the indicator, so this could be

(index=tenable repository=DISCOVERY sourcetype="tenable:sc:vuln" pluginID=11936) OR (index=snow_ci sourcetype=cmdb_ci_server SYS_CLASS_NAME="Windows Server" OPERATIONAL_STATUS!=Retired NOT IP_ADDRESS IN ("0.0.0.0", "255.255.255.255", "127.0.0.1", "169.254.*") earliest=-24h)
| rex "(?i)Remote operating system : (?P<os>[\D\d]+(?=Confidence level))" 
| rex "(?i)Confidence level : (?P<os_confidencial_level>[\d]+)" 
| makemv delim="\n" os 
| search (index=tenable AND os=*windows*server*) OR index=snow_ci
| eval IP_ADDRESS=if(index="tenable", "__N/A__", IP_ADDRESS)
| regex IP_ADDRESS="^(([0-9]|[1-9][0-9]|1[0-9]{2}|2[0-4][0-9]|25[0-5])\.){3}([0-9]|[1-9][0-9]|1[0-9]{2}|2[0-4][0-9]|25[0-5])$|__N/A__" 
| eval ip=if(index="tenable", ip, IP_ADDRESS)
| stats dc(index) as indexCount values(index) as indexes by ip dnsName os os_confidencial_level 
| where indexCount=1

This would handle the regex case on IP_ADDRESS by allowing tenable data through with the __N/A__ tag and then use stats for the dedup and aggregation. What you are looking for is the count of unique indexCount=1, i.e. it is ONLY in tenable OR only in snow_ci.

You have the indexes values which will tell you which one it is - you can filter as needed.

Hope this helps.

0 Karma
Get Updates on the Splunk Community!

.conf24 | Registration Open!

Hello, hello! I come bearing good news: Registration for .conf24 is now open!   conf is Splunk’s rad annual ...

ICYMI - Check out the latest releases of Splunk Edge Processor

Splunk is pleased to announce the latest enhancements to Splunk Edge Processor.  HEC Receiver authorization ...

Introducing the 2024 SplunkTrust!

Hello, Splunk Community! We are beyond thrilled to announce our newest group of SplunkTrust members!  The ...