Splunk Search

How do you exclude and format unique specific fields from multivalued fields to be used in a subsearch?

nickcardenas
Path Finder

Hello all,

I'm having some trouble formatting and dealing with multivalued fields.

My use case is as follows:

  • I have sourcetype-A that returns known malicious indicators (through multi-valued fields)
  • I have sourcetype-B that has DNS query logs from hosts
  • I'd like to build a search that compiles a list of "known malicious" domains from sourcetype-A in a subsearch and, with sourcetype-B as the main base search (it lives in a different index from sourcetype-A), compares every DNS query against that list to see whether a host queried a "malicious domain"

A sample log for sourcetype A looks like this:

                     Field                    Values
Event 1              indicator                x.xxx.x.xx
                                              hash
                                              someDomain.com
                                              http://DomainA.com
                                              supermalicious.com

Event 2              indicator                someDomain.com
                                              www.domainA.com
                                              someEmailAddress@domain.com
                                              http://helpmepls.com

When I use | eval indicator=mvfilter(match(indicator, "\.")) followed by | stats values(indicator), I get close to the expected results (hashes are gone and values are deduped across all events), but I still need to exclude everything else that is not a domain or a URL.
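
To make the gap concrete, here is a minimal sketch of that filter run against a few of the sample values. The value 10.123.4.56 is an assumed stand-in for the masked IP address, and the sample list is built with makeresults purely for illustration:

```spl
| makeresults
| rename COMMENT AS "Sample values are illustrative; 10.123.4.56 stands in for the masked IP"
| eval indicator=split("10.123.4.56,hash,someDomain.com,http://DomainA.com,supermalicious.com,someEmailAddress@domain.com", ",")
| eval indicator=mvfilter(match(indicator, "\."))
| stats values(indicator)
```

Only the value without a dot (hash) is dropped; the IP address, the email address, and the URL with its scheme all survive the filter.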

I was thinking of using something like a URL parser app for Splunk to help with the formatting issues, but I don't think I could combine that with | stats values(indicator).

Expected results:

someDomain.com
domainA.com
supermalicious.com
helpmepls.com

I'd appreciate if someone could point me in the correct direction or tell me if this is even possible through Splunk.

Thanks!

woodcock
Esteemed Legend

Like this:

| makeresults 
| eval raw="10.123.4.56,hash,someDomain.com,http://DomainA.com,supermalicious.com someDomain.com,www.domainA.com,someEmailAddress@domain.com,http://helpmepls.com" 
| makemv raw 
| mvexpand raw 
| rename raw AS _raw 
| rex max_match=0 "(?<indicator>[^,]+)" 

| rename COMMENT AS "Everything above generates sample event data; everything below is your solution"

| rex field=indicator mode=sed "s%^[^:/]+://%% s/^www\.//"
| eval indicator=mvfilter(match(indicator, "\.") AND NOT match(indicator, "(^\d+\.\d+\.\d+\.\d+$)|@"))
| eval indicator=lower(indicator)
| stats values(indicator)
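
To tie this back to the original use case, the cleanup can run inside a subsearch that feeds the DNS search. This is only a sketch under assumptions: the index names dns_logs and threat_intel and the DNS field name query are hypothetical placeholders that do not come from the question, so substitute your own.

```spl
index=dns_logs sourcetype="sourcetype-B"
    [ search index=threat_intel sourcetype="sourcetype-A"
      | rex field=indicator mode=sed "s%^[^:/]+://%% s/^www\.//"
      | eval indicator=mvfilter(match(indicator, "\.") AND NOT match(indicator, "(^\d+\.\d+\.\d+\.\d+$)|@"))
      | eval indicator=lower(indicator)
      | stats values(indicator) AS query
      | mvexpand query
      | fields query ]
| rename COMMENT AS "dns_logs, threat_intel and the query field are assumed names"
| stats count BY host, query
```

Renaming the result to query makes the subsearch return query="..." terms joined with OR, so the outer search keeps only DNS events whose query exactly equals a listed domain. Subdomains such as www.somedomain.com would not match without further normalization on the sourcetype-B side, and subsearches are subject to result-count and runtime limits, so a very large indicator list may call for a lookup instead.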

nickcardenas
Path Finder

Brilliant! This works as expected! I'll need to tinker with the regex to also omit IP addresses with specified ports, such as 123.123.123.2:8080, but once I add that, the provided answer will do exactly what I'm looking for.

Thank you so much!
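
For the port case mentioned above, one possible tweak is to allow an optional :port suffix in the IP pattern. This is a minimal sketch, assuming ports are always numeric and appended with a colon; the sample values are made up for illustration:

```spl
| makeresults
| rename COMMENT AS "Sample values are invented; only the (:\d+)? part differs from the accepted answer"
| eval indicator=split("123.123.123.2:8080,10.123.4.56,somedomain.com,user@domain.com", ",")
| eval indicator=mvfilter(match(indicator, "\.") AND NOT match(indicator, "(^\d+\.\d+\.\d+\.\d+(:\d+)?$)|@"))
| stats values(indicator)
```

Here only somedomain.com should remain: both IP forms match the anchored pattern, and the email address is excluded by the @ alternative.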
