Splunk Search

Compare words in a string against all other words in another string

Logginz
New Member

Hi there, 

I'm new to Splunk, but I've been making some progress.

I'm trying to compare traffic going from one zone to another zone, and to filter out expected traffic.

For example, I have the hosts Dr Pepper, Pepsi, Coke, and Sprite, and I need to see if they're talking to each other when they shouldn't be.

These all have various hosts that could be named something like "pepsi-public-dmz" or "test_drpepper_internet_transit".

However, a few of them could also contain two of the names, such as "coke-combo-pepsi".

I need a way to search each string for every known name, and to compare the names found against any names found in the other string.

 

I've managed to do this by using case and like to determine whether a string contains the word, then comparing the result to the src_zone using the code below:

| eval dest_test=case(like(dest_zone ,"%coke%") ,"coke", like(dest_zone ,"%pepsi%") ,"pepsi", like(dest_zone ,"%pepper%") ,"pepper", like(dest_zone ,"%sprite%") ,"sprite", 1<2, "Not found1")

| eval src_test=case(like(src_zone ,"%coke%") ,"coke", like(src_zone ,"%pepsi%") ,"pepsi", like(src_zone ,"%pepper%") ,"pepper", like(src_zone ,"%sprite%") ,"sprite", 1<2, "Not found2")

| eval outcome = if(src_test == dest_test, "match", "no match")

| eval concat_z2z = if(outcome == "no match" , (dest_zone . " : " . src_zone), "Expected")

| where concat_z2z != "Expected"

| table concat_z2z

This provides a list of all traffic that is not going from something marked "pepsi" to another host marked "pepsi". However, because case checks its conditions in order and returns the first match, it misses zones that have two of the keywords in them.

 

If I were using Python I'd do a for/while loop over all the keywords, but I cannot figure out how to do that here for the life of me.
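For comparison, the loop described above could be sketched in Python roughly as follows. The keyword list is assumed from the case() expression earlier, and find_keywords and is_expected are hypothetical helper names used only for illustration.

```python
# Assumed keyword list, taken from the case() expression in the search.
KEYWORDS = ("coke", "pepsi", "pepper", "sprite")


def find_keywords(zone, keywords=KEYWORDS):
    """Return every keyword that occurs anywhere in the zone name."""
    zone = zone.lower()
    return {kw for kw in keywords if kw in zone}


def is_expected(src_zone, dest_zone, keywords=KEYWORDS):
    """Treat traffic as expected when both zones share at least one keyword."""
    shared = find_keywords(src_zone, keywords) & find_keywords(dest_zone, keywords)
    return bool(shared)


# Zone names taken from the examples in this post.
pairs = [
    ("coke-combo-pepsi", "pepsi-public-dmz"),
    ("pepsi-public-dmz", "test_drpepper_internet_transit"),
]
for src, dest in pairs:
    if not is_expected(src, dest):
        print(f"{dest} : {src}")
```

Unlike case, the set comprehension keeps every keyword it finds, so a zone such as "coke-combo-pepsi" is tagged with both coke and pepsi.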

Finally, this isn't exactly scalable either, as you'd have to edit the list each time a new host provider is added.

Help?

PS. I realize the code is messy, as I said, I'm still new to this.


to4kawa
Ultra Champion
| makeresults
| eval _raw="id1:coke-combo-pepsi:pepsi-public-dmz,id2:sprite_internet_transit:test_drpepper_internet_transit,id3:sprite_drpepper_internet_transit:coke-combo-pepsi,id4:coke-combo-pepsi:sprite_drpepper_internet_transit,id5:pepsi-public-dmz:pepsi_internet_transit,id4:coke-combo-pepsi:sprite_coke_internet_transit"
| makemv delim="," _raw
| stats count by _raw
| rex "(?<id>.*?):(?<src>.*?):(?<dest>.*)"
| table id src dest
| rename COMMENT as "this is sample"
| rename COMMENT as "this is matching values"
| appendpipe [|eval val=split("pepsi,coke,drpepper,sprite",",") ]
| rename COMMENT as "from here, the logic"
| where isnotnull(val)
| mvexpand val
| eval matched=if(match(src,val) OR match(dest,val) , 1  , 0)
| where matched=1
| stats values(val) as matched_val by id src dest

How about this?
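The core of the search is mvexpand, then match, then stats values: each row is paired with every keyword, pairs that do not match are dropped, and the surviving keywords are collected per row. A rough Python equivalent is sketched below, assuming two of the sample rows; the keyword is written as drpepper, without a separator, so that it matches the sample zone names.

```python
import re
from collections import defaultdict

# Two of the sample rows: (id, src, dest).
rows = [
    ("id1", "coke-combo-pepsi", "pepsi-public-dmz"),
    ("id2", "sprite_internet_transit", "test_drpepper_internet_transit"),
]
# Assumed keyword list; "drpepper" is spelled to match the sample data.
keywords = "pepsi,coke,drpepper,sprite".split(",")

matched_val = defaultdict(set)
for row in rows:
    _id, src, dest = row
    for val in keywords:  # mvexpand val: one pass per keyword
        # match(src,val) OR match(dest,val)
        if re.search(val, src) or re.search(val, dest):
            # stats values(val) as matched_val by id src dest
            matched_val[row].add(val)

for (row_id, src, dest), vals in matched_val.items():
    print(row_id, src, dest, sorted(vals))
```

Note that this collects the keywords seen on either side of a row; it does not yet compare the src keywords against the dest keywords.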


bowesmana
SplunkTrust

One possible avenue you could explore is the following:

1. Have a lookup file containing the host words, e.g.

name,val
pepsi,1
coke,2
drpepper,3
sprite,4

then do something along the lines of

| makeresults
| eval t="id1:coke-combo-pepsi:pepsi-public-dmz,id2:sprite_internet_transit:test_drpepper_internet_transit,id3:sprite_drpepper_internet_transit:coke-combo-pepsi,id4:coke-combo-pepsi:sprite_drpepper_internet_transit,id5:pepsi-public-dmz:pepsi_internet_transit,id4:coke-combo-pepsi:sprite_coke_internet_transit"
| makemv delim="," t
| mvexpand t
| rex field=t "(?<id>[^:]*):(?<src>[^:]*):(?<dest>.*)"
| fields - t
| eval n_src=src, n_dest=dest
| makemv tokenizer="([^-_]+)[-_]?" n_src
| makemv tokenizer="([^-_]+)[-_]?" n_dest
| mvexpand n_src
| lookup hosts.csv name as n_src OUTPUT val as src_val
| where !isnull(src_val)
| mvexpand n_dest
| lookup hosts.csv name as n_dest OUTPUT val as dest_val
| where !isnull(dest_val)
| table _time, id, src, dest, n_src, n_dest, src_val, dest_val
| stats values(n_src) as n_src values(n_dest) as n_dest values(*_val) as *_val by id, src, dest
| where isnull(mvfind(n_src, n_dest))

That will give you a list of entities that do not talk to the correct domain. I am not sure, though, what you want to happen when a dual-token zone (pepsi-combo-coke) talks to pepsi or coke: are both allowed, or only one?
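To make the two possible answers concrete, here is a small Python sketch of the same tokenise-and-look-up idea with both policies side by side. The HOSTS dict is a stand-in for hosts.csv, and allowed_any and allowed_all are hypothetical names. drpepper is spelled without an underscore here, because splitting on "_" would otherwise break the name into two tokens.

```python
import re

# Stand-in for the hosts.csv lookup (assumed contents).
HOSTS = {"pepsi": 1, "coke": 2, "drpepper": 3, "sprite": 4}


def known_tokens(zone, hosts=HOSTS):
    """Split a zone name on '-' and '_' and keep tokens found in the lookup."""
    return {tok for tok in re.split(r"[-_]", zone) if tok in hosts}


def allowed_any(src, dest):
    """Lenient policy: a single shared provider is enough."""
    return bool(known_tokens(src) & known_tokens(dest))


def allowed_all(src, dest):
    """Strict policy: both zones must name exactly the same providers."""
    s, d = known_tokens(src), known_tokens(dest)
    return bool(s) and s == d


src, dest = "coke-combo-pepsi", "pepsi-public-dmz"
print(allowed_any(src, dest))
print(allowed_all(src, dest))
```

Under the lenient policy the combo zone may talk to pepsi; under the strict one it may not, since the destination does not also name coke.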

Hopefully this gives you something to get started with.
