Splunk Search

Limiting a query to 'known good' values plus 'others'

gavinsopra
Engager

I have a query similar to the following which we are using to capture information about email traffic between certain (internal) email domains

 

index="exchdomains"  | stats count by SenderDomain,recipientDomain 
| xyseries SenderDomain,recipientDomain,count

 

This builds a nice table, however:

Part 1
How do I limit the query to only certain values of 'SenderDomain' and 'recipientDomain' without having to type all the domains (there are about 8 ATM) into the query? Can I use a CSV lookup for this?

Part 2
Assuming part one is achievable how do I add a 'catchall' row where I count messages sent from domains NOT in my list of 'SenderDomain' values of interest? And similarly add a 'catchall' column for 'recipientDomain' not in a list of 'known good'.

 

Labels (3)
0 Karma
1 Solution

richgalloway
SplunkTrust
SplunkTrust

If your problem is resolved, then please click the "Accept as Solution" button to help future readers.

---
If this reply helps you, an upvote would be appreciated.

View solution in original post

0 Karma

richgalloway
SplunkTrust
SplunkTrust

If you run the subsearch by itself 

| inputlookup SenderRecipDomains.csv | return Domain

you should get a result like this

((Domain=dept1.org.com) OR (Domain=dept2.org.com) OR ...)

 So the final search becomes

index="exchdomains" ((Domain=dept1.org.com) OR (Domain=dept2.org.com) OR ...)

Which should work fine as long as your exchdomains index has a field called "Domain".  If there is no such field then the subsearch needs a rename command to make the field name match what's in the index.

---
If this reply helps you, an upvote would be appreciated.
0 Karma

gavinsopra
Engager

Curiouser and curiouser.

As stated before this query

| inputlookup SenderRecipDomains.csv

Returns this result

Domain DomainName
dept1.org.com Dept1
dept2.org.com Dept2
dept3.org.com Dept3
dept4.org.com Dept4
dept5.org.com Dept5
deptA.remote.com RemoteDeptA
deptB.remote.com RemoteDeptB
deptC.remote.com RemoteDeptC

So if I run this query

| inputlookup SenderRecipDomains.csv | return Domain

I get this

Domain="dept1.org.com"

However, it appears that 'return' requires a "count" which leads to this query and result

| inputlookup SenderRecipDomains.csv | return 2 Domain

(Domain="dept1.org.com") OR (Domain="dept2.org.com")

So I am running this to get all the data from my CSV

| inputlookup SenderRecipDomains.csv | return 9999 Domain

(Domain="dept1.org.com") OR (Domain="dept2.org.com") OR (Domain="dept3.org.com") OR (Domain="dept4.org.com") OR (Domain="dept5.org.com") OR (Domain="deptA.remote.com") OR (Domain="deptB.remote.com") OR (Domain="deptC.remote.com")

Which means that part 1 is now working - many thanks.

I have just started looking at part 2.

I have started with 

index="exchdomains"
| inputlookup SenderRecipDomains.csv Domain as SenderDomain OUTPUT Domain as foundSDomain

But I get the error "Error in 'inputlookup' command: Invalid argument: 'domain'"

0 Karma

gavinsopra
Engager

My bad.  I was using 'inputlookup' when I should have used 'lookup'

 

0 Karma

richgalloway
SplunkTrust
SplunkTrust

If your problem is resolved, then please click the "Accept as Solution" button to help future readers.

---
If this reply helps you, an upvote would be appreciated.

View solution in original post

0 Karma

richgalloway
SplunkTrust
SplunkTrust

To limit a query to values in a lookup file, read the lookup in a subsearch, like this:

index="exchdomains" [ | inputlookup mylookup.csv | return foo ]
| stats count by SenderDomain,recipientDomain 
| xyseries SenderDomain,recipientDomain,count

The above query will read only those events that have the appropriate field values in the CSV so there is no need for a catch-all.  An alternative query to create a catch-all might look like this:

index="exchdomains"  
| lookup mylookup.csv domain as SenderDomain OUTPUT domain as foundSDomain
| lookup mylookup.csv domain as recipientDomain OUTPUT domain as foundRDomain
| fillnull value="unknown" foundSDomain foundRDomain
| stats count by foundSDomain, foundRDomain 
| xyseries foundSDomain,foundRDomain,count

 

---
If this reply helps you, an upvote would be appreciated.
0 Karma

gavinsopra
Engager

Richgallowy,

Thanks for the response but I am having some difficulty getting it to work.

I have added a lookup table which returns the following

| inputlookup SenderRecipDomains.csv

Domain				DomainName
dept1.org.com		Dept1
dept2.org.com		Dept2
dept3.org.com		Dept3
dept4.org.com		Dept4
dept5.org.com		Dept5
deptA.remote.com	RemoteDeptA
deptB.remote.com	RemoteDeptB
deptC.remote.com	RemoteDeptC

If I run the following query with a very tight date/time range I get 47 events

index="exchdomains"

If I run this query

index="exchdomains" [ | inputlookup SenderRecipDomains.csv | return Domain ]

or this query

index="exchdomains" [ | inputlookup SenderRecipDomains.csv | fields Domain ]

against the same date/time range I get 0 events.

Have I configured the lookup incorrectly?

0 Karma
Register for .conf21 Now! Go Vegas or Go Virtual!

How will you .conf21? You decide! Go in-person in Las Vegas, 10/18-10/21, or go online with .conf21 Virtual, 10/19-10/20.