Splunk Search

Limiting a query to 'known good' values plus 'others'

gavinsopra
Engager

I have a query similar to the following which we are using to capture information about email traffic between certain (internal) email domains

 

index="exchdomains"  | stats count by SenderDomain,recipientDomain 
| xyseries SenderDomain,recipientDomain,count

 

This builds a nice table, however:

Part 1
How do I limit the query to only certain values of 'SenderDomain' and 'recipientDomain' without having to type all the domains (there are about 8 ATM) into the query? Can I use a CSV lookup for this?

Part 2
Assuming part one is achievable how do I add a 'catchall' row where I count messages sent from domains NOT in my list of 'SenderDomain' values of interest? And similarly add a 'catchall' column for 'recipientDomain' not in a list of 'known good'.

 

Labels (3)
0 Karma
1 Solution

richgalloway
SplunkTrust
SplunkTrust

If your problem is resolved, then please click the "Accept as Solution" button to help future readers.

---
If this reply helps you, Karma would be appreciated.

View solution in original post

0 Karma

richgalloway
SplunkTrust
SplunkTrust

If you run the subsearch by itself 

| inputlookup SenderRecipDomains.csv | return Domain

you should get a result like this

((Domain=dept1.org.com) OR (Domain=dept2.org.com) OR ...)

 So the final search becomes

index="exchdomains" ((Domain=dept1.org.com) OR (Domain=dept2.org.com) OR ...)

Which should work fine as long as your exchdomains index has a field called "Domain".  If there is no such field then the subsearch needs a rename command to make the field name match what's in the index.

---
If this reply helps you, Karma would be appreciated.
0 Karma

gavinsopra
Engager

Curiouser and curiouser.

As stated before this query

| inputlookup SenderRecipDomains.csv

Returns this result

Domain DomainName
dept1.org.com Dept1
dept2.org.com Dept2
dept3.org.com Dept3
dept4.org.com Dept4
dept5.org.com Dept5
deptA.remote.com RemoteDeptA
deptB.remote.com RemoteDeptB
deptC.remote.com RemoteDeptC

So if I run this query

| inputlookup SenderRecipDomains.csv | return Domain

I get this

Domain="dept1.org.com"

However, it appears that 'return' requires a "count" which leads to this query and result

| inputlookup SenderRecipDomains.csv | return 2 Domain

(Domain="dept1.org.com") OR (Domain="dept2.org.com")

So I am running this to get all the data from my CSV

| inputlookup SenderRecipDomains.csv | return 9999 Domain

(Domain="dept1.org.com") OR (Domain="dept2.org.com") OR (Domain="dept3.org.com") OR (Domain="dept4.org.com") OR (Domain="dept5.org.com") OR (Domain="deptA.remote.com") OR (Domain="deptB.remote.com") OR (Domain="deptC.remote.com")

Which means that part 1 is now working - many thanks.

I have just started looking at part 2.

I have started with 

index="exchdomains"
| inputlookup SenderRecipDomains.csv Domain as SenderDomain OUTPUT Domain as foundSDomain

But I get the error "Error in 'inputlookup' command: Invalid argument: 'domain'"

0 Karma

gavinsopra
Engager

My bad.  I was using 'inputlookup' when I should have used 'lookup'

 

0 Karma

richgalloway
SplunkTrust
SplunkTrust

If your problem is resolved, then please click the "Accept as Solution" button to help future readers.

---
If this reply helps you, Karma would be appreciated.
0 Karma

richgalloway
SplunkTrust
SplunkTrust

To limit a query to values in a lookup file, read the lookup in a subsearch, like this:

index="exchdomains" [ | inputlookup mylookup.csv | return foo ]
| stats count by SenderDomain,recipientDomain 
| xyseries SenderDomain,recipientDomain,count

The above query will read only those events that have the appropriate field values in the CSV so there is no need for a catch-all.  An alternative query to create a catch-all might look like this:

index="exchdomains"  
| lookup mylookup.csv domain as SenderDomain OUTPUT domain as foundSDomain
| lookup mylookup.csv domain as recipientDomain OUTPUT domain as foundRDomain
| fillnull value="unknown" foundSDomain foundRDomain
| stats count by foundSDomain, foundRDomain 
| xyseries foundSDomain,foundRDomain,count

 

---
If this reply helps you, Karma would be appreciated.
0 Karma

gavinsopra
Engager

Richgallowy,

Thanks for the response but I am having some difficulty getting it to work.

I have added a lookup table which returns the following

| inputlookup SenderRecipDomains.csv

Domain				DomainName
dept1.org.com		Dept1
dept2.org.com		Dept2
dept3.org.com		Dept3
dept4.org.com		Dept4
dept5.org.com		Dept5
deptA.remote.com	RemoteDeptA
deptB.remote.com	RemoteDeptB
deptC.remote.com	RemoteDeptC

If I run the following query with a very tight date/time range I get 47 events

index="exchdomains"

If I run this query

index="exchdomains" [ | inputlookup SenderRecipDomains.csv | return Domain ]

or this query

index="exchdomains" [ | inputlookup SenderRecipDomains.csv | fields Domain ]

against the same date/time range I get 0 events.

Have I configured the lookup incorrectly?

0 Karma
Get Updates on the Splunk Community!

Introducing the 2024 Splunk MVPs!

We are excited to announce the 2024 cohort of the Splunk MVP program. Splunk MVPs are passionate members of ...

Splunk Custom Visualizations App End of Life

The Splunk Custom Visualizations apps End of Life for SimpleXML will reach end of support on Dec 21, 2024, ...

Introducing Splunk Enterprise 9.2

WATCH HERE! Watch this Tech Talk to learn about the latest features and enhancements shipped in the new Splunk ...