Splunk Search

How to change basic search to tstats with lookups and searchtime fields?

avoelk
Communicator

Hello !

Currently I'm trying to optimize splunk searches left by another colleague which are usually slow or very big. My first thought was to change the "basic searches" (searches that don't use tstats) to searches with tstats to see the most notable accelaration. The needed datamodels are already accelerated and the fields are normalized. bellow is one of those searches I would like to change into tstats.

 

index=* message_type=query NOT dns_request_queried_domain IN (>different_domainnames>)

| lookup1 ip_address as dns_request_client_ip output ip_address as dns_server_ip
| search dns_server_ip=no_matches
| lookup2 domain as dns_request_queried_domain output domain as cmdb_legit_domain
| search cmdb_legit_domain=no_matches
| lookup3 domain as dns_request_queried_domain output domain as wl_domain
| search wl_domain=no_matches 

| eval list="custom"
| `ut_parse_extended(dns_request_queried_domain,list)`
| search NOT ut_domain="None"

| lookup4 domain as ut_domain output domain as umbrella_domain
| lookup5 domain as ut_domain output domain as majestic_domain
| search umbrella_domain=no_matches AND majestic_domain=no_matches

| bucket _time span=5s
| stats count by _time, ut_domain, dns_request_client_ip
| search count>100
| sort -count

 

now, I struggle to "get" how to connect the way tstats works with the way the basic search works. as far as I've read and seen tstats only works with indexed fields but not fields that are being extracted at search time? so I guess my question is how could I use tstats and still incorporate the above fields and lookups into an optimized search ? I really struggle to understand how to really incorporate tstats in that case.

thanks so much for every hint or help 🙂

André

Labels (4)
Tags (3)
0 Karma
1 Solution

avoelk
Communicator

so , it seems this one works now. I've included the lookup to look for matches of ips: 

 

 

|tstats count summariesonly=t from datamodel=Network_Resolution.DNS 
where 
dns.message_type=query  
AND 
(NOT [inputlookup lu_cmdb_dns_servers |table ip_address |rename ip_address as "dns.dns_request_client_ip"]) 
by  dns.message_type, dns.query

 

 

the reason the one before didn't work seems to come down to the |fields i_address part. I didn't know that I have to generate a table in order for it to work. also, I can't sort with that field, so I can't use it after "by" since the match is  kind of a blacklist argument and there wouldn't be any value in the field. so when I don't address it, it shows me indeed less events than without this match blacklist. therefore it seems to work. 

I'll fiddle around with the rest but I guess that really was the big problem I had. if it seems I get lost again, I'll update this topic. 

View solution in original post

0 Karma

yuanliu
SplunkTrust
SplunkTrust

It is unreasonable to expect volunteers to read through all that pseudo code and turn it into one based on tstats. You want to describe your main objective (use case), and illustrate what you have tried, results, etc. I just want to point out that the restriction about indexed fields is related to the where clause in tstats, not every other component. Even that restriction has some potential workarounds if your data and search terms have certain characteristics. (I also strongly suggest that you enable auto formating in search window to make search commands more readable.)

Here are some steps you can try to gain better understanding of how you can potentially use tstats to improve search speed. (As always, tstats is not a substitution for every search.) Because you didn't format your sample search, I will use the word "command" to describe each segment separated by pipelines ("|"). I also assume that none of your

Combine the first command ("search" implied), and the first stats:

index=* message_type=query NOT dns_request_queried_domain IN (>different_domainnames>)
| bucket _time span=5s
| stats count by _time, ut_domain, dns_request_client_ip

(I took liberty to change index= to index=*) I understand that there are lots of transformations to get ut_domain and dns_request_client_ip, but bear with me, and pretend they are in your original data at search time. The above search MAY be substituted by

| tstats count where index=* by message_type dns_request_queried_domain ut_domain dns_request_client_ip _time span=5s
| where message_type=query AND NOT dns_request_queried_domain IN (>different_domainnames>)

One critical condition for the above to work is that every groupby field must have value in some events; just like in normal stats, you end up counting only those events that contain all those fields.

If, instead, some fields do not exist in some events but you want to count them, anyway (as you often do in normal stats), you can do this instead:

| tstats count latest(message_type) as message_type latest(dns_request_queried_domain) as dns_request_queried_domain latest(ut_domain) as ut_domain latest(dns_request_client_ip) as dns_request_client_ip where index=* _time span=5s
| where message_type=query AND NOT dns_request_queried_domain IN (>different_domainnames>)​

Note: latest is just one possible way to combine data that may or may not exist in a given event. Which stats function to use depends on your intention.

Because dns_request_client_ip is present after the above tstats, the first very lookup, lookup1 ip_address as dns_request_client_ip output ip_address as dns_server_ip, can be added back unchanged. You can go on to analyze all subsequent lookups and filters. If they require any field that is not returned in tstats, try to retrieve it using one of the above strategies.

Hope this helps.

avoelk
Communicator

Hello again, 

so it seems it doesn't work. first of all I've tried the following: 

|tstats count summariesonly=t from datamodel=Network_Resolution by _time

 

here it shows me a list of counts by _time as expected since _time is an indexed metadata field. BUT the moment I'm trying to do anything else besides using metadata fields, it doesn't show me anything.

examples that don't show anything: 

|tstats count summariesonly=t from datamodel=Network_Resolution by message_type
|tstats count from datamodel=Network_Resolution by message_type
|tstats count from datamodel=Network_Resolution by message_type | where message_type=query

 

So I kind of understand, why those queries don't show anything since tstats only uses indexed fields right? but what I can't wrap my head around is that the datamodel Network_Resolution is filled with a lot of fields for an acceleration timeframe of 2 days roughly. and when I use the query which is used to fill up the datamodel, I get all the fields necessary for my search to begin with. 

klököjlk.PNG

 and here the fields of the datamodel itself 

avoelk_0-1671620799517.png

 

but I need to use tstats and still be able to reference fields like message_type, query_type, src, dest etc. how can I do it? 


0 Karma

avoelk
Communicator

Ok, I've at least figured out the first problem. I can reference message_type with addressing the dataset, like here: 

 

|tstats count summariesonly=t from datamodel=Network_Resolution.DNS by _time, dns.message_type

Now I still don't know how to for example use a where to filter, for example like here (which doesn't give me any results): 

|tstats count summariesonly=t from datamodel=Network_Resolution.DNS by _time, dns.message_type |where dns.message_type=query 

I also tried to use variations again: 

|tstats count summariesonly=t where dns.message_type=query from datamodel=Network_Resolution.DNS by _time, dns.message_type 

 

0 Karma

avoelk
Communicator

Hello again,

I figured that out too. when I use the following, it works: 

|tstats count summariesonly=t from datamodel=Network_Resolution.DNS where dns.message_type=query by _time, dns.message_type 

the next problem I have is that I want to check in several lookups for matches of ips.  the original query did it like this : 

kljlkjlkjö.PNG

I tried to replicated this, at least with the first lookup like this: 

|tstats count summariesonly=t from datamodel=Network_Resolution.DNS where dns.message_type=query  AND (NOT [inputlookup lu_cmdb_dns_servers |fields ip_address |rename ip_address as dns.ip_address]) by _time, dns.message_type, dns.ip_address

but again, it doesn't show any results. I bet there is something wrong with my approach of using a lookup to match results? 

0 Karma

avoelk
Communicator

so , it seems this one works now. I've included the lookup to look for matches of ips: 

 

 

|tstats count summariesonly=t from datamodel=Network_Resolution.DNS 
where 
dns.message_type=query  
AND 
(NOT [inputlookup lu_cmdb_dns_servers |table ip_address |rename ip_address as "dns.dns_request_client_ip"]) 
by  dns.message_type, dns.query

 

 

the reason the one before didn't work seems to come down to the |fields i_address part. I didn't know that I have to generate a table in order for it to work. also, I can't sort with that field, so I can't use it after "by" since the match is  kind of a blacklist argument and there wouldn't be any value in the field. so when I don't address it, it shows me indeed less events than without this match blacklist. therefore it seems to work. 

I'll fiddle around with the rest but I guess that really was the big problem I had. if it seems I get lost again, I'll update this topic. 

0 Karma

avoelk
Communicator

Hello and thanks a lot for your reply.

Sorry for the bad format. I used the "html" option to edit my entry but it seems to not work. I obviously did something wrong there. also, I of course don't expect anything rather than tipps so thanks for your detailed explanation. I just added all that "pseudo code" cause the whole search really looks like that and I wanted to show that a lot of things need to be considered when thinking about switching to tstats. i.e. lookups, subsearches , evals etc. 

my intention was basically to make the first few lines of the search work (since this is usually the base search before the first pipe and accumulates the biggest chunk of events before stats and filtering happens), but I was unsure on how to "translate" it basically into the tstats syntax. 

your explanation helps a lot so thanks again and sry for giving you a hard time with all that code and bad format. I'll try it out asap.

cheers,

André 

0 Karma
Get Updates on the Splunk Community!

Video | Welcome Back to Smartness, Pedro

Remember Splunk Community member, Pedro Borges? If you tuned into Episode 2 of our Smartness interview series, ...

Detector Best Practices: Static Thresholds

Introduction In observability monitoring, static thresholds are used to monitor fixed, known values within ...

Expert Tips from Splunk Education, Observability in Action, Plus More New Articles on ...

Splunk Lantern is a Splunk customer success center that provides advice from Splunk experts on valuable data ...