I have the following query with multiple joins using max=0, and it is not giving me all the results. I think the result set becomes too large and the join is unable to take the load. How best can I optimize my query?
index=syslog (process=*epilog* "*slurm-epilog: START user*") OR (process=*prolog* "*slurm-prolog: END user*") host IN (preo*) timeformat="%Y-%m-%dT%H:%M:%S.%6N"
| rex field=_raw "(?<epilog_start>[^ ]+)\-\d{2}\:\d{2}.*slurm\-epilog\:\s*START\suser\=(?<user>[^\s]+)\sjob\=(?<job_id>[^ ]+)"
| rex field=_raw "(?<prolog_end>[^ ]+)\-\d{2}\:\d{2}.*slurm\-prolog\:\s*END\suser\=(?<user>[^\s]+)\sjob\=(?<job_id>[^ ]+)"
| stats values(epilog_start) as epilog_start values(prolog_end) as prolog_end first(_time) as _time by host user job_id
| search prolog_end!=""
| search user=*
| search job_id=*
| eval current_time = now()
| join host max=0 type=left
[
search index=syslog "Linux version" host IN (preos*)
| rex field=_raw "(?<reboot_time>[^ ]+)\-\d{2}\:\d{2}.*kernel:\s.*Linux\sversion"
| stats count by reboot_time host
]
| convert timeformat="%Y-%m-%dT%H:%M:%S.%6N" mktime(prolog_end) as job_start
| convert timeformat="%Y-%m-%dT%H:%M:%S.%6N" mktime(epilog_start) as epilog_start
| convert timeformat="%Y-%m-%dT%H:%M:%S.%6N" mktime(reboot_time) as reboot_time
| eval diff_reboot = reboot_time - job_start
| eval reboot_time = if(diff_reboot<0, "", reboot_time)
| eval diff_reboot = if(diff_reboot<0, "", diff_reboot)
| eval job_end = if(epilog_start!="",epilog_start,if(diff_reboot>0,reboot_time,current_time))
| eval diff_reboot = if(diff_reboot!="",diff_reboot,10000000000)
| sort host user job_id prolog_end diff_reboot
| dedup host user job_id prolog_end
| join host max=0 type=left
[
search index=syslog (("NVRM: Xid*") process=kernel) host IN (preos*)
| rex field=_raw "(?<kernel_xid>[^ ]+)\-\d{2}\:\d{2}.*NVRM\:\sXid\s*\(PCI\:(?<PCIe_Bus_Id>[^ ]+)\)\:\s*(?<Error_Code>[^ ]+)\,\spid\=(?<pid>[^ ]+)\,\s*name\=(?<name>[^ ]+)\,\s(?<Log_Message>.*)"
| stats count by host kernel_xid PCIe_Bus_Id pid name Error_Code Log_Message
| search Error_Code="***"
]
| search kernel_xid!=""
| convert timeformat="%Y-%m-%dT%H:%M:%S.%6N" mktime(kernel_xid) as xid_time
| eval diff = xid_time - job_start
| eval diff2 = job_end - xid_time
| search diff>0 AND diff2>0
| eval job_start = strftime(job_start,"%Y-%m-%d %H:%M:%S.%3N")
| eval job_end = if(job_end==current_time,"N/A",strftime(job_end,"%Y-%m-%d %H:%M:%S.%3N"))
| eval xid_time = strftime(xid_time,"%Y-%m-%d %H:%M:%S.%3N")
| eval current_time = strftime(current_time,"%Y-%m-%d %H:%M:%S.%3N")
| eval reboot_time = strftime(reboot_time,"%Y-%m-%d %H:%M:%S.%3N")
| join user job_id type=left
[ search index="slurm-jobs"
| stats count by job_id job_name user nodelist time_start time_end state submit_line
]
| eval _time=strptime(xid_time,"%Y-%m-%d %H:%M:%S.%3N")
| stats count by host user job_id job_start job_end xid_time Error_Code PCIe_Bus_Id pid name Log_Message job_name nodelist state submit_line
| dedup host
Below are the sample logs.
Events from the first search:
2022-11-08T14:59:53.134550-08:00 preos slurm-epilog: START user=abc job=62112
2022-11-08T14:58:25.101203-08:00 preos slurm-prolog: END user=abc job=62112
Events from the subsearch of the first join:
2022-11-09T12:49:51.395174-08:00 preos kernel: [ 0.000000] Linux version hjhc (buildd@lcy02-amd64-032) (gcc (Ubuntu 11.2.0-19ubuntu1) 11.2.0, GNU ld (GNU Binutils for Ubuntu) 2.38) #58-Ubuntu SMP Thu Oct 13 08:03:55 UTC 2022 (Ubuntu 5.15.0-52.58-generic 5.15.60)
Events from the second join:
2022-11-09T12:35:15.422001-08:00 preos kernel: [ 99.166912] NVRM: Xid (PCI:00:2:00): 95, pid='<unknown>', name=<unknown>, Uncontained: FBHUB. RST: Yes, D-RST: No
Events from last join:
2022-11-09 20:50:02.000, mod_time="1668027082", job_id="62112", job_name="rfm_nvcp_bek_cd_job", user="abc", account="admin", partition="vi2", qos="all", resv="YES", timelimit_minutes="10", work_dir="/sbatch/logs/2022-11-09T11-30-39/preos/viking-hbm2/builtin/nvcomp_benchmark_cascaded", submit_line="sbatch rfm_nvcomp_benchmark_cascaded_job.sh", time_submit="2022-11-09 12:11:13", time_eligible="2022-11-09 12:11:13", time_start="2022-11-09 12:50:02", time_end="2022-11-09 12:51:22", state="COMPLETED", exit_code="0", nodes_alloc="1", nodelist="preos0093", submit_to_start_time="00:38:49", eligible_to_start_time="00:38:49", start_to_end_time="00:01:20"
Thanks in Advance
The basic premise for optimising join-based searches is to combine all the subsearches into a single primary search, use eval statements to conditionally set fields depending on the event type in the stream, and finally use stats to aggregate as needed.
So, instead of one primary search and three subsearches (two of which also search syslog), you should start by doing
((SEARCH1: index=syslog...) OR
(SEARCH2: index=syslog...) OR
(SEARCH3: index=syslog...) OR
(SEARCH4: index=slurm-jobs...))
At that point you will have a single stream of events of four different types.
Then the task is to work out how to manipulate that data to make all the calculations needed. That is where you use eval and if conditions to manipulate the events.
If you take search 1 and search 4, you could do this
(index=syslog (process=*epilog* "*slurm-epilog: START user*") OR (process=*prolog* "*slurm-prolog: END user*") host IN (preo*)) OR
(index="slurm-jobs")
| rex field=_raw "(?<epilog_start>[^ ]+)\-\d{2}\:\d{2}.*slurm\-epilog\:\s*START\suser\=(?<user>[^\s]+)\sjob\=(?<job_id_1>[^ ]+)"
| rex field=_raw "(?<prolog_end>[^ ]+)\-\d{2}\:\d{2}.*slurm\-prolog\:\s*END\suser\=(?<user>[^\s]+)\sjob\=(?<job_id_1>[^ ]+)"
| eval job_id=coalesce(job_id, job_id_1)
| stats values(epilog_start) as epilog_start values(prolog_end) as prolog_end first(_time) as _time values(job_name) as job_name values(nodelist) as nodelist values(time_start) as time_start values(time_end) as time_end values(state) as state values(submit_line) as submit_line by host user job_id
i.e. this makes the rex statements extract job_id_1 from _raw, so they will work on the syslog events but not on the slurm events; you then coalesce job_id from either the extracted job_id_1 or the slurm job_id.
Adding values(X) as X means that the stats statement will carry forward the slurm job details for that job ID, which previously came from the third join.
The other thing to consider when using sort/dedup in the stream: why are you sorting before the end of the query? There is generally never a need to sort until you really have to, and note that you are not using sort 0, so you will only ever sort 10,000 events.
Using dedup removes events from the stream, so you may want a specific order for dedup to work with, but again, you can often dedup using stats, which is probably a better option.
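For example, the sort + dedup pair in the original search (which keeps the event with the smallest diff_reboot per group) could be replaced with an eventstats/where filter that avoids sorting the whole stream. This is only a sketch using the field names from the query above; min_diff is a helper field introduced here for illustration:

```spl
| eventstats min(diff_reboot) as min_diff by host user job_id prolog_end
| where diff_reboot == min_diff
| fields - min_diff
```

Note that, unlike dedup, ties on diff_reboot would keep multiple rows, so a final stats or dedup may still be needed to break ties.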
In the same way, you should look at how you can manipulate the search 2 and 3 events in the same stream and modify your stats to work as you need it.
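As a rough sketch of how the search 2 and search 3 events might be tagged once they are in the same stream (the rex patterns are copied from the original query; the event_type labels are illustrative assumptions, and bear in mind that the reboot/Xid events carry no user or job_id, so they will need a different stats grouping, e.g. by host only, before being matched back to jobs):

```spl
| rex field=_raw "(?<reboot_time>[^ ]+)\-\d{2}\:\d{2}.*kernel:\s.*Linux\sversion"
| rex field=_raw "(?<kernel_xid>[^ ]+)\-\d{2}\:\d{2}.*NVRM\:\sXid\s*\(PCI\:(?<PCIe_Bus_Id>[^ ]+)\)\:\s*(?<Error_Code>[^ ]+)\,\spid\=(?<pid>[^ ]+)\,\s*name\=(?<name>[^ ]+)\,\s(?<Log_Message>.*)"
| eval event_type=case(isnotnull(kernel_xid), "xid",
                       isnotnull(reboot_time), "reboot",
                       isnotnull(epilog_start) OR isnotnull(prolog_end), "slurm_syslog",
                       index=="slurm-jobs", "slurm_job")
```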
@bowesmana Thank you for your reply. I tried using multiple OR conditions and I was unable to get any results. Below is the search I used. I am kinda new to joins and complex commands. Can you please let me know how the search should look?
(index=syslog (process=*epilog* "*slurm-epilog: START user*") OR (process=*prolog* "*slurm-prolog: END user*") OR (index=syslog "Linux version") OR (index=syslog "NVRM: Xid*" process=kernel)
host IN (preos*)) OR
(index="slurm-jobs" nodelist=preos*)
| rex field=_raw "(?<epilog_start>[^ ]+)\-\d{2}\:\d{2}.*slurm\-epilog\:\s*START\suser\=(?<user>[^\s]+)\sjob\=(?<job_id>[^ ]+)"
| rex field=_raw "(?<prolog_end>[^ ]+)\-\d{2}\:\d{2}.*slurm\-prolog\:\s*END\suser\=(?<user>[^\s]+)\sjob\=(?<job_id_1>[^ ]+)"
| rex field=_raw "(?<reboot_time>[^ ]+)\-\d{2}\:\d{2}.*kernel:\s.*Linux\sversion"
| rex field=_raw "(?<kernel_xid>[^ ]+)\-\d{2}\:\d{2}.*NVRM\:\sXid\s*\(PCI\:(?<PCIe_Bus_Id>[^ ]+)\)\:\s*(?<Error_Code>[^ ]+)\,\spid\=(?<pid>[^ ]+)\,\s*name\=(?<name>[^ ]+)\,\s(?<Log_Message>.*)"
| eval job_id=coalesce(job_id, job_id_1)
| stats values(epilog_start) as epilog_start values(prolog_end) as prolog_end first(_time) as _time values(reboot_time) as reboot_time values(kernel_xid) as kernel_xid values(PCIe_Bus_Id) as PCIe_Bus_Id values(pid) as pid values(name) as name values(Error_Code) as Error_Code values(Log_Message) as Log_Message values(job_name) as job_name values(nodelist) as nodelist values(time_start) as time_start values(time_end) as time_end values(state) as state values(submit_line) as submit_line by host user job_id
| convert timeformat="%Y-%m-%dT%H:%M:%S.%6N" mktime(prolog_end) as job_start
| convert timeformat="%Y-%m-%dT%H:%M:%S.%6N" mktime(epilog_start) as epilog_start
| convert timeformat="%Y-%m-%dT%H:%M:%S.%6N" mktime(reboot_time) as reboot_time
| eval diff_reboot = reboot_time - job_start
| eval reboot_time = if(diff_reboot<0, "", reboot_time)
| eval diff_reboot = if(diff_reboot<0, "", diff_reboot)
| eval job_end = if(epilog_start!="",epilog_start,if(diff_reboot>0,reboot_time,current_time))
| eval diff_reboot = if(diff_reboot!="",diff_reboot,10000000000)
| convert timeformat="%Y-%m-%dT%H:%M:%S.%6N" mktime(kernel_xid) as xid_time
| eval diff = xid_time - job_start
| eval diff2 = job_end - xid_time
| search diff>0 AND diff2>0
| eval job_start = strftime(job_start,"%Y-%m-%d %H:%M:%S.%3N")
| eval job_end = if(job_end==current_time,"N/A",strftime(job_end,"%Y-%m-%d %H:%M:%S.%3N"))
| eval xid_time = strftime(xid_time,"%Y-%m-%d %H:%M:%S.%3N")
| eval current_time = strftime(current_time,"%Y-%m-%d %H:%M:%S.%3N")
| eval reboot_time = strftime(reboot_time,"%Y-%m-%d %H:%M:%S.%3N")
The best way to resolve this type of complex search is to work with a small data set, where you have an expected set of results.
Then build up the search piece by piece, looking at the results at each step to make sure you are getting what you expect.
It's impossible for me to diagnose the search without a good view of your data, so I would suggest working with a small, known data set and building the search up one pipe at a time, checking the results at each stage.
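For instance, a quick first check is to run just the combined base search and confirm that all four event types are actually coming back before adding any rex or stats stages. This sketch reuses the base search from the reply above:

```spl
(index=syslog ((process=*epilog* "*slurm-epilog: START user*") OR (process=*prolog* "*slurm-prolog: END user*") OR "Linux version" OR ("NVRM: Xid*" process=kernel)) host IN (preos*)) OR
(index="slurm-jobs" nodelist=preos*)
| stats count by index sourcetype
```

If one of the indexes or sourcetypes is missing from the output, the problem is in the base search itself rather than in the later eval/stats logic.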