Hi How can I tune this spl command?
this spl execute daily, and return something like this:
servername send receive customer ID status
Customer4 2021-21-11 12:12:39 2021-21-11 12:15:03 CUS.AaBB-APP1-12345_CUS 10 144.772000
Customer3 2021-21-11 12:09:58 2021-21-11 12:12:03 CUS.AaBB-APP1-98765_CUS 20 125.616000
here is statics belong this query:
events 72,070,802 (11/21/21 12:00:00.000 AM to 11/22/21 12:00:00.000 AM)
Size 2.09 GB
Statistics (248,138)
it take huge time to return result is there any way to tune query or any trick that return this result faster?
FYI: I try to use summer index but still take long time to return result.
Here is my query:
index="myindex" source="/data/product/*/customer*" (date_hour>=1 AND (date_hour<23 OR (date_hour=23 date_minute<30))) "Packet Processed" OR "Normal Packet Received"
| rex field=source "\/data\/(?<product>\w+)\/(?<date>\d+)\/(?<servername>\w+)"
| rex ID\[(?<ID>\d+)
| rex "^(?<timestamp>.{23}) INFO (?<customer>.*) \[AppServiceName\] (?<status>.*): M\[(?<Acode>.*)\] T\[(?<Bcode>\d+)\]"
| rex field=customer "_(?<customer2>.*)"
| eval customer2=coalesce(customer2,customer), customer=if(customer=customer2,null(),customer)
| eval sendTime=if(status="Packet Processed",strptime(timestamp,"%Y-%m-%d %H:%M:%S,%3Q"),null()), receiveTime=if(status="Normal Packet Received",strptime(timestamp,"%Y-%m-%d %H:%M:%S,%3Q"),null())
| eval AcodeSend=if(status="Packet Processed",Acode,null()),BcodeSend=if(status="Packet Processed",Bcode,null()),AcodeReceive=if(status="Normal Packet Received",Acode,null()),BcodeReceive=if(status="Normal Packet Received",Bcode,null())
| eval AcodeReceiveLookFor=AcodeSend+10,acr=coalesce(AcodeReceive,AcodeReceiveLookFor)
| fields - Acode _time timestamp status AcodeReceiveLookFor
| stats values(*) as *,count by customer2,acr,Bcode
| eval duration=receiveTime-sendTime , customer=coalesce(customer,customer2)
| eval status=case(isnull(AcodeSend),"No Send",isnull(AcodeReceive),"No receive")
| eventstats max(duration) as duration by customer2
| where count=2 OR (status="No receive" AND isnull(duration))
| eval status=coalesce(status,duration)
| search NOT status="No receive"
| search NOT status="No Send"
| search status>2
| eval send=strftime(sendTime, "%Y-%d-%m %H:%M:%S") | eval receive=strftime(receiveTime, "%Y-%d-%m %H:%M:%S")
| table servername send receive customer ID status
Any idea?
Thanks
The first rex statement is redundant - you don't use the fields.
You are setting status to "No receive" or "No Send". Then you are testing those string values later on and then you exclude those from results with the search filter. Why not set these to some negative numeric - I assume you will not get negative durations, so then
| eval status=case(isnull(AcodeSend),-999,isnull(AcodeReceive),-998)
| eventstats max(duration) as duration by customer2
| where count=2 OR (status=-998 AND isnull(duration))
| eval status=coalesce(status,duration)
| search status>2
However, I'm a little unsure of your where statement, which allows "No receive" case to pass through, then you search for NOT status="No receive" to remove it. Am I missing something?
It's worth looking at the job inspector to help understand where the time is spent. I suspect it might be useful to convert the initial status to a type, so that you do not have to do those 'if (status="blablabla") comparisons 6 times for every event. 3 will match quickly, but 3 will not, so instead do
| eval type=case(status="Packet Processed", 1,
status="Normal Packet Received", 2)
and then use type instead of status for the other tests. The string length for received events is 22 characters, * 3 * 72,000,000 events, so it will only satisfy the match when fully matched.
AcodeReceiveLookFor variable is not necessary - just use the AcodeSend+10 value in the coalesce directly
It looks like you are effectively running a report on the previous day, so schedule a search sometime between 23:30 and 01:00 the following day and save the results as a saved search, then you just need to ref(erence) the saved report.
Other things you might try are, having evaluated status and set sendTime and receiveTime appropriately, it might be slightly quicker to evaluate whether these are null or not instead of doing the same string compare again.
Also, has _time already been set correctly when the event was indexed? If so, you might not need to extract timestamp and parse it again.
Why extract product date and servername if you are not using them?