Splunk Search

need help to optimize splunk query from database feed

anantdeshpande
Path Finder

index=idx_cibca__prod:- Has data from database having all fields but not CUST_NAME ie why we used join
idx_cibca_lookups_prod: Has data from csv file for TOP 25 clients (Fields: COUNTRY_CODE,CUST_NAME,CUST_ID) and maximum 500 records.
We are using *eventstats max(_time) |dedup TRANSACTION_ID and where VALUE_DT=VAR_COMPARE_DT
because TRANSACTION_DT may get updated to any future date. So database will have only one record but Splunk will have two records (one with original date and one with new date). We want to skip those records whose date is updated to future date when searched by original date which is passed as token from dropdown.

Objective: Want count of transactions on the basis of individual status for only TOP 25 clients from csv file

Issue: Performance of the query is very slow and taking 4 min as there are 50 lakhs (5 millions) records in database for 10 days data.
We want to reduce data drastically before join.

index=idx_cibca_*_prod sourcetype=LVBT_*_DB_MASTER PAUYROLL_IND="Y" earliest=-10d@d
|join CUST_ID [search index=idx_cibca_lookups_prod source="*TOP_25_CLIENT_V*.CSV" CUST_NAME="*" |fields CUST_ID CUST_NAME] 
|eventstats max(_time) |dedup TRANSACTION_ID
|eval Current_Date=upper(strftime(now(),"%d-%b-%Y")) 
|eval VAR_DROPDOWN_DT="$Tran_Dt$" 
|eval VAR_COMPARE_DT=if(VAR_DROPDOWN_DT="now",Current_Date,VAR_DROPDOWN_DT) 
|where VALUE_DT=VAR_COMPARE_DT 
|table COUNTRY_CODE CUST_NAME CUST_ID PAYROLL_IND TRANSACTION_ID EXTRACT_STATUS POST_STATUS STATUS VAR_COMPARE_DT 
|eval "PAYROLL_STATUS"=if((like(EXTRACT_STATUS,"%ACK%")) OR (like(EXTRACT_STATUS,"%CMP%")) OR (like(EXTRACT_STATUS,"%SND%")) OR (like(EXTRACT_STATUS,"%ACK%")) AND (like(POST_STATUS,"%ACK%")),"SUCCESSFULL","PENDING")
|search PAYROLL_STATUS ="PENDING" 
|eval STATUS=if(((like(EXTRACT_STATUS,"%SUS%")) AND (like(POST_STATUS="%OFA%")),"Pending In OFAC"),if((like(EXTRACT_STATUS,"%RVP%")) AND (like(POST_STATUS="%RVP%")),"Pending In Fraud"))
|stats count by CUST_NAME COUNTRY_CODE CUST_ID PAYROLL_STATUS PAYRILL_IND STATUS VAR_COMPARE_DT
Tags (1)

richgalloway
SplunkTrust
SplunkTrust

If the idx_cibca_lookups_prod index only contains 500 records, consider writing it to a lookup file and using lookup instead of join.

---
If this reply helps you, Karma would be appreciated.

dwaddle
SplunkTrust
SplunkTrust

To add to Rich's excellent advice ... repeat the manta, "Splunk is not an SQL database" several times. The join command is useful in a few limited cases, but most of the time there are profoundly better ways of solving your problem than trying to use join.

But beyond that I'll admit I'm having trouble following your search - especially without some sample data

0 Karma

anantdeshpande
Path Finder

Sorry but we are not allowed to use lookup.

0 Karma

richgalloway
SplunkTrust
SplunkTrust

That's too bad. Joins are expensive and getting rid of them could have earned you a bonus. 🙂

There are some small optimizations you can make.
Move the dedup command before eventstats.
Consider moving the 'search PAYROLL_STATUS ="PENDING"' command and it's associated eval as close to the beginning of the search as you can. With some tweaking you may be able to include it in the base search and reduce the number of rows read.

---
If this reply helps you, Karma would be appreciated.
0 Karma
Get Updates on the Splunk Community!

Index This | I am a number, but when you add ‘G’ to me, I go away. What number am I?

March 2024 Edition Hayyy Splunk Education Enthusiasts and the Eternally Curious!  We’re back with another ...

What’s New in Splunk App for PCI Compliance 5.3.1?

The Splunk App for PCI Compliance allows customers to extend the power of their existing Splunk solution with ...

Extending Observability Content to Splunk Cloud

Register to join us !   In this Extending Observability Content to Splunk Cloud Tech Talk, you'll see how to ...