Splunk Search

Why is the join command running for a long time?

sarit_s
Communicator

Hello,

I have a query that is running for a long time, is it because of the join part?

What is the best way to replace it?

Thanks.

Query:

(index=ssys_*_pj OR index=other) AND (RequestedHeadVoltagesSupport OR RequestedHeadVoltagesModel )
source=***
index!=ssys_internal*
source=*_HM.cfg
| `SerialNumber`
| search SerialNumber=$SerialNumber$
| search SerialNumber!="pjserialnumber"
|`Region`
| search Region=$Region$
| rex field=source "PerMachine_(?<WizardMode>[^\.]+)" 
| eval WizardModeRenamed=if(WizardMode=="HM" OR WizardMode=="HS","HS/HM",WizardMode) 
| rex "RequestedHeadVoltagesSupport=(?<head0>[^,]+),(?<head1>[^,]+)," 
| rex "RequestedHeadVoltagesModel=[^,]+,[^,]+,(?<head2>[^,]+),(?<head3>[^,]+),(?<head4>[^,]+),(?<head5>[^,]+),(?<head6>[^,]+),(?<head7>[^,]+)"
| stats latest(*) as * max(_time) as fileDate latest(source) as source by SerialNumber Region WizardMode
| table fileDate Region SerialNumber WizardModeRenamed WizardMode head* source
| eval id=Region+"@"+SerialNumber+"@"+WizardModeRenamed+"@"+WizardMode+"@"+fileDate+"@"+source
| table id head* 
| rename head* as *
| untable id head HeadVoltage
| eval id=split(id,"@")
| eval Region=mvindex(id,0)
| eval SerialNumber=mvindex(id,1)
| eval WizardModeRenamed=mvindex(id,2)
| eval WizardMode=mvindex(id,3)
| eval HeadVoltageDate=mvindex(id,4)
| eval source=mvindex(id,5)
|`PrinterType`

| join Region SerialNumber PrinterType head  type=left
    [search (index=ssys_*_pj OR index=other) NOT source=*Bio_Mimics*  
source=***    
("Serial Number") OR ("Maintenance counter \"H*")
index!=ssys_internal*
|`SerialNumber`
| search SerialNumber=$SerialNumber$
| search SerialNumber!="pjserialnumber"
|`PrinterType`
|`Region`
| search Region=$Region$
| eval head_lifetime_date=if(isnotnull(head_lifetime),_time,null())
| fields Region SerialNumber PrinterType head HeadSerialNumber HeadType head_lifetime head_lifetime_date
| stats latest(*) as * by Region SerialNumber PrinterType head
| search HeadSerialNumber="$HeadSerialNumber$" HeadType="$HeadType$"]
| eval hsn_prefix=substr(HeadSerialNumber,1,5)
| eval flag=if(hsn_prefix>=11808 AND hsn_prefix<=11902 AND HeadVoltage>=29.5,"red",0)
| table Region PrinterType SerialNumber head HeadSerialNumber HeadType WizardMode HeadVoltage HeadVoltageDate head_lifetime head_lifetime_date source flag
| rename head_lifetime as last_known_head_lifetime head_lifetime_date as last_known_head_lifetime_date
| convert ctime(HeadVoltageDate) ctime(last_known_head_lifetime_date) timeformat="%F %T"
0 Karma

MuS
Legend

Hi sarit_s,

As a striating point combine your two base search into one search:

( index=ssys_*_pj OR index=other) AND (RequestedHeadVoltagesSupport OR RequestedHeadVoltagesModel )  source=*** index!=ssys_internal* source=*_HM.cfg ) OR ( (index=ssys_*_pj OR index=other) NOT source=*Bio_Mimics*  source=*** ("Serial Number") OR ("Maintenance counter \"H*") index!=ssys_internal* )
| stats values(*) AS * by Region SerialNumber PrinterType 

then use all the eval, macros, table and so on on this result set. Much faster and you don't hit hidden limits of join.

You can read more about this here https://answers.splunk.com/answers/129424/how-to-compare-fields-over-multiple-sourcetypes-without-jo...

Hope this helps ...

cheers, MuS

0 Karma

sarit_s
Communicator

Hi MuS
Thanks !
the query returns no results 😞

0 Karma

MuS
Legend

Hi there,

look the only one that can make that work is you.

I would recommend to take baby steps and start with a simple base search like this:

( index=ssys_*_pj OR index=other) AND (RequestedHeadVoltagesSupport OR RequestedHeadVoltagesModel )  source=*** index!=ssys_internal* source=*_HM.cfg ) OR ( (index=ssys_*_pj OR index=other) NOT source=*Bio_Mimics*  source=*** ("Serial Number") OR ("Maintenance counter \"H*") index!=ssys_internal* )

from there work your way forward by adding a fields to the search to only return what you really need. After that add all the eval,macros, and additional search filters until you get what you need.

cheers, MuS

0 Karma

sarit_s
Communicator

thanks Mus.
i tried this :

( index=ssys_*_pj OR index=other) AND (RequestedHeadVoltagesSupport OR RequestedHeadVoltagesModel )  (source=*** index!=ssys_internal* source=*_HM.cfg ) OR ( (index=ssys_*_pj OR index=other) NOT source=*Bio_Mimics*  source=*** ("Serial Number") OR ("Maintenance counter \"H*") index!=ssys_internal* )
| `SerialNumber`
| search SerialNumber=*
| search SerialNumber!="pjserialnumber"
|`Region`
| search Region=*
| rex field=source "PerMachine_(?<WizardMode>[^\.]+)" 
| eval WizardModeRenamed=if(WizardMode=="HM" OR WizardMode=="HS","HS/HM",WizardMode) 
| rex "RequestedHeadVoltagesSupport=(?<head0>[^,]+),(?<head1>[^,]+)," 
| rex "RequestedHeadVoltagesModel=[^,]+,[^,]+,(?<head2>[^,]+),(?<head3>[^,]+),(?<head4>[^,]+),(?<head5>[^,]+),(?<head6>[^,]+),(?<head7>[^,]+)"
| stats latest(*) as * max(_time) as fileDate latest(source) as source by SerialNumber Region WizardMode
| table fileDate Region SerialNumber WizardModeRenamed WizardMode head* source
| eval id=Region+"@"+SerialNumber+"@"+WizardModeRenamed+"@"+WizardMode+"@"+fileDate+"@"+source
| table id head* 
| rename head* as *
| untable id head HeadVoltage
| eval id=split(id,"@")
| eval Region=mvindex(id,0)
| eval SerialNumber=mvindex(id,1)
| eval WizardModeRenamed=mvindex(id,2)
| eval WizardMode=mvindex(id,3)
| eval HeadVoltageDate=mvindex(id,4)
| eval source=mvindex(id,5)
|`PrinterType`
| eval head_lifetime_date=if(isnotnull(head_lifetime),_time,null())
| fields Region SerialNumber PrinterType head HeadSerialNumber HeadType head_lifetime head_lifetime_date
| stats latest(*) as * by Region SerialNumber PrinterType head
| search HeadSerialNumber="*" HeadType="*"
| eval hsn_prefix=substr(HeadSerialNumber,1,5)
| eval flag=if(hsn_prefix>=11808 AND hsn_prefix<=11902 AND HeadVoltage>=29.5,"red",0)
| table Region PrinterType SerialNumber head HeadSerialNumber HeadType WizardMode HeadVoltage HeadVoltageDate head_lifetime head_lifetime_date source flag
| rename head_lifetime as last_known_head_lifetime head_lifetime_date as last_known_head_lifetime_date
| convert ctime(HeadVoltageDate) ctime(last_known_head_lifetime_date) timeformat="%F %T"

it is also slow but not returning any results also there are 3,222 events

0 Karma

sarit_s
Communicator

any idea ?

0 Karma

koshyk
Super Champion

your query can be vastly improved imo. Please give us sample data and final output you expect

0 Karma

sarit_s
Communicator

this is the link to image of the output

https://imgur.com/vXTvKZT

0 Karma

sarit_s
Communicator

any idea ?

0 Karma
Career Survey
First 500 qualified respondents will receive a $20 gift card! Tell us about your professional Splunk journey.

Can’t make it to .conf25? Join us online!

Get Updates on the Splunk Community!

Community Content Calendar, September edition

Welcome to another insightful post from our Community Content Calendar! We're thrilled to continue bringing ...

Splunkbase Unveils New App Listing Management Public Preview

Splunkbase Unveils New App Listing Management Public PreviewWe're thrilled to announce the public preview of ...

Leveraging Automated Threat Analysis Across the Splunk Ecosystem

Are you leveraging automation to its fullest potential in your threat detection strategy?Our upcoming Security ...