Splunk Search

Why is the join command running for a long time?

sarit_s
Communicator

Hello,

I have a query that is running for a long time, is it because of the join part?

What is the best way to replace it?

Thanks.

Query:

(index=ssys_*_pj OR index=other) AND (RequestedHeadVoltagesSupport OR RequestedHeadVoltagesModel )
source=***
index!=ssys_internal*
source=*_HM.cfg
| `SerialNumber`
| search SerialNumber=$SerialNumber$
| search SerialNumber!="pjserialnumber"
|`Region`
| search Region=$Region$
| rex field=source "PerMachine_(?<WizardMode>[^\.]+)" 
| eval WizardModeRenamed=if(WizardMode=="HM" OR WizardMode=="HS","HS/HM",WizardMode) 
| rex "RequestedHeadVoltagesSupport=(?<head0>[^,]+),(?<head1>[^,]+)," 
| rex "RequestedHeadVoltagesModel=[^,]+,[^,]+,(?<head2>[^,]+),(?<head3>[^,]+),(?<head4>[^,]+),(?<head5>[^,]+),(?<head6>[^,]+),(?<head7>[^,]+)"
| stats latest(*) as * max(_time) as fileDate latest(source) as source by SerialNumber Region WizardMode
| table fileDate Region SerialNumber WizardModeRenamed WizardMode head* source
| eval id=Region+"@"+SerialNumber+"@"+WizardModeRenamed+"@"+WizardMode+"@"+fileDate+"@"+source
| table id head* 
| rename head* as *
| untable id head HeadVoltage
| eval id=split(id,"@")
| eval Region=mvindex(id,0)
| eval SerialNumber=mvindex(id,1)
| eval WizardModeRenamed=mvindex(id,2)
| eval WizardMode=mvindex(id,3)
| eval HeadVoltageDate=mvindex(id,4)
| eval source=mvindex(id,5)
|`PrinterType`

| join Region SerialNumber PrinterType head  type=left
    [search (index=ssys_*_pj OR index=other) NOT source=*Bio_Mimics*  
source=***    
("Serial Number") OR ("Maintenance counter \"H*")
index!=ssys_internal*
|`SerialNumber`
| search SerialNumber=$SerialNumber$
| search SerialNumber!="pjserialnumber"
|`PrinterType`
|`Region`
| search Region=$Region$
| eval head_lifetime_date=if(isnotnull(head_lifetime),_time,null())
| fields Region SerialNumber PrinterType head HeadSerialNumber HeadType head_lifetime head_lifetime_date
| stats latest(*) as * by Region SerialNumber PrinterType head
| search HeadSerialNumber="$HeadSerialNumber$" HeadType="$HeadType$"]
| eval hsn_prefix=substr(HeadSerialNumber,1,5)
| eval flag=if(hsn_prefix>=11808 AND hsn_prefix<=11902 AND HeadVoltage>=29.5,"red",0)
| table Region PrinterType SerialNumber head HeadSerialNumber HeadType WizardMode HeadVoltage HeadVoltageDate head_lifetime head_lifetime_date source flag
| rename head_lifetime as last_known_head_lifetime head_lifetime_date as last_known_head_lifetime_date
| convert ctime(HeadVoltageDate) ctime(last_known_head_lifetime_date) timeformat="%F %T"
0 Karma

MuS
Legend

Hi sarit_s,

As a striating point combine your two base search into one search:

( index=ssys_*_pj OR index=other) AND (RequestedHeadVoltagesSupport OR RequestedHeadVoltagesModel )  source=*** index!=ssys_internal* source=*_HM.cfg ) OR ( (index=ssys_*_pj OR index=other) NOT source=*Bio_Mimics*  source=*** ("Serial Number") OR ("Maintenance counter \"H*") index!=ssys_internal* )
| stats values(*) AS * by Region SerialNumber PrinterType 

then use all the eval, macros, table and so on on this result set. Much faster and you don't hit hidden limits of join.

You can read more about this here https://answers.splunk.com/answers/129424/how-to-compare-fields-over-multiple-sourcetypes-without-jo...

Hope this helps ...

cheers, MuS

0 Karma

sarit_s
Communicator

Hi MuS
Thanks !
the query returns no results 😞

0 Karma

MuS
Legend

Hi there,

look the only one that can make that work is you.

I would recommend to take baby steps and start with a simple base search like this:

( index=ssys_*_pj OR index=other) AND (RequestedHeadVoltagesSupport OR RequestedHeadVoltagesModel )  source=*** index!=ssys_internal* source=*_HM.cfg ) OR ( (index=ssys_*_pj OR index=other) NOT source=*Bio_Mimics*  source=*** ("Serial Number") OR ("Maintenance counter \"H*") index!=ssys_internal* )

from there work your way forward by adding a fields to the search to only return what you really need. After that add all the eval,macros, and additional search filters until you get what you need.

cheers, MuS

0 Karma

sarit_s
Communicator

thanks Mus.
i tried this :

( index=ssys_*_pj OR index=other) AND (RequestedHeadVoltagesSupport OR RequestedHeadVoltagesModel )  (source=*** index!=ssys_internal* source=*_HM.cfg ) OR ( (index=ssys_*_pj OR index=other) NOT source=*Bio_Mimics*  source=*** ("Serial Number") OR ("Maintenance counter \"H*") index!=ssys_internal* )
| `SerialNumber`
| search SerialNumber=*
| search SerialNumber!="pjserialnumber"
|`Region`
| search Region=*
| rex field=source "PerMachine_(?<WizardMode>[^\.]+)" 
| eval WizardModeRenamed=if(WizardMode=="HM" OR WizardMode=="HS","HS/HM",WizardMode) 
| rex "RequestedHeadVoltagesSupport=(?<head0>[^,]+),(?<head1>[^,]+)," 
| rex "RequestedHeadVoltagesModel=[^,]+,[^,]+,(?<head2>[^,]+),(?<head3>[^,]+),(?<head4>[^,]+),(?<head5>[^,]+),(?<head6>[^,]+),(?<head7>[^,]+)"
| stats latest(*) as * max(_time) as fileDate latest(source) as source by SerialNumber Region WizardMode
| table fileDate Region SerialNumber WizardModeRenamed WizardMode head* source
| eval id=Region+"@"+SerialNumber+"@"+WizardModeRenamed+"@"+WizardMode+"@"+fileDate+"@"+source
| table id head* 
| rename head* as *
| untable id head HeadVoltage
| eval id=split(id,"@")
| eval Region=mvindex(id,0)
| eval SerialNumber=mvindex(id,1)
| eval WizardModeRenamed=mvindex(id,2)
| eval WizardMode=mvindex(id,3)
| eval HeadVoltageDate=mvindex(id,4)
| eval source=mvindex(id,5)
|`PrinterType`
| eval head_lifetime_date=if(isnotnull(head_lifetime),_time,null())
| fields Region SerialNumber PrinterType head HeadSerialNumber HeadType head_lifetime head_lifetime_date
| stats latest(*) as * by Region SerialNumber PrinterType head
| search HeadSerialNumber="*" HeadType="*"
| eval hsn_prefix=substr(HeadSerialNumber,1,5)
| eval flag=if(hsn_prefix>=11808 AND hsn_prefix<=11902 AND HeadVoltage>=29.5,"red",0)
| table Region PrinterType SerialNumber head HeadSerialNumber HeadType WizardMode HeadVoltage HeadVoltageDate head_lifetime head_lifetime_date source flag
| rename head_lifetime as last_known_head_lifetime head_lifetime_date as last_known_head_lifetime_date
| convert ctime(HeadVoltageDate) ctime(last_known_head_lifetime_date) timeformat="%F %T"

it is also slow but not returning any results also there are 3,222 events

0 Karma

sarit_s
Communicator

any idea ?

0 Karma

koshyk
Super Champion

your query can be vastly improved imo. Please give us sample data and final output you expect

0 Karma

sarit_s
Communicator

this is the link to image of the output

https://imgur.com/vXTvKZT

0 Karma

sarit_s
Communicator

any idea ?

0 Karma
Get Updates on the Splunk Community!

Stay Connected: Your Guide to May Tech Talks, Office Hours, and Webinars!

Take a look below to explore our upcoming Community Office Hours, Tech Talks, and Webinars this month. This ...

They're back! Join the SplunkTrust and MVP at .conf24

With our highly anticipated annual conference, .conf, comes the fez-wearers you can trust! The SplunkTrust, as ...

Enterprise Security Content Update (ESCU) | New Releases

Last month, the Splunk Threat Research Team had two releases of new security content via the Enterprise ...