Splunk Search

Is there an alternative to JOIN when records are more than 50000?

srujan9292
Explorer

Scenario - I have two indexes: index1 and index2.

Inner Query: I need to compare two indexes (Index1 and Index2) with Group Number and CORP_ENT_CD combination. If there was a match, i would extract DCN's of those matching columns.

Outer Query: I need to compare index1 and the result from above set and need to display unmatched columns.

Query:

index=index1
| rex "DCN (?<DCN>.*)-SL:(?<ServiceLn>.*)-TS:(?<timestamp>\d+)" 
 |join type=left DCN [ search 
     index=index1
      | rex "DCN (?<DCN>.*)-SL:(?<ServiceLn>.*)-TS:(?<timestamp>\d+)" 
      |eval dummy=GRP_NBR + CORP_ENT_CD
          |join dummy [search index=index2 earliest=-15d@d
          |rename "Group Number" as Group 
          |search Group=*
          |eval dummy= Group + CORP_ENT_CD 
                                  ]
      |fields DCN |table DCN 
      |eval matched="Yes"
                                  ]
 |fillnull value="No" matched
 |search matched="No"
---------

Is there any other way to increase the performance of this solution?

Tags (1)
0 Karma

woodcock
Esteemed Legend

Try this:

index=index1
| rex "DCN (?<DCN>.*)-SL:(?<ServiceLn>.*)-TS:(?<timestamp>\d+)" 
| search NOT [search index=index1 OR (index=index2 earliest=-15d@d)
| rex "DCN (?<DCN2>.*)-SL:(?<ServiceLn>.*)-TS:(?<timestamp>\d+)" 
| eval joiner=colesce($Group Number$, GRP_NBR) + CORP_ENT_CD
| eval DCN=coalesce(DCN, DCN2)
| stats values(DCN) AS DCN dc(index) AS indexCount BY joiner
| search indexCount>1
| table DCN
| mvexpand DCN
| dedup DCN]

niketn
Legend

@srujan9292, Try out union command or multisearch command which will not be impacted by sub-search limitation (depending on ow you are correlating, for example multisearch needs the searches to be correlated to be streaming).

____________________________________________
| makeresults | eval message= "Happy Splunking!!!"
0 Karma

baldwintm
Path Finder

Does this get what you want?:

index=index1 OR index=index2
| rex "DCN (?<DCN>.*)-SL:(?<ServiceLn>.*)-TS:(?<timestamp>\d+)"  | eval GRP_NBR=coalesce(GRP_NBR, Group Number)
| eval dummy=GRP_NBR + CORP_ENT_CD | fillnull value="" DCN
| stats count values (DCN) by dummy
| where count=1
0 Karma

kariras06
Explorer

This has helped me evaluate the different alternatives available in the past... Take a look to see if one may be helpful for your situation. Hope it helps! http://docs.splunk.com/Documentation/Splunk/7.1.3/Search/Abouteventcorrelation

Sukisen1981
Champion

I am facing the same issue as described here..any suggestions?

0 Karma
Get Updates on the Splunk Community!

ICYMI - Check out the latest releases of Splunk Edge Processor

Splunk is pleased to announce the latest enhancements to Splunk Edge Processor.  HEC Receiver authorization ...

Introducing the 2024 SplunkTrust!

Hello, Splunk Community! We are beyond thrilled to announce our newest group of SplunkTrust members!  The ...

Introducing the 2024 Splunk MVPs!

We are excited to announce the 2024 cohort of the Splunk MVP program. Splunk MVPs are passionate members of ...