Splunk Search

Is there an alternative to JOIN when records are more than 50000?

srujan9292
Explorer

Scenario - I have two indexes: index1 and index2.

Inner Query: I need to compare two indexes (Index1 and Index2) with Group Number and CORP_ENT_CD combination. If there was a match, i would extract DCN's of those matching columns.

Outer Query: I need to compare index1 and the result from above set and need to display unmatched columns.

Query:

index=index1
| rex "DCN (?<DCN>.*)-SL:(?<ServiceLn>.*)-TS:(?<timestamp>\d+)" 
 |join type=left DCN [ search 
     index=index1
      | rex "DCN (?<DCN>.*)-SL:(?<ServiceLn>.*)-TS:(?<timestamp>\d+)" 
      |eval dummy=GRP_NBR + CORP_ENT_CD
          |join dummy [search index=index2 earliest=-15d@d
          |rename "Group Number" as Group 
          |search Group=*
          |eval dummy= Group + CORP_ENT_CD 
                                  ]
      |fields DCN |table DCN 
      |eval matched="Yes"
                                  ]
 |fillnull value="No" matched
 |search matched="No"
---------

Is there any other way to increase the performance of this solution?

Tags (1)
0 Karma

woodcock
Esteemed Legend

Try this:

index=index1
| rex "DCN (?<DCN>.*)-SL:(?<ServiceLn>.*)-TS:(?<timestamp>\d+)" 
| search NOT [search index=index1 OR (index=index2 earliest=-15d@d)
| rex "DCN (?<DCN2>.*)-SL:(?<ServiceLn>.*)-TS:(?<timestamp>\d+)" 
| eval joiner=colesce($Group Number$, GRP_NBR) + CORP_ENT_CD
| eval DCN=coalesce(DCN, DCN2)
| stats values(DCN) AS DCN dc(index) AS indexCount BY joiner
| search indexCount>1
| table DCN
| mvexpand DCN
| dedup DCN]

niketn
Legend

@srujan9292, Try out union command or multisearch command which will not be impacted by sub-search limitation (depending on ow you are correlating, for example multisearch needs the searches to be correlated to be streaming).

____________________________________________
| makeresults | eval message= "Happy Splunking!!!"
0 Karma

baldwintm
Path Finder

Does this get what you want?:

index=index1 OR index=index2
| rex "DCN (?<DCN>.*)-SL:(?<ServiceLn>.*)-TS:(?<timestamp>\d+)"  | eval GRP_NBR=coalesce(GRP_NBR, Group Number)
| eval dummy=GRP_NBR + CORP_ENT_CD | fillnull value="" DCN
| stats count values (DCN) by dummy
| where count=1
0 Karma

kariras06
Explorer

This has helped me evaluate the different alternatives available in the past... Take a look to see if one may be helpful for your situation. Hope it helps! http://docs.splunk.com/Documentation/Splunk/7.1.3/Search/Abouteventcorrelation

Sukisen1981
Champion

I am facing the same issue as described here..any suggestions?

0 Karma
Career Survey
First 500 qualified respondents will receive a $20 gift card! Tell us about your professional Splunk journey.

Can’t make it to .conf25? Join us online!

Get Updates on the Splunk Community!

What Is Splunk? Here’s What You Can Do with Splunk

Hey Splunk Community, we know you know Splunk. You likely leverage its unparalleled ability to ingest, index, ...

Level Up Your .conf25: Splunk Arcade Comes to Boston

With .conf25 right around the corner in Boston, there’s a lot to look forward to — inspiring keynotes, ...

Manual Instrumentation with Splunk Observability Cloud: How to Instrument Frontend ...

Although it might seem daunting, as we’ve seen in this series, manual instrumentation can be straightforward ...