Splunk Search

Is there an alternative to JOIN when records are more than 50000?

srujan9292
Explorer

Scenario - I have two indexes: index1 and index2.

Inner Query: I need to compare two indexes (Index1 and Index2) with Group Number and CORP_ENT_CD combination. If there was a match, i would extract DCN's of those matching columns.

Outer Query: I need to compare index1 and the result from above set and need to display unmatched columns.

Query:

index=index1
| rex "DCN (?<DCN>.*)-SL:(?<ServiceLn>.*)-TS:(?<timestamp>\d+)" 
 |join type=left DCN [ search 
     index=index1
      | rex "DCN (?<DCN>.*)-SL:(?<ServiceLn>.*)-TS:(?<timestamp>\d+)" 
      |eval dummy=GRP_NBR + CORP_ENT_CD
          |join dummy [search index=index2 earliest=-15d@d
          |rename "Group Number" as Group 
          |search Group=*
          |eval dummy= Group + CORP_ENT_CD 
                                  ]
      |fields DCN |table DCN 
      |eval matched="Yes"
                                  ]
 |fillnull value="No" matched
 |search matched="No"
---------

Is there any other way to increase the performance of this solution?

Tags (1)
0 Karma

woodcock
Esteemed Legend

Try this:

index=index1
| rex "DCN (?<DCN>.*)-SL:(?<ServiceLn>.*)-TS:(?<timestamp>\d+)" 
| search NOT [search index=index1 OR (index=index2 earliest=-15d@d)
| rex "DCN (?<DCN2>.*)-SL:(?<ServiceLn>.*)-TS:(?<timestamp>\d+)" 
| eval joiner=colesce($Group Number$, GRP_NBR) + CORP_ENT_CD
| eval DCN=coalesce(DCN, DCN2)
| stats values(DCN) AS DCN dc(index) AS indexCount BY joiner
| search indexCount>1
| table DCN
| mvexpand DCN
| dedup DCN]

niketnilay
Legend

@srujan9292, Try out union command or multisearch command which will not be impacted by sub-search limitation (depending on ow you are correlating, for example multisearch needs the searches to be correlated to be streaming).

____________________________________________
| makeresults | eval message= "Happy Splunking!!!"
0 Karma

baldwintm
Path Finder

Does this get what you want?:

index=index1 OR index=index2
| rex "DCN (?<DCN>.*)-SL:(?<ServiceLn>.*)-TS:(?<timestamp>\d+)"  | eval GRP_NBR=coalesce(GRP_NBR, Group Number)
| eval dummy=GRP_NBR + CORP_ENT_CD | fillnull value="" DCN
| stats count values (DCN) by dummy
| where count=1
0 Karma

kariras06
Explorer

This has helped me evaluate the different alternatives available in the past... Take a look to see if one may be helpful for your situation. Hope it helps! http://docs.splunk.com/Documentation/Splunk/7.1.3/Search/Abouteventcorrelation

Sukisen1981
Champion

I am facing the same issue as described here..any suggestions?

0 Karma
Did you miss .conf21 Virtual?

Good news! The event's keynotes and many of its breakout sessions are now available online, and still totally FREE!