HI query joining 2 searches on left join.
Its matching some rows and not matching some rows although the column where I join on is clearly seen in both searches.
index=sky sourcetype=sky_trade_murex_timestamp
| rex field=_raw "trade_id=\"(?<trade_id>\d+)\""
| rex field=_raw "mx_status=\"(?<mx_status>[^\"]+)\""
| rex field=_raw "sky_id=\"(?<sky_id>\d+)\""
| rex field=_raw "event_id=\"(?<event_id>\d+)\""
| rex field=_raw "operation=\"(?<operation>[^\"]+)\""
| rex field=_raw "action=\"(?<action>[^\"]+)\""
| rex field=_raw "tradebooking_sgp=\"(?<tradebooking_sgp>[^\"]+)\""
| rex field=_raw "portfolio_name=\"(?<portfolio_name>[^\"]+)\""
| rex field=_raw "portfolio_entity=\"(?<portfolio_entity>[^\"]+)\""
| rex field=_raw "trade_type=\"(?<trade_type>[^\"]+)\""
| rename trade_id as NB
| dedup NB
| eval NB = tostring(trim(NB))
| table sky_id, NB, event_id, mx_status, operation, action, tradebooking_sgp, portfolio_name, portfolio_entity, trade_type
| join type=left NB
[ search index=sky sourcetype=mx_to_sky
| rex field=_raw "(?<NB>\d+);(?<TRN_STATUS>[^;]+);(?<NOMINAL>[^;]+);(?<CURRENCY>[^;]+);(?<TRN_FMLY>[^;]+);(?<TRN_GRP>[^;]+);(?<TRN_TYPE>[^;]*);(?<BPFOLIO>[^;]*);(?<SPFOLIO>[^;]*)"
| eval NB = tostring(trim(NB))
| table TRN_STATUS, NB, NOMINAL, CURRENCY, TRN_FMLY, TRN_GRP, TRN_TYPE, BPFOLIO, SPFOLIO]
| table sky_id, NB, event_id, mx_status, operation, action, tradebooking_sgp, portfolio_name, portfolio_entity, trade_type, TRN_STATUS, NOMINAL, CURRENCY, TRN_FMLY, TRN_GRP, TRN_TYPE, BPFOLIO, SPFOLIO
This above is my source code
And the raw data is
Time Event
27/12/2024
17:05:39.000
32265376;DEAD;3887.00000000;XAU;CURR;FXD;FXD;CM TR GLD AUS;X_CMTR XAU SWAP
host = APPSG002SIN0117source = D:\SkyNet\data\mx_trade_report\MX2_TRADE_STATUS_20241227_200037.csvsourcetype = mx_to_sky
Time Event
27/12/2024
18:05:36.651
2024-12-27 18:05:36.651, system="murex", id="645131777", sky_id="645131777", trade_id="32265483", event_id="100023788", mx_status="DEAD", operation="NETTING", action="insertion", tradebooking_sgp="2024/12/26 01:02:01.0000", eventtime_sgp="2024/12/26 01:01:51.7630", sky_to_mq_latency="-9.-237", portfolio_name="I CREDIT INC", portfolio_entity="ANZSEC INC", trade_type="BondTrade"
host = APPSG002SIN0032source = sky_trade_murex_timestamp sourcetype = sky_trade_murex_timestamp
Hi all,
this i get about
| join type=left NB
[ search index=sky sourcetype=mx_to_sky
| rex field=_raw "(?<NB>[^;]+);(?<TRN_STATUS>[^;]+);(?<NOMINAL>[^;]+);(?<CURRENCY>[^;]+);(?<TRN_FMLY>[^;]+);(?<TRN_GRP>[^;]+);(?<TRN_TYPE>[^;]*);(?<BPFOLIO>[^;]*);(?<SPFOLIO>[^;]*)"
| table TRN_STATUS, NB, NOMINAL, CURRENCY, TRN_FMLY, TRN_GRP, TRN_TYPE, BPFOLIO, SPFOLIO]
| table sky_id, NB, event_id, mx_status, operation, action, tradebooking_sgp, portfolio_name, portfolio_entity, trade_type
| table sky_id, NB, event_id, mx_status, operation, action, tradebooking_sgp, portfolio_name, portfolio_entity, trade_type, TRN_STATUS, NOMINAL, CURRENCY, TRN_FMLY, TRN_GRP, TRN_TYPE, BPFOLIO, SPFOLIO
index=sky sourcetype=mx_to_sky
| rex field=_raw "(?<NB>[^;]+);(?<TRN_STATUS>[^;]+);(?<NOMINAL>[^;]+);(?<CURRENCY>[^;]+);(?<TRN_FMLY>[^;]+);(?<TRN_GRP>[^;]+);(?<TRN_TYPE>[^;]*);(?<BPFOLIO>[^;]*);(?<SPFOLIO>[^;]*)"
| table TRN_STATUS, NB, NOMINAL, CURRENCY, TRN_FMLY, TRN_GRP, TRN_TYPE, BPFOLIO, SPFOLIO
This below i get about 1,065,810 events in a day
Assuming both your joined searches produce proper results (it's up to you to check it - we don't know), the easiest and most straightforward way to avoid join altogether is to use multisearch to run those in parallel and then stats the results. This way you're not prone to hit join's limits and since your searches are streaming ones you can use multisearch and you're not limited by subsearch limits which you might hit when using append.
| multisearch [
search index=sky sourcetype=sky_trade_murex_timestamp
| rex field=_raw "trade_id=\"(?<trade_id>\d+)\""
| rex field=_raw "mx_status=\"(?<mx_status>[^\"]+)\""
| rex field=_raw "sky_id=\"(?<sky_id>\d+)\""
| rex field=_raw "event_id=\"(?<event_id>\d+)\""
| rex field=_raw "operation=\"(?<operation>[^\"]+)\""
| rex field=_raw "action=\"(?<action>[^\"]+)\""
| rex field=_raw "tradebooking_sgp=\"(?<tradebooking_sgp>[^\"]+)\""
| rex field=_raw "portfolio_name=\"(?<portfolio_name>[^\"]+)\""
| rex field=_raw "portfolio_entity=\"(?<portfolio_entity>[^\"]+)\""
| rex field=_raw "trade_type=\"(?<trade_type>[^\"]+)\""
| rename trade_id as NB
| dedup NB
| eval NB = tostring(trim(NB))
| table sky_id, NB, event_id, mx_status, operation, action, tradebooking_sgp, portfolio_name, portfolio_entity, trade_type ]
[ search index=sky sourcetype=mx_to_sky
| rex field=_raw "(?<NB>\d+);(?<TRN_STATUS>[^;]+);(?<NOMINAL>[^;]+);(?<CURRENCY>[^;]+);(?<TRN_FMLY>[^;]+);(?<TRN_GRP>[^;]+);(?<TRN_TYPE>[^;]*);(?<BPFOLIO>[^;]*);(?<SPFOLIO>[^;]*)"
| eval NB = tostring(trim(NB))
| table TRN_STATUS, NB, NOMINAL, CURRENCY, TRN_FMLY, TRN_GRP, TRN_TYPE, BPFOLIO, SPFOLIO ]
| stats values(*) as * by NB
Thanks!! I get this error though
Ahhhh... The "table" command is a transforming command. So you can't use it in either search. Use "fields" instead.
ah yes i removed those and the search continued. But it was so laggy as there were many events, i did not get a proper search result without it hanging
And the problem is these columns are empty for some and populated for some. For those empty, I clearly checked the NB is matching in both searches
TRN_STATUS, NOMINAL, CURRENCY, TRN_FMLY, TRN_GRP, TRN_TYPE, BPFOLIO, SPFOLIO
Your regex assumes (insists!) that the event has 9 fields separated by (8) semi-colons - your sample data has only 8 fields separated by 7 semi-colons.
Sorry how do i rectify it?
32265376;DEAD;3887.00000000;XAU;CURR;FXD;FXD;CM TR GLD AUS;X_CMTR XAU SWAP
Did u mean this
Sorry I miscounted, it does look right - the issue here is that the trade_id does not match the first field in the mx_to_sky event
No problem. But why is it that some rows are populated and some are not then? i.e some match and some don't match. I renamed trade_id as NB then left=join NB so it should join 2 of them together but why does it not work for some rows or columns although clearly it matches in both searches?
If there are results for the trade_id/NB in both searches, then it is possible that the rex has not extracted the fields as you expect. Please share the two events which don't match
Its quite random but here is one
2024-12-27 23:05:09.917, system="murex", id="645437844", sky_id="645437844", trade_id="31791027", event_id="100038914", mx_status="live", operation="nooperation", action="fixing", tradebooking_sgp="2024/12/27 08:42:39.0000", eventtime_sgp="2024/12/27 08:42:33.6400", sky_to_mq_latency="-5.-360", portfolio_name="A ZZ AUD LTR", portfolio_entity="ANZBG MELB", trade_type="BasisSwap"
31791027;LIVE;17500000.00000000;AUD;IRD;IRS;;A ZZ AUD LTR;X_GCM_IRD_AUCNZ
index=sky sourcetype=sky_trade_murex_timestamp
| rex field=_raw "trade_id=\"(?<trade_id>\d+)\""
| rex field=_raw "mx_status=\"(?<mx_status>[^\"]+)\""
| rex field=_raw "sky_id=\"(?<sky_id>\d+)\""
| rex field=_raw "event_id=\"(?<event_id>\d+)\""
| rex field=_raw "operation=\"(?<operation>[^\"]+)\""
| rex field=_raw "action=\"(?<action>[^\"]+)\""
| rex field=_raw "tradebooking_sgp=\"(?<tradebooking_sgp>[^\"]+)\""
| rex field=_raw "portfolio_name=\"(?<portfolio_name>[^\"]+)\""
| rex field=_raw "portfolio_entity=\"(?<portfolio_entity>[^\"]+)\""
| rex field=_raw "trade_type=\"(?<trade_type>[^\"]+)\""
| eval trade_id = replace(trade_id, "\"", "")
| rename trade_id as NB
| table sky_id, NB, event_id, mx_status, operation, action, tradebooking_sgp, portfolio_name, portfolio_entity, trade_type
| join type=left NB
[ search index=sky sourcetype=mx_to_sky
| rex field=_raw "(?<NB>[^;]+);(?<TRN_STATUS>[^;]+);(?<NOMINAL>[^;]+);(?<CURRENCY>[^;]+);(?<TRN_FMLY>[^;]+);(?<TRN_GRP>[^;]+);(?<TRN_TYPE>[^;]*);(?<BPFOLIO>[^;]*);(?<SPFOLIO>[^;]*)"
| table TRN_STATUS, NB, NOMINAL, CURRENCY, TRN_FMLY, TRN_GRP, TRN_TYPE, BPFOLIO, SPFOLIO]
| table sky_id, NB, event_id, mx_status, operation, action, tradebooking_sgp, portfolio_name, portfolio_entity, trade_type
| table sky_id, NB, event_id, mx_status, operation, action, tradebooking_sgp, portfolio_name, portfolio_entity, trade_type, TRN_STATUS, NOMINAL, CURRENCY, TRN_FMLY, TRN_GRP, TRN_TYPE, BPFOLIO, SPFOLIO
For that NB i get all columns and empty for TRN_STATUS, NOMINAL, CURRENCY, TRN_FMLY, TRN_GRP, TRN_TYPE, BPFOLIO, SPFOLIO
Subsearches are limited to 50,000 events which could account for the missing matches - try reducing the timeframes - do the matches appear then?
Thank you both, I need for >50k/>10k events
I am thinking of using appendcols but they are not able to join like this.
Any other work around?
both searches amass quite a huge number of events >50k >10k and I need to search for today which would be alot.
As has been said here many times, it is best to avoid using join - this is a classic case of why join should be avoided.
Try something like this
index=sky sourcetype=sky_trade_murex_timestamp OR sourcetype=mx_to_sky
``` Parse sky_trade_murex_timestamp events (note that trade_id is put directly into the NB field) ```
| rex field=_raw "trade_id=\"(?<NB>\d+)\""
| rex field=_raw "mx_status=\"(?<mx_status>[^\"]+)\""
| rex field=_raw "sky_id=\"(?<sky_id>\d+)\""
| rex field=_raw "event_id=\"(?<event_id>\d+)\""
| rex field=_raw "operation=\"(?<operation>[^\"]+)\""
| rex field=_raw "action=\"(?<action>[^\"]+)\""
| rex field=_raw "tradebooking_sgp=\"(?<tradebooking_sgp>[^\"]+)\""
| rex field=_raw "portfolio_name=\"(?<portfolio_name>[^\"]+)\""
| rex field=_raw "portfolio_entity=\"(?<portfolio_entity>[^\"]+)\""
| rex field=_raw "trade_type=\"(?<trade_type>[^\"]+)\""
``` Parse mx_to_sky events ```
| rex field=_raw "(?<NB>[^;]+);(?<TRN_STATUS>[^;]+);(?<NOMINAL>[^;]+);(?<CURRENCY>[^;]+);(?<TRN_FMLY>[^;]+);(?<TRN_GRP>[^;]+);(?<TRN_TYPE>[^;]*);(?<BPFOLIO>[^;]*);(?<SPFOLIO>[^;]*)"
``` Reduce to just the fields of interest ```
| table sky_id, NB, event_id, mx_status, operation, action, tradebooking_sgp, portfolio_name, portfolio_entity, trade_type, TRN_STATUS, NOMINAL, CURRENCY, TRN_FMLY, TRN_GRP, TRN_TYPE, BPFOLIO, SPFOLIO
``` "Join" events by NB using stats ```
| stats values(*) as * by NB
im gonna try this thanks!!
I think i got something like all the results into one row and performance is very bad as there are many events i did not manage to get a proper search result
Try replacing the table command with fields
index=sky sourcetype=sky_trade_murex_timestamp OR sourcetype=mx_to_sky
``` Parse sky_trade_murex_timestamp events (note that trade_id is put directly into the NB field) ```
| rex field=_raw "trade_id=\"(?<NB>\d+)\""
| rex field=_raw "mx_status=\"(?<mx_status>[^\"]+)\""
| rex field=_raw "sky_id=\"(?<sky_id>\d+)\""
| rex field=_raw "event_id=\"(?<event_id>\d+)\""
| rex field=_raw "operation=\"(?<operation>[^\"]+)\""
| rex field=_raw "action=\"(?<action>[^\"]+)\""
| rex field=_raw "tradebooking_sgp=\"(?<tradebooking_sgp>[^\"]+)\""
| rex field=_raw "portfolio_name=\"(?<portfolio_name>[^\"]+)\""
| rex field=_raw "portfolio_entity=\"(?<portfolio_entity>[^\"]+)\""
| rex field=_raw "trade_type=\"(?<trade_type>[^\"]+)\""
``` Parse mx_to_sky events ```
| rex field=_raw "(?<NB>[^;]+);(?<TRN_STATUS>[^;]+);(?<NOMINAL>[^;]+);(?<CURRENCY>[^;]+);(?<TRN_FMLY>[^;]+);(?<TRN_GRP>[^;]+);(?<TRN_TYPE>[^;]*);(?<BPFOLIO>[^;]*);(?<SPFOLIO>[^;]*)"
``` Reduce to just the fields of interest ```
| fields sky_id, NB, event_id, mx_status, operation, action, tradebooking_sgp, portfolio_name, portfolio_entity, trade_type, TRN_STATUS, NOMINAL, CURRENCY, TRN_FMLY, TRN_GRP, TRN_TYPE, BPFOLIO, SPFOLIO
``` "Join" events by NB using stats ```
| dedup NB
| stats values(*) as * by NB
Remove the dedup NB! This is reducing your events to one event per NB which is why you are only getting half your data!
Good catch. I remembered that dedup was needlessly used in both "compound searches" but didn't notice that it was transfered to the "composite search". Indeed it only leaves us with one of the two or more "joinable" events.