Hi,
I have two indexes - index=A and index=B Index A has events which index B do not have. And I am only interested in events which are present in index A and B. I want to filter them by trace id as this is common part for them. However trace id field has different name in both indexes. In A is journey.traceId in B is request.event.traceId.
How do I get results from both indexes, where traceIds are the same ?
I tried to do inner join as it seems that this would be good solution for that but I have 0 results , so I do it wrong.
index=A app=my-app journey.traceId=*
| fields journey.traceId
| rename journey.traceId as traceId
| join traceId type=inner [search index=B request.event.traceId=* "event"=Strategy AND "eventState"=IN_PROGRESS
| fields request.event.traceId
| rename request.event.traceId as traceId]
I would appreciate if someone could point me in the right direction, either how to properly do inner join or maybe there are better solutions for the problem I am trying to solve;
Thank you
Hi @aasiaa,
Splunk isn't a DB: join is a command to use only if you don't have any other solution because it's very slow!
So try a different approach correlating values using stats, something like this:
(index=A app=my-app journey.traceId=*) OR (index=B request.event.traceId=* "event"=Strategy AND "eventState"=IN_PROGRESS)
| eval traceId=coalesce(journey.traceId,request.event.traceId)
Stats dc(index) AS index_count BY traceId
| where index_count=2Ciao.
Giuseppe
Hi @gcusello , really appreciate your response. I tried your search and it gives me all the results from both indexes.
Index A has 200k events and B has 10k, with the search I got 210k events, I would expect to get 20k back, as all the events from B should be present in A, and I am not interested in any more events from A then just those with the same traceId as in B
Hi @aasiaa,
using my search (with the final where condition) you take only the traceId present in both the indexes,
you shouldn't have the sum of events.
Are you sure that traceIds have unique value?
if you're interested on the events from indexA that are also in index B, you can use the above search or this an easier search that has the limit that results from indexB must be less than 50,000:
index=A app=my-app journey.traceId=* [ search index=B request.event.traceId=* "event"=Strategy AND "eventState"=IN_PROGRESS
| fields request.event.traceId
| rename request.event.traceId as journey.traceId ]
| rename journey.traceId as traceIdCiao.
Giuseppe
Actually I double checked and in index A there may be couple of events with the same trace Id. But if the event is present also in B there should be just one event for the same trace id in A. Does it make a difference for the search ?
Hi @aasiaa,
no, I asked to understand what's the number of expected results:
are you sure that using the where condition at the end you have more results than the search on indexA?
could you share your search?
Ciao.
Giuseppe
@gcusello thank you for still looking at my issue;
my query is
(index=A app=my-app journey.traceId=*) OR (index=B request.event.traceId=* "event"=Strategy AND "eventState"=IN_PROGRESS)
| eval traceId=coalesce(journey.traceId,request.event.traceId)
| stats dc(index) AS index_count BY traceId
| where index_count=2search results before 'eval' so (index=A **** ) or (index=B ****) are exactly the same for the whole query, so it looks like 'eval', 'stats' and/or 'where' does not do anything; when running search for index separate
index=A app=my-app journey.traceId=* gives me 200k events
index=B request.event.traceId=* "event"=Strategy AND "eventState"=IN_PROGRESS gives me 10k events