Splunk Search

Combining the results from 2 indexes in splunk query

abhinav_aashish
Explorer

I have one index idx1 and other index idx2 and a common column "A" on which matching needs to be done.

I'm facing difficulty in combining the data from both the columns.
I've to combine the data in such a way that if there is duplicate then the data from idx1 must be prioritized over data from idx2; i.e. basically equivalent of set operation [a+(b-a)].

I've tried the following :

| set diff [ search index=idx2 sourcetype=src | dedup A ] [search index=idx1 sourcetype=src | dedup A ]
| stats count BY index A
| table index A


Here I get total 10840 statistics with both columns filled.

But when I want to display other columns from both the indexes I get empty columns for those.


Upon executing :

| set diff [ search index=1idx1 sourcetype=src | dedup A ] [search index=idx2 sourcetype=src | dedup A ]
| stats count BY index

I get the output as

index count
idx1 4791
idx2 6049

 

Can anyone help me how should I proceed??

I've tried even this but not sure

index=idx1 sourcetype=src
| append [
| set diff [ search index=idx2 sourcetype=src | dedup A ] [search index=idx1 sourcetype=src | dedup A ]]
| stats count BY index A
| table index A

 

 

Labels (3)
0 Karma

tread_splunk
Splunk Employee
Splunk Employee

Depends what you want in your output.  Are idx1 and idx2 exact replicas in terms of event structure?  Whats your definition of "duplicate"?  Are you able to provide some indicative / test events in idx1 and idx2 and desired results?  Don't assume you need to use SET type operations.  You can normally achieve desired results with STATS command.  Start your query with...

(index=idx1 OR index=idx2) AND sourcetype=src 

...then go to work with STATS command.

Tags (1)
0 Karma

bowesmana
SplunkTrust
SplunkTrust

@abhinav_aashish 

Not sure if this will solve the question, but after stats, you will lose the original fields, so if you want to preserve the data through the stats, then use

| stats values(*) as * count BY index A

so this values(*) as * will just collect all the unique values in the original data and put in the results table with the original field name. 

0 Karma

abhinav_aashish
Explorer

I tried something like this :

index=idx1 sourcetype=src
| append [search index=idx2 sourcetype=src ]
| dedup A| table A B C D

idx1 had 4791 events

idx2 had 6049 events

(idx1-idx2) has 2590 events

(idx2-idx1) has 3848 events

union of 2 indexes has 8639 events

intersection of 2 indexes has 2201 events

So after executing the above query I got 8639 events

2590+3848+2201=8639

I think its correct...

Any suggestions are welcomed

0 Karma
Get Updates on the Splunk Community!

Announcing Scheduled Export GA for Dashboard Studio

We're excited to announce the general availability of Scheduled Export for Dashboard Studio. Starting in ...

Extending Observability Content to Splunk Cloud

Watch Now!   In this Extending Observability Content to Splunk Cloud Tech Talk, you'll see how to leverage ...

More Control Over Your Monitoring Costs with Archived Metrics GA in US-AWS!

What if there was a way you could keep all the metrics data you need while saving on storage costs?This is now ...