Deployment Architecture

Use summary index in join statement

deepak02
Path Finder

Hi,

I have a query which looks like this:

Search 1 | fields userID, logonID
| join logonID [Search 2 | fields logonActions]
| table _time, userID, logonID, logonActions

I want to create summary indexes for Search 1 and Search 2.

SummaryIndex 1 | fields userID, logonID
| join logonID [Summary Index 2 | fields logonActions]
| table _time, userID, logonID, logonActions

Will it work? Can I use summary index inside a join statement?

Thanks,
Deepak

Tags (1)
0 Karma
1 Solution

DalJeanis
Legend

Your actual searches, to get any results, had better be more like

Search 1 | fields userID, logonID 
| join logonID [Search 2 | fields logonID logonActions]
| table _time, userID, logonID, logonActions

Therefore, if you do...

Search 1 | table _time userID, logonID | collect to summaryIndex1

Search 2  | table logonID  logonActions| collect to summaryIndex2

...then your second overall search code would have the same result as your first one, all things being equal. You would have to make any minor adjustments related to the way you are specifying the time, but otherwise it should operate identically.

You should only do this, however, if your Search 1 and Search 2 are pretty complex and discard most of the data in your underlying events.

View solution in original post

0 Karma

DalJeanis
Legend

Your actual searches, to get any results, had better be more like

Search 1 | fields userID, logonID 
| join logonID [Search 2 | fields logonID logonActions]
| table _time, userID, logonID, logonActions

Therefore, if you do...

Search 1 | table _time userID, logonID | collect to summaryIndex1

Search 2  | table logonID  logonActions| collect to summaryIndex2

...then your second overall search code would have the same result as your first one, all things being equal. You would have to make any minor adjustments related to the way you are specifying the time, but otherwise it should operate identically.

You should only do this, however, if your Search 1 and Search 2 are pretty complex and discard most of the data in your underlying events.

0 Karma

deepak02
Path Finder

Thankyou.

How should I do the join between summaryIndex1 and summaryIndex1?

I want to connect search 1 and search 2 using logonID and display _time, userID, logonID, logonActions in one table.

0 Karma

DalJeanis
Legend

Okay, you need to define your extractions, and then the join takes care of itself.
If the exact data returned by your Search 1 is loaded to summaryindex1, and the exact data returned by your Search 2 is loaded to SummaryIndex2, then the join looks like this...

index=SummaryIndex1 | table _time userID, logonID 
 | join logonID [Index=SummaryIndex2 | table logonID logonActions]
 | table _time, userID, logonID, logonActions

Now, i don't know what the data actually is, so I can't give advice about that. I suspect there's something wrong above, because this amount of data wouldn't deserve establishing a summary index. If the logonActions are the real interesting data, and are associated with _time, then I'd switch the join around.

If it's not a lot of data, and you don't need it stored for long after the reporting, then you might use csv files instead of summary indexes. It looks pretty much the same...

 | inputcsv mySearch1,csv | table _time userID, logonID 
 | join logonID [| inputcsv mySearch2,csv | table logonID logonActions]
 | table _time, userID, logonID, logonActions
0 Karma

deepak02
Path Finder

Thankyou very much!

You are right, this is just an example.
The actual data and queries are much more complicated, and we have been have performance issues because of the amount of data being loaded into the dashboard each time.

DalJeanis
Legend

By the way, you don't actually need a join there - you can do this instead, especially if you have simplified the incoming data by putting it in summary indexes...

 index=SummaryIndex1  OR index=SummaryIndex2 
| table _time userID, logonID, logonActions
| stats values(*) as * by logonID

You have to watch out, just slightly, if SummaryIndex2 has a _time field that you don't care about.

 index=SummaryIndex1  OR index=SummaryIndex2 
| table _time userID, logonID, logonActions
| eval _time=if(index=SummaryIndex2 ,null(),_time)
| stats values(*) as * by logonID
0 Karma
Get Updates on the Splunk Community!

.conf24 | Registration Open!

Hello, hello! I come bearing good news: Registration for .conf24 is now open!   conf is Splunk’s rad annual ...

ICYMI - Check out the latest releases of Splunk Edge Processor

Splunk is pleased to announce the latest enhancements to Splunk Edge Processor.  HEC Receiver authorization ...

Introducing the 2024 SplunkTrust!

Hello, Splunk Community! We are beyond thrilled to announce our newest group of SplunkTrust members!  The ...