Merge indexes with flatmap

victor_znk
Loves-to-Learn Lots

Hello,

I'm asking for your help merging two indexes. The first index is simply composed of JSON documents. The second index is also made up of JSON documents, but each one contains an array of documents. For example:

First index

{
  "field1": "value1",
  "field2": "value2"
}

Second index

{
  ...other fields...
  "documents": [{
     "field1": "value1",
     "field2": "value2"
  }, {
     "field1": "value1",
     "field2": "value2"
  }]
}

I want to retrieve and flatmap the documents from the second index, then merge them with the first index so that I can run stats operations on the combined data.

Thank you 

tscroggins
Influencer

@victor_znk 

Are you trying to emulate a flatMap() function, or are you trying to expand the objects in the second index's events' documents array into separate events?

If the latter, you can manipulate the array into individual events using rex, mvexpand, and spath:

index=a
| append
    [ search index=b
    | rex "\"documents\": \\[(?<documents>.*)\\]"
    | rex field=documents max_match=0 ",?(?<documents>{.*?})"
    | fields _time documents
    | mvexpand documents
    | spath input=documents
    | fields - documents ]
| stats count by field1 field2

Depending on your statistical analysis requirements, you may also get the correct result by simply searching both indexes and renaming auto-extracted fields:

index IN (a b)
| rename documents{}.* as *
| stats sum(field1) sum(field2)

victor_znk
Loves-to-Learn Lots

Hi, I was able to solve my problem using:

index="index1"
| append
    [ search index="index2"
    | spath path="documents{}" output=documents
    | mvexpand documents
    | eval _raw=documents
    | kv ]
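For anyone who wants to try this pattern without the two indexes, here is a minimal sketch built on makeresults. The sample event and its values (a1, b1, a2, b2) are made up for illustration, and spath input=documents stands in for eval _raw=documents | kv, because an event created by makeresults has no sourcetype configuration for kv to rely on:

| makeresults
| eval _raw="{\"other\": \"x\", \"documents\": [{\"field1\": \"a1\", \"field2\": \"b1\"}, {\"field1\": \"a2\", \"field2\": \"b2\"}]}"
| spath path="documents{}" output=documents
| mvexpand documents
| spath input=documents
| table field1 field2

This should return one row per element of the documents array, which is the shape needed before appending to the first index.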

 

Thank you

tscroggins
Influencer

Nice. Much easier to read. I tend to approach problems like this with regular expressions, but the JSON parser will take care of edge cases. Not sure which performs better, though.

victor_znk
Loves-to-Learn Lots

In fact, I'm now facing an issue with the mvexpand command:

[MULTISEARCH #2]command.mvexpand: output will be truncated at 25000 results due to excessive memory usage. Memory threshold of 500MB
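One thing that may help, offered as a sketch rather than a guaranteed fix: mvexpand duplicates every remaining field on an event for each expanded value, so dropping everything that is not needed before the expansion usually lowers its memory use. The version below assumes that only _time and the documents array are needed downstream. It removes _raw explicitly, since fields keeps internal fields by default, and it parses each element with spath input=documents, as in the earlier reply:

index="index1"
| append
    [ search index="index2"
    | spath path="documents{}" output=documents
    | fields _time documents
    | fields - _raw
    | mvexpand documents
    | spath input=documents
    | fields - documents ]

If the output is still truncated, the threshold itself is governed by the max_mem_usage_mb setting in limits.conf, which an administrator can raise on Splunk Enterprise.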
