
Merge indexes with flatmap

victor_znk
Loves-to-Learn Lots

Hello,

I'd like your help merging two indexes. The first index contains flat JSON documents. The second index also contains JSON documents, but each one holds an array of documents. For example:

First index

{
  "field1": "value1",
  "field2": "value2"
}

Second index 

 

{
  ...other fields...
  "documents": [{
     "field1": "value1",
     "field2": "value2"
  }, {
     "field1": "value1",
     "field2": "value2"
  }]
}

 

I want to retrieve and flatmap the documents from the second index, then merge them with the first index so that I can run stats operations across both.

Thank you 


tscroggins
Influencer

@victor_znk 

Are you trying to emulate a flatMap() function, or are you trying to expand the objects in the second index's events' documents array into separate events?

If the latter, you can manipulate the array into individual events using rex, mvexpand, and spath:

index=a
| append
    [ search index=b
    | rex "\"documents\": \\[(?<documents>.*)\\]"
    | rex field=documents max_match=0 ",?(?<documents>{.*?})"
    | fields _time documents
    | mvexpand documents
    | spath input=documents
    | fields - documents ]
| stats count by field1 field2

Depending on your statistical analysis requirements, you may also get the correct result by simply searching both indexes and renaming auto-extracted fields:

index IN (a b)
| rename documents{}.* as *
| stats sum(field1) sum(field2)


victor_znk
Loves-to-Learn Lots

Hi, I was able to solve my problem using:

 

index="index1"
| append
    [ search index="index2"
    | spath path="documents{}" output=documents
    | mvexpand documents
    | eval _raw=documents
    | kv ]

 

Thank you


tscroggins
Influencer

Nice. Much easier to read. I tend to approach problems like this with regular expressions, but the JSON parser will take care of edge cases. Not sure which performs better, though.
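For anyone who wants to try the spath-based expansion without access to either index, here is a minimal sketch built on makeresults. The sample event and its values (a, b, c, d, and the "other" field) are made up for illustration; only the shape of the documents array comes from the question.

```spl
| makeresults
``` Hypothetical sample event shaped like the second index's documents ```
| eval _raw="{\"other\": \"x\", \"documents\": [{\"field1\": \"a\", \"field2\": \"b\"}, {\"field1\": \"c\", \"field2\": \"d\"}]}"
``` Pull each object of the array into a multivalue field ```
| spath path=documents{} output=documents
``` One event per array element ```
| mvexpand documents
``` Extract field1 and field2 from each element ```
| spath input=documents
| table field1 field2
```

This should return one row per object in the array, with field1 and field2 as ordinary single-value fields, which is the shape needed before appending to the first index and running stats.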


victor_znk
Loves-to-Learn Lots

In fact, I'm now facing an issue with the mvexpand command:

[MULTISEARCH #2]command.mvexpand: output will be truncated at 25000 results due to excessive memory usage. Memory threshold of 500MB
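One thing that may be worth trying for this warning: mvexpand copies every field on the event for each expanded value, so dropping everything except the array (including _raw) before expanding tends to reduce memory use. The sketch below is a variation on the search above, not a tested fix; whether it stays under the threshold depends on event volume and array size. It uses spath input=documents in place of eval _raw / kv, and the final stats line is only a placeholder for whatever aggregation is actually needed.

```spl
index="index1"
| append
    [ search index="index2"
    | spath path="documents{}" output=documents
    ``` Keep only what mvexpand needs; _raw must be removed explicitly ```
    | fields _time documents
    | fields - _raw
    | mvexpand documents
    | spath input=documents
    | fields - documents ]
| stats count by field1 field2
```

If that is not enough, the threshold itself is, as far as I know, governed by max_mem_usage_mb in limits.conf, which an admin would have to raise.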
