Getting Data In

mvfind output as spath input?

Dworsnop
Path Finder

I'm looking to get some json data from our anomaly detection system into the Intrusion Detection data model and thus need to map the fields to the CIM. The json events vary depending on the model being 'breached' and therefore not all events will contain the dest and src fields in the same place.

 

The JSON data contains many multi-value fields. Where the required data sits in its own single-value field I can just alias it, but sometimes it is inside an array, and not always at the same position (that depends on the triggers that caused each model to breach). So I need to search each array for certain indicators such as "Destination Endpoint" (let's call this 'A') and then map the actual endpoint name ('B') from another field, using the array location of A. I've been looking at the mvfind function, but before I spend a great deal of time on this I was wondering whether my approach is correct, or whether what I want to do is even possible in the first place. E.g. can I use the output of mvfind as an input to spath?

 

Once I've got a search working I'll be looking to extract the values automatically, and I'm assuming that doing this at search time would still be okay for the data model? The frequency of events is not very high (one or two every five minutes or less), so I don't think an index-time extraction would put too much load on the HF/indexer.

 

I can't really share an example event due to the nature of its contents, but let me know if there's any more info that would help. Thanks very much in advance.


ITWhisperer
SplunkTrust

When you say "I need to search each array for certain indicators such as 'Destination Endpoint' (let's call this 'A') and then map the actual endpoint name ('B') from another field, using the array location of A", does that mean that you have two "parallel" arrays? Assuming this to be the case, and that you have extracted these to two multi-value fields, can you mvzip them together to at least get "Destination Endpoint" and the corresponding "actual endpoint" into a single value of the combined field?
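
A minimal sketch of the mvzip idea, assuming the two parallel arrays have already been extracted into multi-value fields. The field names (trigger_name, trigger_value) and the sample values are made up for illustration and are built with makeresults so the search can be run without any indexed data:

```spl
| makeresults
| eval trigger_name=split("Source Endpoint,Destination Endpoint,Protocol", ",")
| eval trigger_value=split("laptop-17,files.example.com,tcp", ",")
| eval pair=mvzip(trigger_name, trigger_value, ":")
| eval dest_pair=mvindex(mvfilter(match(pair, "^Destination Endpoint:")), 0)
| eval dest=replace(dest_pair, "^Destination Endpoint:", "")
| table pair, dest
```

Here mvzip joins the values position by position, mvfilter keeps only the pairs whose name part matches, and mvindex(..., 0) takes the first match so that replace works on a single value. This only holds if both arrays really are the same length and in the same order.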


rnowitzki
Builder

Hi @Dworsnop,

Not sure if this helps you, but I had fun playing around with mvfind, mvindex and spath.

Conclusion: You cannot use a field value as the array index in an spath path.
So, this does not work:

| eval n=1
| spath output=somefield path=yourarray{n}

But you can extract the whole array into a multivalue field with spath and then pick the desired value with mvindex, which does accept a field value as the index.

To test it, I indexed a JSON document with a drinks array and combined it with some meals.
(The AI picked the wrong drink for my burger, so I corrected it 🙂 )

source="json_drinks"
| eval meals="pasta burger pizza"
| makemv meals
| eval n=mvfind(meals, "burger")
| eval meal_selection=mvindex(meals, n)
| spath output=drinks path=drinks{}
| eval drink_selection=mvindex(drinks, n)
| eval noiwantbeer=n-1
| eval drink_selection_correction=mvindex(drinks, noiwantbeer)
| table meal_selection, drink_selection, drink_selection_correction

[Screenshot of the search results: json_drinks_and_meals.PNG]

For completeness, the drinks array I used:

{
  "drinks": [
    "beer",
    "coke",
    "water"
  ]
}
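
Applied to the original question, the same mvfind plus mvindex pattern might look like the sketch below. The field names (indicators, endpoints) and the values are hypothetical stand-ins; with real events the two split() lines would be replaced by spath calls that pull each array into a multi-value field:

```spl
| makeresults
| eval indicators=split("Source Endpoint,Destination Endpoint,Protocol", ",")
| eval endpoints=split("laptop-17,files.example.com,tcp", ",")
| eval idx=mvfind(indicators, "^Destination Endpoint$")
| eval dest=mvindex(endpoints, idx)
| table idx, dest
```

mvfind returns the zero-based position of the first value matching the regular expression (or null if nothing matches), and that position is then used to read the corresponding entry from the second field. It relies on both arrays having the same order and length.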


BR 
Ralph

--
Karma and/or Solution tagging appreciated.

Dworsnop
Path Finder

Hi @rnowitzki, thanks very much for the reply. I've used @ITWhisperer's method and it's got me halfway there now.


Dworsnop
Path Finder

Thanks very much @ITWhisperer, I should have spent more time looking at the mv eval functions. That's worked a treat. Now for the hard part...

I now have an mv field containing thousands of details for each model breach, but because each model looks at different connections/activity, the src and dest information will be called different things (e.g. "Connection hostname:<value>", "Destination IP:<value>","Internal source device name:<value>" and so on).

What would be the best way to extract those src and dest values, once I've determined the correct name for each model being breached (there are going to be 50+ models I'll have to do this for)? My search currently looks like this...

...
| spath output=trigger_name path=triggeredComponents{}.triggeredFilters{}.filterType
| spath output=trigger_value path=triggeredComponents{}.triggeredFilters{}.trigger.value
| eval new_trig=mvzip(trigger_name, trigger_value, ":")
| stats count by model.name, new_trig

Once I've got the matching src and dest fields for each model, where & how would I perform these extractions at search/index time? I already have a TA for the data, would I put it in props.conf, inputs.conf or somewhere else?

Thanks again!  🙂
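
One possible way to pull src and dest out of the zipped field, sketched under assumptions: the names "Connection hostname", "Destination IP" and "Internal source device name" come from the examples above, the values are invented, and which names map to src or dest for a given model is a guess that would need checking against the real data.

```spl
| makeresults
| eval new_trig=split("Connection hostname:files.example.com;Internal source device name:laptop-17;Message:other detail", ";")
| eval dest_pair=mvindex(mvfilter(match(new_trig, "^(Connection hostname|Destination IP):")), 0)
| eval src_pair=mvindex(mvfilter(match(new_trig, "^Internal source device name:")), 0)
| eval dest=replace(dest_pair, "^[^:]+:", "")
| eval src=replace(src_pair, "^[^:]+:", "")
| table src, dest
```

The alternation in each match() is where further per-model names would be added. Since this is all eval logic, the usual search-time home for it is a calculated field (an EVAL-<fieldname> setting) in the TA's props.conf rather than inputs.conf, though the exact expressions would depend on how the JSON fields are extracted for the sourcetype.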
