Solved: Endpoint data model not filling itself

Erendouille

Hello!

I'm using Splunk_SA_CIM with ESS, and I'm currently studying most of the ESCU correlation searches for my own purposes.

Problem: I discovered that most of my ESCU rules create a lot of notable events which, after investigation, all turned out to be false positives. All these rules are based on fields coming from the Endpoint data model (for example, Processes.process_path), and because most of the process_path values are equal to "null", the search triggers and creates a notable event.

I've already updated every app I use, and I'm using the Splunk_TA_Windows add-on to gather Windows data.

Do you have any clue as to how I can find where the problem is and solve it?

PickleRick

Well. This is a fairly generic question and to answer it you have to look into your own data.

The Endpoint datamodel definition is fairly well known and you can browse through its details any time in the GUI. You know which indexes the datamodel pulls the events from. So you must check the data quality in your indexes, check whether the sourcetypes have proper extractions, and check whether your sources provide you with relevant data. If there is no data in your events, what is Splunk supposed to do? Guess? 🙂
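As a rough sketch of that check, a search along these lines counts, per sourcetype, how many Processes events carry a real process_path and how many only carry a placeholder. It assumes the placeholder values are the literal strings "null" and "unknown" reported in the question; the names state, filled, unfilled and pct_unfilled are made up for the example.

```spl
| tstats summariesonly=false count from datamodel=Endpoint.Processes
    by sourcetype Processes.process_path
| rename Processes.process_path as process_path
| eval state=if(process_path="null" OR process_path="unknown", "unfilled", "filled")
| chart sum(count) over sourcetype by state
| fillnull value=0 filled unfilled
| eval pct_unfilled=round(100*unfilled/(filled+unfilled), 1)
```

A sourcetype with a high pct_unfilled is probably the one whose raw events or field extractions deserve a closer look. The same search can be repeated for Processes.parent_process_name.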

It's not about repairing a datamodel, because the datamodel is just an abstract definition. It's about repairing your data or its parsing rules so that the necessary fields are extracted from your events. That's what CIM compliance means. If you have a TA for a specific technology which tells you it's CIM-compliant, you can expect the fields to be filled properly (and you could file a bug report if they aren't ;-)). But sometimes TAs require you to configure your source in a specific way, because otherwise not all relevant data is sent in the events.

So it all boils down to having data and knowing your data.

Erendouille

Thanks for your answer, @gcusello!

Yes, I'm aware that some of our searches appear multiple times because of the "trigger configuration", but this wasn't really the question. Sorry if I misled you.

My question was really about why the data coming from the Endpoint data model is not fully populated (for example, 99% of the parent_process_name values are "unknown" and 97% of the process_path values are "null"), and how I can "repair" the data model so that every field has a value, which would mean no more false positives and a less crowded ESS dashboard.

But thanks anyway for your quick reply! 🙂

gcusello

Hi @Erendouille ,

The only way is to tune the Correlation Search so that it filters out events where the field is "unknown" or "null".

One hint: don't modify the original Correlation Searches; clone them and modify the clones in a custom app (called e.g. "SA-SOC").
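As an illustration only, the kind of filter meant here could look like the following in a cloned search. This is a generic sketch, not the body of any particular ESCU rule, and it assumes the placeholder values are the literal strings "null" and "unknown" mentioned in the question.

```spl
| tstats summariesonly=false count min(_time) as firstTime max(_time) as lastTime
    from datamodel=Endpoint.Processes
    where Processes.process_path!="null" Processes.parent_process_name!="unknown"
    by Processes.dest Processes.user Processes.process_name Processes.process_path Processes.parent_process_name
| rename Processes.* as *
```

Keep in mind that this only hides events with unpopulated fields; it does not populate them, so a real detection that happens to lack those fields would be filtered out as well.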

Ciao.

Giuseppe

gcusello

Hi @Erendouille ,

In my experience, every Correlation Search requires a tuning phase to adjust its thresholds.

In addition, one option is not to create a Notable for each occurrence of a Correlation Search, but to use the Risk Score action instead. That way you may detect an issue later, but SOC analysts have far fewer Notables to analyze.

Ciao.

Giuseppe
