ML in Security: Insider Threat Detection - Wed 5/24/23

Published on 04-13-2023 02:11 PM by Splunk Employee | Updated on 06-26-2023 02:42 PM

Register here and ask questions below this thread for the Office Hours session on ML in Security: Insider Threat Detection on Wed, May 24, 2023 at 1pm PT / 4pm ET.

 

Join our bi-weekly Office Hour series where technical Splunk experts answer questions and provide how-to guidance on a different topic every month! This Office Hours session will cover anything related to how to deploy and use machine learning for insider threat detection. The panel will consist of expert Splunk ML and Threat Researchers. Come with any questions around leveraging the Machine Learning Toolkit app (MLTK), the Data Science and Deep Learning app (DSDL), Enterprise Security, or User Behavior Analytics (UBA) to detect insider threats and accelerate threat hunting with Splunk.

 

Please submit your questions below as comments in advance. You can also head to the #office-hours user Slack channel to ask questions (request access here).  Prefer to submit anonymously? Fill out this form.

 

Pre-submitted questions will be prioritized. After that, we will go in order of the questions posted below, then will open the floor up to live Q&A with meeting participants. If there’s a quick answer available, we’ll post as a direct reply.

 

Look forward to connecting!



adepp
Splunk Employee

Hey Everyone!

Drop your questions/comments here for any topics you'd like to see discussed in the Community Office Hours session (you can also head to the #office-hours user Slack channel to ask questions and join the discussion - request access here).

Your questions can include anything around leveraging the MLTK app, the DSDL app, Enterprise Security, or User Behavior Analytics (UBA) to accelerate threat hunting with Splunk, or anything else you'd like to learn about implementing ML with Splunk.

adepp
Splunk Employee

Here are some of the questions from the session:

  • Q1: While there is certainly no one-size-fits-all response, how does the resource utilization for applying ML for detection affect resource planning for ES?
    • A: ML detections mostly run as correlation searches, and the deployment guide for Enterprise Security helps customers review load and configure their deployment. As the question notes, resource requirements depend on the case. I would suggest two best practices:
      • 1) Make the search head as powerful as you can, because MLTK needs it.
      • 2) Schedule MLTK correlation searches far enough apart that dependent MLTK runs have enough time to finish.
  • Q3: What are Splunk's offerings for DIY insider threat detection powered by ML?
    • A: Splunk offers two ML platforms for developing your own ML-based detections: MLTK (Machine Learning Toolkit) and DSDL (Data Science and Deep Learning toolkit).
  • Q4: What is your view on using Deep Learning vs. using other machine learning techniques for threat detection?
    • A: Deep Learning is a type of ML technique. DL is normally well suited to tasks for which large amounts of data are available. Simpler ML techniques such as Random Forests and SVMs may work better for smaller amounts of data. DL also works very well for language models. So which tool is best for the job depends on the task at hand and the nature of the data.
  • Q5: What's the performance impact on Splunk when we use machine learning?
    • A: UBA and DSDL require separate infrastructure, so they don’t affect the performance of Splunk itself. ES/MLTK-based ML, however, does affect your Splunk infrastructure; how much depends on training and inference time and the resources they need.
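
The second best practice under Q1 (spacing out dependent MLTK searches) can be sketched in Python. This is a minimal illustration and not Splunk functionality: the search names, the runtime estimates, and the `stagger_schedules` helper are all hypothetical. It only computes hourly five-field cron expressions, of the kind a scheduled search accepts, so that each run starts after the previous one is expected to have finished.

```python
def stagger_schedules(search_names, est_runtime_minutes, buffer_minutes=5):
    """Give each search an hourly cron expression that starts after the
    previous search's estimated runtime plus a safety buffer."""
    schedules = {}
    start_minute = 0
    for name in search_names:
        runtime = est_runtime_minutes[name]
        # Every run must start and finish inside the same hourly window.
        if start_minute > 59 or start_minute + runtime > 60:
            raise ValueError(f"{name} does not fit in the hourly window")
        # Standard 5-field cron: run at `start_minute` past every hour.
        schedules[name] = f"{start_minute} * * * *"
        start_minute += runtime + buffer_minutes
    return schedules


# Hypothetical chain: fit a model, apply it, then alert on the scores.
searches = ["fit_baseline", "apply_baseline", "alert_on_outliers"]
runtimes = {"fit_baseline": 10, "apply_baseline": 5, "alert_on_outliers": 3}

for name, cron in stagger_schedules(searches, runtimes).items():
    print(f"{name}: {cron}")
```

The runtime estimates would have to come from your own observations of how long each search takes; if the chain no longer fits in an hour, the helper raises an error rather than silently overlapping runs.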
adepp
Splunk Employee

Questions from the Live Q&A 

Q1:  Is there a specific customer size for UBA qualification?

  • A: If you mean the number of people in the organization, ML detections are often much more useful when they run over a population. If your organization has only five people, ML detection is perhaps not the best approach. With 1,000 or 5,000 people, the results are likely to be more accurate. Either way, you will definitely need a team to manage UBA, because when there is a detection or an alert, somebody has to diagnose it, understand the reasons, and decide how to act.
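
To illustrate why population size matters, here is a minimal Python sketch of a peer-group comparison. It is an assumption-laden example and not how UBA computes anything: the `peer_zscore` helper and the download counts are hypothetical. It scores one user against a peer group and also reports the standard error of the peer baseline, which shrinks as the group grows.

```python
import math
import statistics


def peer_zscore(user_value, peer_values):
    """Return (z_score, standard_error) of a user against a peer group."""
    if len(peer_values) < 2:
        raise ValueError("need at least two peers to estimate spread")
    mean = statistics.mean(peer_values)
    stdev = statistics.stdev(peer_values)  # sample standard deviation
    # Uncertainty of the baseline itself; shrinks as the population grows.
    std_error = stdev / math.sqrt(len(peer_values))
    if stdev == 0:
        z = 0.0 if user_value == mean else math.inf
    else:
        z = (user_value - mean) / stdev
    return z, std_error


# Hypothetical daily file-download counts for a five-person peer group.
small_team = [10, 12, 8, 10, 10]
z, se = peer_zscore(20, small_team)
print(f"z-score: {z:.2f}, baseline standard error: {se:.2f}")
```

With only five peers, a single unusual colleague shifts the baseline noticeably, so the same score is less trustworthy than it would be against a thousand peers.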

Q2:  As SSE and ESCU are sort of 'free' detections, are there plans to continue updates to UBA Content Updates? I notice it hasn't been updated on Splunkbase since 2020.

  • A: Yes, definitely expect new content coming to you. We are actively working on it, as they are both very important to us. Check out the detections we linked in question number 1 in the deck. 

Q3: Where does UBA get events from? Do we need one feed to Splunk and a separate feed to UBA?

  • A: When you buy an app such as UBA, you need to configure it, so there is a little work involved in getting UBA and ES up and running. You may use the same feed for both, but it needs to be pointed to different locations so that both apps start working.

Q4: Do you personally feel it's better to align data models for the standardization factor (CIM mapping), or the raw events for the more robust dataset, to train your models for threat hunting? 

  • A: In certain scenarios, standardization is desirable. For example, if you want to get rid of PII, you want the data only in certain formats; if you understand how the data is coming in, you can anonymize it better and make sure that only clean data is fed to the model. But in general, the raw events are best.
  • I would say both of them. If you are a true data scientist and you have access to all the data, then you want to build your detection on top of raw data, because you want to capture as much information as possible, or any kind of signal, from broad data. When we standardize, we lose some information at some level, although it makes life easier. So it depends on your data science background, your time, and how many resources you want to spend on it.
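
The trade-off described in these answers can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the raw field names, the `FIELD_MAP` mapping, and the `normalize` and `pseudonymize` helpers are hypothetical, and the target names (`user`, `src`, `dest`, `action`) are used here simply as examples of common field names. The sketch shows that a fixed schema makes it easy to find and mask an identifier, while any raw field outside the mapping is lost to the model.

```python
import hashlib

# Hypothetical vendor-to-common field mapping (illustrative only).
FIELD_MAP = {
    "UserName": "user",
    "ClientIP": "src",
    "TargetHost": "dest",
    "Result": "action",
}


def normalize(raw_event, field_map=FIELD_MAP):
    """Rename mapped fields and report which raw fields were dropped."""
    normalized = {field_map[k]: v for k, v in raw_event.items() if k in field_map}
    dropped = sorted(k for k in raw_event if k not in field_map)
    return normalized, dropped


def pseudonymize(value, salt):
    """Replace an identifier with a salted hash prefix (pseudonymization only)."""
    return hashlib.sha256((salt + value).encode("utf-8")).hexdigest()[:12]


# Hypothetical raw authentication event.
raw = {
    "UserName": "alice",
    "ClientIP": "10.0.0.5",
    "TargetHost": "fileserver01",
    "Result": "success",
    "LogonType": "10",
    "ProcessName": "rdpclip.exe",
}

normalized, dropped = normalize(raw)
normalized["user"] = pseudonymize(normalized["user"], salt="example-salt")
print(normalized)
print("fields lost to standardization:", dropped)
```

Whether the dropped fields carry useful signal depends on the detection you are building, which is the judgment call the panel describes.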