Machine Learning Toolkit: Has anyone used this app with data exfiltration?

mcbradford
Contributor

Hello,

Has anyone used the Machine Learning Toolkit to detect data exfiltration (data exfil)? I would like to identify outliers in my email traffic. My data includes the message size, so I was hoping to use it to establish a baseline and alert on the outliers. Any thoughts on doing this with Splunk and/or the Machine Learning Toolkit?

1 Solution

rjthibod
Champion

I have not used it for this purpose, but using the Median Absolute Deviation algorithm (MAD) under the Outlier Detection set of tools might prove useful.

MAD is more robust than something like standard deviation, in part because it does not rely on a normal distribution assumption, and in part because the median is much less sensitive to a few extreme values than the mean is.
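To make the robustness point concrete, here is a minimal Python sketch of the idea. It is not the toolkit's implementation. The message sizes are made up, and the 0.6745 scaling constant and 3.5 cutoff are an assumed (though common) convention for a MAD-based "modified z-score".

```python
import statistics

def mad_outliers(values, threshold=3.5):
    """Flag values whose MAD-based modified z-score exceeds the threshold.

    Assumption: 0.6745 and 3.5 are conventional choices, not toolkit defaults.
    """
    med = statistics.median(values)
    mad = statistics.median([abs(v - med) for v in values])
    if mad == 0:
        return []  # no spread around the median; nothing can be scored
    return [v for v in values if 0.6745 * abs(v - med) / mad > threshold]

def stdev_outliers(values, threshold=3.0):
    """Flag values more than `threshold` standard deviations from the mean."""
    mean = statistics.mean(values)
    sd = statistics.pstdev(values)
    if sd == 0:
        return []
    return [v for v in values if abs(v - mean) / sd > threshold]

# Hypothetical message sizes in KB: mostly small, with two very large messages.
sizes_kb = [12, 15, 14, 13, 16, 15, 14, 13, 15, 9000, 12000]

print(mad_outliers(sizes_kb))    # [9000, 12000]
print(stdev_outliers(sizes_kb))  # []
```

The two large messages inflate the mean and standard deviation so much that neither one reaches three standard deviations, while the median and MAD barely move and both messages stand out.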

The tricky part is working out how to set up the model via fit so that your thresholds are determined per message type or per metadata field (e.g., source, sender, etc.). Once you decide which dimensions matter for differentiating message types, it should be pretty straightforward to use the toolkit to set the parameters for each population and then set up some saved searches that use apply.
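The fit-then-apply pattern with a separate baseline per population might look roughly like the Python sketch below. This only illustrates the logic and is not the toolkit's fit/apply commands. The function names, the `sender` and `size` field names, and the sample events are all hypothetical, and the 0.6745 / 3.5 values are an assumed convention.

```python
import statistics
from collections import defaultdict

def fit_baselines(events, key="sender", field="size"):
    """Learn a (median, MAD) baseline for each value of `key`."""
    groups = defaultdict(list)
    for event in events:
        groups[event[key]].append(event[field])
    baselines = {}
    for group, values in groups.items():
        med = statistics.median(values)
        mad = statistics.median([abs(v - med) for v in values])
        baselines[group] = (med, mad)
    return baselines

def apply_baselines(baselines, events, key="sender", field="size", threshold=3.5):
    """Return the events that are outliers relative to their own group's baseline."""
    flagged = []
    for event in events:
        if event[key] not in baselines:
            continue  # no history for this group; handle separately if needed
        med, mad = baselines[event[key]]
        if mad == 0:
            continue  # baseline has no spread; cannot score this group
        if 0.6745 * abs(event[field] - med) / mad > threshold:
            flagged.append(event)
    return flagged

# Hypothetical history: alice sends small messages, bob routinely sends large ones.
history = (
    [{"sender": "alice", "size": s} for s in (10, 12, 11, 13, 12)]
    + [{"sender": "bob", "size": s} for s in (5000, 5200, 4900, 5100, 5050)]
)
new_events = [
    {"sender": "alice", "size": 5000},  # unusual for alice
    {"sender": "bob", "size": 5100},    # normal for bob
    {"sender": "carol", "size": 20},    # unseen sender, skipped
]

baselines = fit_baselines(history)
flagged = apply_baselines(baselines, new_events)
print(flagged)  # [{'sender': 'alice', 'size': 5000}]
```

The same 5000 KB message is flagged for one sender and ignored for another, which is why the choice of grouping dimension matters more than the algorithm's parameters.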
