I'm trying to use the OneClassSVM algorithm (thank you, @cmerriman!) to detect outliers in the reactionTime field of my data. As best I can tell from the information on scikit-learn.org, OneClassSVM is a novelty detection algorithm, meaning that when I use the "fit" command, it will determine a boundary that fits around most, if not all, of the data I've given it and deem those data points "normal." When I do so, however, 68% of my data ends up being marked "abnormal."
Here's the SPL I'm using:
index=xxx source="xxx" reactionTime user=lradics
| where reactionTime < 10000
| where reactionTime > 300
| dedup ID
| fit OneClassSVM reactionTime into rxn_time_model
| table isNormal, reactionTime
I don't have much experience with this sort of thing, so I suspect it's user error, but I can't see where I went wrong. Is my understanding of the algorithm's behavior correct? Can anyone point me to what I should change?
Thank you! I ended up switching the kernel to linear and making nu much smaller (0.0001), and that worked. I'm curious why altering nu didn't affect the results I got with the default kernel... I'll read up on it some more 🙂
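One possible explanation for the nu question, offered as a sketch rather than a confirmed diagnosis: if the default kernel is RBF and gamma is around 1 (both are assumptions about the defaults here), then raw reaction times in milliseconds are so far apart that the kernel treats nearly every pair of samples as unrelated. In that situation the boundary has little structure to work with, and nu may have little visible effect. The plain-Python snippet below only illustrates the kernel arithmetic; the sample values are invented, and it does not reproduce the actual fit.

```python
import math
import statistics


def rbf(a, b, gamma):
    """RBF kernel value exp(-gamma * (a - b)^2) for two scalar samples."""
    return math.exp(-gamma * (a - b) ** 2)


# Hypothetical reaction times in milliseconds, all inside the 300-10000
# window kept by the search. These numbers are made up for illustration.
times = [450.0, 470.0, 520.0, 610.0, 800.0, 1200.0, 2500.0, 9000.0]

# Assumption: gamma = 1 / n_features, which is 1.0 for a single field.
gamma = 1.0

# Similarity of two samples only 20 ms apart, on the raw scale.
raw_sim = rbf(times[0], times[1], gamma)

# The same two samples after standardizing to zero mean, unit variance.
mean = statistics.mean(times)
std = statistics.pstdev(times)
scaled = [(t - mean) / std for t in times]
scaled_sim = rbf(scaled[0], scaled[1], gamma)

print(f"raw similarity:    {raw_sim:.3e}")
print(f"scaled similarity: {scaled_sim:.6f}")
```

On the raw scale the two nearby samples come out with a similarity that is effectively zero, while after standardizing they look almost identical. If this is what is happening, standardizing reactionTime before the fit (for example with a scaler such as StandardScaler) or choosing a much smaller gamma might make the default kernel respond to nu; that is a guess to test, not an established fix.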