Splunk Search

FieldSelector - p-values higher than .05 in selected fields

jpawloski
Path Finder

I've recently begun exploring the FieldSelector command to better understand which fields are the best predictors for an ML model. During my research, I've gained what I think is a decent understanding of what makes a good predictor field, based largely on its p-value (anything below .05) and its score (the higher the better).

I've been running some tests and noticed that the fields selected by FieldSelector aren't the ones I would expect to be the optimal selection. I've pasted the fit command I'm using below:

|fit FieldSelector num from PC_* value_hashed_* type=numeric mode=k_best param=10 into combined_field_selector

 

Once this is run, I compare the output to the summary of the combined_field_selector model, which provides score and p-values for all the fields:

| summary combined_field_selector

 

One of the ten fields selected by FieldSelector was PC_2, with a score of .3293 and a p-value of .5661. Of the 132 fields passed to this fit command, PC_2 ranked 115th in score and had the 15th-highest p-value. That tells me it was not a good predictor for the model. On top of that, more than ten other fields had better score/p-value combinations.
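For reference, here is a minimal Python sketch of what I would expect a k_best selection to do. It assumes FieldSelector with type=numeric ranks each field by a univariate F-statistic against the target (the approach scikit-learn's f_regression takes) and keeps the k highest scores. The helper names f_score and k_best and the toy data are hypothetical, made up for illustration; this is not the MLTK implementation.

```python
import math

def f_score(xs, ys):
    """Univariate F-statistic for a simple linear regression of ys on xs.

    Computed from the squared Pearson correlation r^2 as
    F = r^2 / (1 - r^2) * (n - 2).
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant field (or target) carries no signal
    r2 = (sxy * sxy) / (sxx * syy)
    if r2 >= 1.0:
        return math.inf  # perfectly linear relationship
    return r2 / (1.0 - r2) * (n - 2)

def k_best(fields, target, k):
    """Return the k field names with the highest F-score, plus all scores."""
    scores = {name: f_score(values, target) for name, values in fields.items()}
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:k], scores

# Hypothetical toy data, not taken from the real dataset.
target = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
fields = {
    "PC_1": [1.1, 1.9, 3.2, 3.9, 5.1, 6.0],            # strong positive relation
    "PC_2": [3.0, 1.0, 4.0, 1.0, 5.0, 2.0],            # weak relation
    "value_hashed_1": [6.0, 5.0, 4.2, 3.1, 1.9, 1.0],  # strong negative relation
}

selected, scores = k_best(fields, target, k=2)
print(sorted(selected))
for name in sorted(scores, key=scores.get, reverse=True):
    print(name, round(scores[name], 4))
```

Under these assumptions, and with every field scored on the same number of rows, a higher F-score always corresponds to a lower p-value, so ranking by score and ranking by p-value should pick the same fields. A field ranked 115th by score would not make a top-10 cut in this sketch, which is why the PC_2 result looks wrong to me.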

I know this type of question falls in a no man's land between the underlying Python, the statistical algorithms, and Splunk, but Splunk is really my only means of applying ML to this data and troubleshooting the results. I'm hoping someone has a better understanding of what's going on and can explain why these fields are being selected.
