Is it possible to use the cross-validation in the Machine Learning Toolkit and Showcase app?

nnetz
New Member

Hello,

Is it possible to use the cross-validation in the Machine Learning Toolkit and Showcase app?

aljohnson_splun
Splunk Employee

The assistants for Predict Numeric Fields and Predict Categorical Fields do 2-fold cross validation for you, automatically. You can select the train-test ratio of your choosing.

melonman
Motivator

Well, if you are looking for automated cross validation, or a single command that performs cross validation, the answer is probably no at the moment.

Here is what I do for now.

For example, for K-fold cross validation with K=5, you can split your data into 5 partitions using the sample command.

... search to get your dataset | sample partitions=5

This adds a partition_number field to the dataset, so you can filter on that number to select part of the data.
Then use partition 0 (1/5 of the data) to create the model (train) and the rest of the data to test.

... search to get your dataset | sample partitions=5 | where partition_number=0 | fit ... into your_model | ..

And test with the rest:

... search to get your dataset | sample partitions=5 | where partition_number!=0 | apply your_model | ..

Then calculate the errors and consolidate the results from each validation run.

Maybe you can automate this with other Splunk scheduling features (scheduled searches, a summary index, and a dashboard).

gabrcg
New Member

To apply k-fold cross validation (using 5 folds as in the above example), you should train with 4 folds and then test with 1 fold. The code example above does the opposite. So, it should be:

Train with 4 folds

| sample partitions=5 seed=1 | where partition_number!=0 | fit ... into your_model

Test with 1 fold

| sample partitions=5 seed=1 | where partition_number=0 | apply your_model
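
To get a full 5-fold result you repeat the two searches above once per fold, holding out a different partition_number each time, and then average the per-fold errors. Here is a rough sketch of that loop in plain Python, outside Splunk, just to show the logic. The function names (assign_partitions, fit_line, k_fold_errors), the toy data, and the simple straight-line model are all made up for illustration; they are not MLTK commands, and assigning each row a random partition is only an assumption about how sample behaves.

```python
import random
from statistics import mean


def assign_partitions(n_rows, k, seed):
    """Give each row a partition number in 0..k-1, reproducibly for a given seed."""
    rng = random.Random(seed)
    return [rng.randrange(k) for _ in range(n_rows)]


def fit_line(pairs):
    """Least-squares fit of y = slope * x + intercept (stand-in for `fit`)."""
    xs = [x for x, _ in pairs]
    ys = [y for _, y in pairs]
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in pairs) / sxx if sxx else 0.0
    return slope, my - slope * mx


def k_fold_errors(rows, k=5, seed=1):
    """Train on k-1 partitions, test on the held-out one, for every partition."""
    parts = assign_partitions(len(rows), k, seed)
    fold_errors = []
    for held_out in range(k):
        train = [r for r, p in zip(rows, parts) if p != held_out]  # 4 folds
        test = [r for r, p in zip(rows, parts) if p == held_out]   # 1 fold
        if not train or not test:
            continue  # skip a fold that happened to end up empty
        slope, intercept = fit_line(train)
        # Mean absolute error on the held-out fold (stand-in for `apply` + scoring).
        fold_errors.append(mean(abs(y - (slope * x + intercept)) for x, y in test))
    return fold_errors


# Toy dataset: y is an exact linear function of x.
rows = [(x, 2.0 * x + 1.0) for x in range(50)]
errors = k_fold_errors(rows, k=5, seed=1)
overall = mean(errors)  # consolidated cross-validation error
```

In Splunk terms, each pass of the loop corresponds to one fit search with partition_number!=N and one apply search with partition_number=N, both using the same seed.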

aljohnson_splun
Splunk Employee

Make sure you set a seed in the sample! E.g.

| sample partitions=5 seed=42
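
The reason the seed matters: the training search and the test search each run sample separately, so the partition numbers only line up if both runs produce the same assignment. A small Python sketch of the effect, outside Splunk (the helper names partition_numbers and split_rows are hypothetical, and the random assignment is only a stand-in for whatever sample does internally):

```python
import random


def partition_numbers(n_rows, k, seed):
    """Assign each row a partition number in 0..k-1 using the given seed."""
    rng = random.Random(seed)
    return [rng.randrange(k) for _ in range(n_rows)]


def split_rows(n_rows, k, held_out, train_seed, test_seed):
    """Mimic two separate searches: one selects train rows, one selects test rows."""
    train_parts = partition_numbers(n_rows, k, train_seed)
    test_parts = partition_numbers(n_rows, k, test_seed)
    train = {i for i, p in enumerate(train_parts) if p != held_out}
    test = {i for i, p in enumerate(test_parts) if p == held_out}
    return train, test


# Same seed in both searches: train and test never share a row.
train_same, test_same = split_rows(1000, 5, 0, train_seed=42, test_seed=42)
leaked_same = len(train_same & test_same)  # 0

# Different seeds: some test rows were also used for training.
train_diff, test_diff = split_rows(1000, 5, 0, train_seed=42, test_seed=7)
leaked_diff = len(train_diff & test_diff)
```

With mismatched seeds (or no seed at all), rows leak from the training set into the test set, which makes the measured error look better than it really is.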