Splunk Search

How to apply the fit command with a "by" field

tpirozzi
Explorer

Hi Everyone,

So I have data like this in my lookup table

fields

A | B | C

10| 2 | red
4 | 6 | red
9 | 1 | red
110| 102 | blue
104 | 106 | blue
109 | 101 | blue

So if I use the fit command
| inputlookup fitcommandexample.csv | fit KMeans k=2 "A" "B" by C

Results
A B C cluster cluster_distance

10 2 red 1 6.44444444444
4 6 red 1 22.4444444444
9 1 red 1 5.77777777778
110 102 blue 0 6.44444444444
104 106 blue 0 22.4444444444
109 101 blue 0 5.77777777778

But

| inputlookup fitcommandexample.csv | where C like "blue"| fit KMeans k=2 "A" "B"

Result

A B C cluster cluster_distance

110 102 blue 0 0.5
104 106 blue 1 0.0
109 101 blue 0 0.5

Likewise

| inputlookup fitcommandexample.csv | where C like "red"| fit KMeans k=2 "A" "B"

yields

A B C cluster cluster_distance

10 2 red 1 0.5
4 6 red 0 0.0
9 1 red 1 0.5

So what I was hoping for was that the by clause would make the fit command fit to each of the subsets red and blue in isolation such that the result yielded
| inputlookup fitcommandexample.csv | fit KMeans k=2 "A" "B" by C

A B C cluster cluster_distance

10 2 red 1 0.5
4 6 red 0 0.0
9 1 red 1 0.5
110 102 blue 0 0.5
104 106 blue 1 0.0
109 101 blue 0 0.5

blue and red were essentially separate clusters other wise I am not sure how to quickly break up the data and apply fit to the subsets without writing and external script via API. Any ideas?

Thanks

Tim

Tags (1)
0 Karma

cmerriman
Super Champion

the map command might be your only option, as there isn't a by command for clustering.

|makeresults |eval data="C=red C=blue"|makemv data|mvexpand data|rename data as _raw|kv|table C
|map maxsearches=6 search="|makeresults |eval data=\"A=10,B=2,C=red A=4,B=6,C=red A=9,B=1,C=red A=110,B=102,C=blue A=104,B=106,C=blue A=109,B=101,C=blue\"|makemv data|mvexpand data|rename data as _raw|kv|search C=$C$|table A B C|fit KMeans k=2 A B"
Get Updates on the Splunk Community!

Extending Observability Content to Splunk Cloud

Watch Now!   In this Extending Observability Content to Splunk Cloud Tech Talk, you'll see how to leverage ...

More Control Over Your Monitoring Costs with Archived Metrics!

What if there was a way you could keep all the metrics data you need while saving on storage costs?This is now ...

New in Observability Cloud - Explicit Bucket Histograms

Splunk introduces native support for histograms as a metric data type within Observability Cloud with Explicit ...