
How to apply the fit command with a "by" field

Explorer

Hi Everyone,

I have data like this in my lookup table (fitcommandexample.csv):

Fields:

A   | B   | C
10  | 2   | red
4   | 6   | red
9   | 1   | red
110 | 102 | blue
104 | 106 | blue
109 | 101 | blue

If I use the fit command with a by clause:
| inputlookup fitcommandexample.csv | fit KMeans k=2 "A" "B" by C

Results
A B C cluster cluster_distance

10 2 red 1 6.44444444444
4 6 red 1 22.4444444444
9 1 red 1 5.77777777778
110 102 blue 0 6.44444444444
104 106 blue 0 22.4444444444
109 101 blue 0 5.77777777778

But

| inputlookup fitcommandexample.csv | where C like "blue"| fit KMeans k=2 "A" "B"

Result

A B C cluster cluster_distance

110 102 blue 0 0.5
104 106 blue 1 0.0
109 101 blue 0 0.5

Likewise

| inputlookup fitcommandexample.csv | where C like "red"| fit KMeans k=2 "A" "B"

yields

A B C cluster cluster_distance

10 2 red 1 0.5
4 6 red 0 0.0
9 1 red 1 0.5

I was hoping the by clause would make the fit command fit each subset (red and blue) in isolation, so that this search would return the results below:
| inputlookup fitcommandexample.csv | fit KMeans k=2 "A" "B" by C

A B C cluster cluster_distance

10 2 red 1 0.5
4 6 red 0 0.0
9 1 red 1 0.5
110 102 blue 0 0.5
104 106 blue 1 0.0
109 101 blue 0 0.5

Instead, blue and red were each treated as a single cluster. Otherwise, I am not sure how to quickly break up the data and apply fit to the subsets without writing an external script against the API. Any ideas?

Thanks

Tim


Super Champion

The map command might be your only option, as there isn't a by clause for clustering.

|makeresults |eval data="C=red C=blue"|makemv data|mvexpand data|rename data as _raw|kv|table C
|map maxsearches=6 search="|makeresults |eval data=\"A=10,B=2,C=red A=4,B=6,C=red A=9,B=1,C=red A=110,B=102,C=blue A=104,B=106,C=blue A=109,B=101,C=blue\"|makemv data|mvexpand data|rename data as _raw|kv|search C=$C$|table A B C|fit KMeans k=2 A B"
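
For anyone comparing the two sets of numbers in the question, here is a small check in plain Python (not SPL). It assumes cluster_distance is the squared Euclidean distance from a row to its cluster's centroid. The thread does not state this, but the posted values are consistent with it. The helper names are made up for illustration.

```python
# Assumption: cluster_distance = squared Euclidean distance to the centroid.
# The thread does not confirm this; the posted numbers merely agree with it.

def centroid(points):
    """Mean of a list of equal-length numeric tuples."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))

def squared_distance(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

# The red rows (A, B) from the lookup table in the question
red = [(10, 2), (4, 6), (9, 1)]

# Case 1: all red rows share one cluster, as in the "by C" output
whole = centroid(red)
one_cluster = [squared_distance(p, whole) for p in red]

# Case 2: k=2 inside red only: (10, 2) and (9, 1) together, (4, 6) alone
pair = centroid([red[0], red[2]])
two_clusters = [
    squared_distance(red[0], pair),
    0.0,  # a single-row cluster sits exactly on its own centroid
    squared_distance(red[2], pair),
]

print([round(d, 4) for d in one_cluster])  # [6.4444, 22.4444, 5.7778]
print(two_clusters)                        # [0.5, 0.0, 0.5]
```

The first list matches the cluster_distance column from the search with "by C", and the second matches the filtered search. Under that assumption, the "by C" run put each color into one cluster instead of finding two clusters within each color. The blue rows give the same values, since they are the red rows shifted by 100 in both A and B.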