How to apply the fit command with a "by" field

tpirozzi

Explorer

11-11-2017
04:06 AM

Hi Everyone,

I have data like this in my lookup table:

| A | B | C |
|---|---|---|
| 10 | 2 | red |
| 4 | 6 | red |
| 9 | 1 | red |
| 110 | 102 | blue |
| 104 | 106 | blue |
| 109 | 101 | blue |

If I use the fit command with a by clause:

| inputlookup fitcommandexample.csv | fit KMeans k=2 "A" "B" by C

Results

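| A | B | C | cluster | cluster_distance |
|---|---|---|---|---|
| 10 | 2 | red | 1 | 6.44444444444 |
| 4 | 6 | red | 1 | 22.4444444444 |
| 9 | 1 | red | 1 | 5.77777777778 |
| 110 | 102 | blue | 0 | 6.44444444444 |
| 104 | 106 | blue | 0 | 22.4444444444 |
| 109 | 101 | blue | 0 | 5.77777777778 |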

But if I filter to a single value of C first:

| inputlookup fitcommandexample.csv | where C like "blue"| fit KMeans k=2 "A" "B"

Result

| A | B | C | cluster | cluster_distance |
|---|---|---|---|---|
| 110 | 102 | blue | 0 | 0.5 |
| 104 | 106 | blue | 1 | 0.0 |
| 109 | 101 | blue | 0 | 0.5 |

Likewise

| inputlookup fitcommandexample.csv | where C like "red"| fit KMeans k=2 "A" "B"

yields

| A | B | C | cluster | cluster_distance |
|---|---|---|---|---|
| 10 | 2 | red | 1 | 0.5 |
| 4 | 6 | red | 0 | 0.0 |
| 9 | 1 | red | 1 | 0.5 |

What I was hoping for was that the by clause would make the fit command fit each subset (red and blue) in isolation, so that this search would yield the result below:

| inputlookup fitcommandexample.csv | fit KMeans k=2 "A" "B" by C

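| A | B | C | cluster | cluster_distance |
|---|---|---|---|---|
| 10 | 2 | red | 1 | 0.5 |
| 4 | 6 | red | 0 | 0.0 |
| 9 | 1 | red | 1 | 0.5 |
| 110 | 102 | blue | 0 | 0.5 |
| 104 | 106 | blue | 1 | 0.0 |
| 109 | 101 | blue | 0 | 0.5 |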

Instead, in the actual result blue and red were essentially the two separate clusters. Otherwise, I am not sure how to quickly break up the data and apply fit to each subset without writing an external script via the API. Any ideas?

Thanks

Tim

Re: How to apply the fit command with a "by" field

cmerriman

Super Champion

11-14-2017
05:27 AM

The map command might be your only option, as there isn't a `by` clause for clustering.

```
|makeresults |eval data="C=red C=blue"|makemv data|mvexpand data|rename data as _raw|kv|table C
|map maxsearches=6 search="|makeresults |eval data=\"A=10,B=2,C=red A=4,B=6,C=red A=9,B=1,C=red A=110,B=102,C=blue A=104,B=106,C=blue A=109,B=101,C=blue\"|makemv data|mvexpand data|rename data as _raw|kv|search C=$C$|table A B C|fit KMeans k=2 A B"
```
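
If the data already lives in the lookup from the question, the same map pattern can probably read it directly instead of rebuilding the rows with makeresults. This is a minimal, untested sketch: it assumes the lookup is fitcommandexample.csv with fields A, B, and C as in the question, and the maxsearches value is an arbitrary cap chosen for illustration.

```
| inputlookup fitcommandexample.csv | stats count by C | fields C
| map maxsearches=10 search="| inputlookup fitcommandexample.csv | where C=\"$C$\" | fit KMeans k=2 A B"
```

The outer search produces one row per distinct value of C, and map runs the inner search once per row with that value substituted for $C$, so each KMeans fit only sees one color. maxsearches needs to be at least the number of distinct values of C, or the remaining subsets are skipped.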