I am working with a large data set covering more than 8 million devices. I am trying to get a distinct count of devices by version number. Unfortunately, the query double-counts devices, because the same device can appear under more than one version within the search window.
For example, the data may look like this at 9 am:
Device: A Version 1
Device: B Version 1
Device: C Version 1
Device: D Version 1
Device: E Version 1
but on a deployment day, by 3 pm, it may look like this:
Device: A Version 2
Device: B Version 2
Device: C Version 1
Device: D Version 1
Device: E Version 1
So, my dc(device) by version over 24 hours returns:
Version 1: 5 devices
Version 2: 2 devices
That is a total of 7 devices, even though there are actually only 5.
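To make the double count concrete, here is a minimal Python sketch (not SPL) of the same example. The event tuples and function names are hypothetical, and counting each device only under its most recent version is an assumption about the result I am after:

```python
from collections import defaultdict

# Hypothetical sample events (time, device, version) mirroring the example above.
events = [
    ("09:00", "A", 1), ("09:00", "B", 1), ("09:00", "C", 1),
    ("09:00", "D", 1), ("09:00", "E", 1),
    ("15:00", "A", 2), ("15:00", "B", 2), ("15:00", "C", 1),
    ("15:00", "D", 1), ("15:00", "E", 1),
]

def distinct_devices_by_version(rows):
    """Equivalent in spirit to a distinct count of device grouped by version."""
    seen = defaultdict(set)
    for _, device, version in rows:
        seen[version].add(device)
    return {version: len(devices) for version, devices in seen.items()}

def latest_version_per_device(rows):
    """Keep only the most recent event for each device (assumed desired behavior)."""
    latest = {}
    for time, device, version in sorted(rows):  # sorted by time, so later events overwrite earlier ones
        latest[device] = (time, version)
    return [(time, device, version) for device, (time, version) in latest.items()]

# Counting over all events: devices A and B are counted under both versions.
naive = distinct_devices_by_version(events)

# Counting after reducing each device to its latest version.
deduped = distinct_devices_by_version(latest_version_per_device(events))

print(naive, sum(naive.values()))
print(deduped, sum(deduped.values()))
```

The first count totals 7 because A and B fall under both versions; the second totals 5 because each device contributes exactly once.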
Without using dedup, how do I eliminate those duplicates?