An approach would be to statistical model the rarity of a host value in time. For example, if normal host values occurred on an hourly basis, a rare host value would be one that occurred significantly less frequently that 'normal' host values.
host=* | prelertautodetect rare by host
would satisfy this requirement, and can be easily operationalised in real-time to avoid the scale issues above.
One way would be to nest a sub search within a parent search, where the sub search finds all MAC Address values from a previous time range and then the parent search finds all MAC address that are NOT in the current hour time range.
As an example, here is how you could do this technique with the host field, where you want to know which hosts are showing up in events today that were NOT showing up yesterday:
host=* earliest=-24h NOT [search host=* earliest=-48h latest=-24h | dedup host | table host]
Thanks, Maverick. Unfortunately, it seems like a very expensive search having to search across all our data because its for anomaly detection; we'd want to be alerted close to real-time so we'd have to run this frequently.