Dedup is perfectly fine on larger datasets too, given your requirements. Since you want to apply some logic on top of the dedup, the stats dc() and head commands are out of the picture here. Try writing two different queries that give the same results using the different approaches given below, then check in the Job Inspector which query is faster.
Using dedup on a larger dataset can be expensive. In some cases you can replace dedup with stats latest(..., a subsearch used as a filter, or something else. Whether dedup can be replaced, and if so with what, depends on your query requirements. Could you share a sample search showing how dedup is being used?
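For illustration only, here is a rough sketch of the kind of swap I mean. The index, sourcetype, and field names (main, access_combined, host, status) are made up for the example and are not from your search. The first query keeps the most recent event per host with dedup; the second gets a similar result with stats latest().

```
index=main sourcetype=access_combined
| dedup host
| table _time host status

index=main sourcetype=access_combined
| stats latest(_time) as _time latest(status) as status by host
```

The two are not always identical: dedup keeps one whole event per host, while latest(status) returns the most recent non-null status value, which could come from an older event if the newest one has no status field. Run both against your own data and compare them in the Job Inspector before settling on one.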
uhkc777 - Did the search query provided by somesoni2 help provide a working solution to your question? Please let me know when you can so that it can be converted to an answer. Thanks!