We have a use case where we receive data from two different sources. Please note some key characteristics:
1. Our data volumes are low (around 5-10 GB per day).
2. Our searches will be very frequent, as they are dashboard-based (with auto-refresh) and many users will be running them constantly.
3. Our data is mostly unstructured: for one source we have to parse fields with regular expressions, and for the other we have to extract them from XML.
Since our requirement is fast searches so that users see results quickly (i.e. less data but more searches), we were thinking of extracting the fields at index time and persisting them, rather than doing search-time extraction. We have read online that in some cases index-time extraction can be faster, even though it increases the index size (capacity is not a problem for us) and can itself add to search time (as there is more indexed data to search).
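To make the trade-off we have in mind concrete, here is a minimal Python sketch of the two approaches. It is only a toy model of our reasoning, not how the product actually stores or searches data; the sample events, field names, and function names are all hypothetical.

```python
import re
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical sample events; our real formats and field names differ.
EVENTS = [
    ("regex", "2024-01-01 12:00:00 user=alice status=200"),
    ("regex", "2024-01-01 12:00:01 user=bob status=500"),
    ("xml", "<event><user>carol</user><status>200</status></event>"),
]

KV_PATTERN = re.compile(r"user=(?P<user>\S+) status=(?P<status>\d+)")


def extract_fields(kind, raw):
    """Parse one raw event into a dict of fields."""
    if kind == "xml":
        return {child.tag: child.text for child in ET.fromstring(raw)}
    match = KV_PATTERN.search(raw)
    return match.groupdict() if match else {}


def search_time_lookup(field, value):
    """Search-time extraction: every search re-parses every raw event."""
    return [raw for kind, raw in EVENTS
            if extract_fields(kind, raw).get(field) == value]


def build_index():
    """Index-time extraction: parse once at ingest and persist the fields."""
    index = defaultdict(list)
    for kind, raw in EVENTS:
        for field, value in extract_fields(kind, raw).items():
            index[(field, value)].append(raw)
    return index


def index_time_lookup(index, field, value):
    """Each search is a lookup; no parsing happens at search time."""
    return index.get((field, value), [])


INDEX = build_index()
```

Both lookups return the same events; the difference is that the first pays the parsing cost on every dashboard refresh, while the second pays it once at ingest and stores one extra index entry per distinct field/value pair.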
Could you please let me know whether the larger index, and the extra search work it causes, would offset the search-time speedup we expect to gain from index-time extraction?