As @yuanliu points out, under certain circumstances the following two searches are functionally the same:

index=Test field1 field2 field3

index=Test "field1"="*" "field2"="*" "field3"="*"

However, from Splunk's point of view they are very different. In the first case, the search looks for the literal TEXT 'field1' (or 'field2'/'field3') anywhere in the _raw event, whereas in the second it looks for an extracted field called field1 that has some value. Consider these two example _raw events:

2023-08-31T08:00:00 field1="Hello"
2023-08-31T08:00:00 Hello="field1"

A search for the text field1 will find both events, whereas a search for field1=* will find only the FIRST event, because that is the only one with an extracted field named field1. Here's an example to demonstrate:

| makeresults
| eval x=split("2023-08-31T08:00:00 field1=\"Hello\",2023-08-31T08:00:00 Hello=\"field1\"", ",")
| mvexpand x
| eval _time=strptime(x,"%FT%T")
| rename x as _raw
| extract
| search field1

This finds both events, but if you change the last line to search field1=* you will get only one event.

As for validating your data, you clearly cannot go through 24m events one by one, so you would have to do aggregations and check the numbers, and you can only validate if you know what to expect.

Making all those wildcard searches is not particularly performant, and since the search matches most events, you may want to turn it into a NOT search:

index=o365 NOT (f1=* f2=*...)

which should return the 1k events that are not found.

Don't forget that Splunk is returning you _raw events and doing field extraction, so when you say you only want 40 fields, then just deal with all the events and, after doing any data processing you need, exclude the events you don't want by filtering at a later point in the Splunk pipeline. For example, if the events you do NOT want do not have an Operation field, then

| stats count by Operation

will drop the events that don't have the Operation field anyway, and would be much faster than your complex wildcard search.
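If it helps with the "check the numbers" part, here is a rough sketch of one way to compare the total event count with the number of events that actually carry an Operation field. The index and field name are the ones from your question; total_events, events_with_operation and events_without_operation are just output names I picked for illustration, so rename them as you like.

```spl
index=o365
| stats count AS total_events, count(eval(isnotnull(Operation))) AS events_with_operation
| eval events_without_operation = total_events - events_with_operation
```

If events_without_operation comes out close to the 1k you expect, that is a reasonable sign the later stats by Operation is excluding the right events. I have not run this against your data, so treat it as a starting point rather than a definitive check.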