I have a fairly complex query that ultimately outputs a large table with 23 fields and several dozen rows. Since the table is being populated from various data sourcetypes, I have an aggregate of all available fields from all sourcetypes, but all fields are not present in all sourcetypes. Assume there are 3 sourcetypes, they share 17 common fields, one has 3 unique, another 2, and the final sourcetype has 1 unique field. The resulting table will show the results, with all fields regardless if the original sourcetype contains them or not. Because of this, often one of those unique fields will be null across the entire column.
Let's assume for simplicity the data returned from the original query and output to the table before any dedup/null removal is:
| sourcetype | ip | hostname | os | status | department | |---------|---------|----------|---------|---------|------------| | type1 | value1 | value2 | null | value3 | null | | type2 | value1 | value2 | null | value3 | null | | type3 | value4 | value5 | value6 | value7 | null |
In this instance, I would like to dedup rows 2 and 3 (I ignore sourcetype in the dedup, but I check all other fields present in the final tabl) in to a single result, and remove column 6, but I run in to the following problems:
Also, as mentioned above, the results are fairly dynamic based on the data being searched for (this is being used on a dashboard with a query parameter input). Sometimes column 4 will be all null, or column 3. The solution needs to account for the dynamic nature of the results.
Happy for any suggestions on how to overcome this, or just a confirmation that this will not be possible and I will need to find a different approach.
Apologies if this doesn't make any sense, I am happy to clarify any points as needed.
I just had to deal with this same 'deduping' crushing results as well.. Did it with if() statements for any fields that were NULL at times:
<search> |eval os=if(ISNULL(os),"dummyValue",os)| eval department=if(ISNULL(department),"dummyValue",department) |dedup ip,hostname,status,department
and could then change the dummyValue back after the dedup if needed
How about this,
|stats max(*) as * by ip, hostname
Typically the by column should have fields which you consider as keys in the final result set. This would remove any columns which have all nulls.
This does not work. I need to dedup before running this operation, which means I am forced to use fillnull. I cannot dedup after because there is no way to predict which columns will be removed. Maybe if there's a way I can set the fields that are being fillnulle'd to - back to null this would work?