I know you can coalesce multiple columns to merge them into one. However, I am currently coalescing around 8 fields, some of which have null values. Because the last field I am including is sparse (only appears in 3% of the logs), I have found that the coalesced field returns as mostly null (matching the last coalesced field). A demonstration of what I'd like to do is below (A_col is first to coalesce then B_col to make C_col and B_col overwrites A_col if the value in B_col is not null).
Here is the query format I am using
... | eval name = coalesce(entityName, individualName) | ...
A demonstration of what I'd like to do (C is a coalesced field of A and B):
A_col . B_col . C_col
A Null . A
Null . B . B
A . B . B
The Null
on your output is actual Splunk's null/blank value or a literal "Null" string? Assuming it's former, specify the 2nd column first in the coalesce command.
| eval C_col=coalesce(B_col, A_col)
That way if B_col is available that will be used, else A_col will be used.
@somesoni2 No it means that that log file doesn't have that field. For, example the field transaction_id may be a non-nested key or it could be something like payload.response.transaction_id which is nested. I basically want to melt that into one column, but am having problems with null overwriting the current value. You suggested that I order the columns in increasing sparseness? (i.e. coalesce(4% sparse,20% sparse, 80% sparse))
Do you have control of the field extractions? If so, why not change the extractions in order to ignore the nulls?
I believe this should work
| eval C_col =coalesce(A_col, B_col, C_col)
@ghantk1 Didn't do it for me, null values are still overwriting if in the last column I am coalescing on ( | eval C_col =coalesce(A_col, B_col, C_col) command makes B_col nulls overwrite A_col non-null values)