Splunk Search

Removing null columns from a table

hogan24
Path Finder

I've found some previous posts with similar questions but the results dont seem to be correct so I'm hoping someone can help with this one...

I have something like this:
mysearch | table column1 column2 column3

With results like this:
column1 --- column2 --- column 3
asdfasdf --- 123456
bsdfasdf --- 234567
csdfasdf --- 345678

Sometimes column3 will have data, sometimes it wont. How (after piping to table) can I remove column3 if it has no data (and preferably keep the columns in order as well)? Thanks.

nick405060
Motivator

Bumping this.

0 Karma

jadomaai
Engager

There's a simple way to remove columns with all null values using foreach:

if you have cols a, b and c and c has all NULL values:

| makeresults 
| eval a=10
| eval b=20
| eval c=30
| append 
    [
    | makeresults
    | eval a=100
    | eval b=200
    | eval c=300
    ]
| append 
    [
    | makeresults
    | eval a=1000
    | eval b=2000
    | eval c=3000
    ]
| table a b c
| eval c=NULL
| rename * AS X_*_NEW
| foreach X_*_NEW 
    [ eval <<MATCHSTR>>=<<FIELD>> ]
| fields - X*

will leave cols a and b, as during the eval Splunk removes any col which is all NULL values.

nick405060
Motivator

This also reorders your results

0 Karma

jadomaai
Engager

Fair point about the order. Hadn't picked up on that constraint. Can't find a way to preserve the order when running a search in the search editor, but have a solution for a dashboard context by virtue of using a token to preserve the desired order. Arguably, this shouldn't work, as we're effectively 're-creating' the lost columns by running the tokenized fields command (try opening the second panel's search in the search editor to see what I mean)... but for some reason this works in the context of the dashboard. For anyone who doesn't have access to _internal index, substitute your own search and modify the relevant commands in the base search to make the following work

<dashboard>
  <label>remove_null_cols_and_keep_column_order</label>
  <search id="baseSearch">
    <query>
      index=_internal sourcetype=scheduler earliest=-1m
    | table _time host user concurrency_category status scheduled_time run_time result_count
    | eval user = NULL
    | eval scheduled_time = NULL
    | eval _flds = "| fields _time host user concurrency_category status scheduled_time run_time result_count"
    </query>
    <done>
      <condition match="$job.resultCount$!=0">
        <set token="displayFieldsInOrder_tok">$result._flds$</set>
      </condition>
    </done>
    <sampleRatio>1</sampleRatio>
  </search>
  <row>
    <panel>
      <title>Original table of results showing all columns including 'user' and 'scheduled_time' which are set to all NULL values</title>
      <table>
        <search base="baseSearch"></search>
        <option name="count">5</option>
        <option name="dataOverlayMode">none</option>
        <option name="drilldown">none</option>
        <option name="percentagesRow">false</option>
        <option name="rowNumbers">false</option>
        <option name="totalsRow">false</option>
        <option name="wrap">false</option>
      </table>
    </panel>
  </row>
  <row>
    <panel>
      <title>Results after removing NULL cols</title>
      <table>
        <search base="baseSearch">
          <query>
            <![CDATA[
               | rename * AS X_*_NEW
               | foreach X_*_NEW 
               [ eval <<MATCHSTR>>=<<FIELD>> ]
                 $displayFieldsInOrder_tok$
               ]]>
          </query>
        </search>
        <option name="count">5</option>
        <option name="dataOverlayMode">none</option>
        <option name="drilldown">none</option>
        <option name="percentagesRow">false</option>
        <option name="rowNumbers">false</option>
        <option name="totalsRow">false</option>
        <option name="wrap">false</option>
      </table>
    </panel>
  </row>
</dashboard>
0 Karma

to4kawa
Ultra Champion
<dashboard>
  <label>remove_null_column</label>
  <search id="base_search">
    <query>| makeresults 
 | eval a=10
 | eval b=20
 | eval c=30
 | append 
     [
     | makeresults
     | eval a=100
     | eval b=200
     | eval c=300
     ]
 | append 
     [
     | makeresults
     | eval a=1000
     | eval b=2000
     | eval c=3000
     ]
 | table a b c
 | eval c=NULL</query>
    <earliest>-15m</earliest>
    <latest>now</latest>
    <sampleRatio>1</sampleRatio>
  </search>
  <row>
    <panel depends="$allwaysInvisible$">
      <table>
        <search base="base_search">
          <query> | appendpipe  [ transpose 0
| rename "row "* as t*
| foreach t* 
    [eval tmp="".<<FIELD>>]
    | where tmp!=""]
| eval fields_list=mvjoin(fields_list,",")
| stats values(fields_list) as fields_list</query>
          <progress>
            <condition match="$job.resultCount$!=0">
              <set token="fields_list">$result.fields_list$</set>
            </condition>
          </progress>
        </search>
        <option name="count">20</option>
      </table>
    </panel>
  </row>
  <row>
    <panel>
      <title>contains NULL values</title>
      <table>
        <search base="base_search">
          <query/>
        </search>
        <option name="count">20</option>
      </table>
    </panel>
  </row>
  <row>
    <panel>
      <title>removes NULL values</title>
      <table>
        <search base="base_search">
          <query>| table $fields_list$</query>
        </search>
        <option name="count">20</option>
      </table>
    </panel>
  </row>
</dashboard>

In the case of a dashboard, you can pass it as an argument to a field.
Now you can control table .

0 Karma

nick405060
Motivator

Hacky. Love it.

However, I would use progress and not done here. Even using progress, there is unfortunately some delay between clicking the submit button and having the "search is waiting for input" message going away. I had to put a note in my dashboard to my users "note: search takes some time to begin after submitting". However, using done it takes much longer for the "search is waiting for input" message to go away.

This should first of all be an answer not a comment, if not the accepted answer. If there was some way to take away the time delay this would be the perfect solution.

0 Karma

nick405060
Motivator

Idk, I know it's possible that there's something else in my code that is a compounding variable here, but I spent my entire day trying to adapt this code and I do know that $displayFieldsInOrder_tok$ is causal for the issue, and that the "search is waiting for input" message doesn't change until the base search is complete. Your base search for the example code is much faster since it's just _internal for the last minute

0 Karma

jonasmeier
Explorer

Here's a solution that eliminates empty columns AND keeps the order of the columns. As table is often used to define the order of fields, keeping it is essential. Tested about a dozen possibilities and the only generic one I found (where you don't have to know the column names) is using transpose, which eliminates empty columns when creating rows of it:
`| makeresults count=3
| rename comment as "generate some test data"
| streamstats count
| eval f2=if(count=2,"YYY",null), f3=if(count=3,"ZZZ",""),f5="XXX",f6=null
| rename comment as "f1, f4 and f6 have no values in any row"
| table f6 f5 f4 f3 f1 f2 count

| rename comment as "now the transpose magic"
| transpose
| transpose header_field=column
| fields - column`

jonasmeier
Explorer

Use | transpose 0, which indicates that the number of rows to transpose is unlimited. (and be carefull with event counts > 10000, it trashed my browser)

nick405060
Motivator

This approach does not usually work, for example the below. I have tried it with multiple searches and it usually does not work. You can move the 0 to the other transpose or do it for both, and it either does not remove null columns, or it also removes columns that have data.

index=wineventlog user="me" | streamstats count | table _time source EventCode EventCodeDescription user host src Source_Address Source_Network_Address Network_Address Destination_Address Keywords Application_Name Process_Name | transpose | transpose 0 header_field=column | fields - column
0 Karma

nick405060
Motivator

Also: one of my searches where this did work yesterday, today does not work. Yesterday it removed the null columns and today it does not. Why would this be, besides transpose clearly being at fault?

0 Karma

jonasmeier
Explorer

Reliably worked for me, from about version 6.0 up to 8. For sure you need transpose 0 at both invocations, otherwise you will get only the first 5 events.
I did some more testing with Win eventlog data: Result as expected (as long as there are not to many events, otherwise it will block the browser tab, because it is so slow).
Some fields i.e. Source_Network_Address appear to be empty bu do net get removed. As a matter of fact it contained a space, which you can only see this by dragging the mouse over the values!

But I also like your wildcard trick with the table command a lot and for sure it is much faster.
Big advantage of transpose is that you do not have to know the field names, therefore it would be much easier to use as a macro, but for a high price... So I think I will also use wildcards next time.

0 Karma

nick405060
Motivator

I did previously see that Source_Network_Address, for us also, contains a space. The issues I was having did unfortunately occur even for other null fields that did not contain spaces and were truly null.

0 Karma

nick405060
Motivator

This solution does not work for every possibility. e.g. I am tabling winevent data for multiple eventcodes. It seems that if a field has null data for just one data point, when the rest are populated, then that entire field is not displayed. Have tried this every way I can and it still excludes multiple fields of mine that contain data.

0 Karma

romanwaldecker
Path Finder

Awesome answer - simple and works very well! Should be accepted as the right answer imho.
The only downside to consider: The transpose commands demand a lot of computing power = time.

0 Karma

pj
Contributor
mysearch | table column1 column2 column3
| streamstats count as temp_count
| stats values(*) as * by temp_count
| fields - temp_count

The streamstats gives each row a unique number
The stats values will remove null columns

nick405060
Motivator

A second major issue with your approach is that it removes any _* fields, e.g. _time.

However, I believe I have finally solved this question. Here is how you remove null columns and keep the order

 mysearch | (rename any _* fields) | table column1 column2 column3
 | streamstats count as temp_count
 | stats values(*) as * by temp_count
 | fields - temp_count
 | table column1* column2* column3*

If you take away the stars in the last table statement, then all columns get tabled again regardless of whether they have null values or not. If you use the stars, then only null columns are selected and the order is kept.

0 Karma

nick405060
Motivator

This reorders your columns so in a dashboard it is not much of a solution 😕

0 Karma

pj
Contributor

To you perhaps, but the question only expressed a preference to keep the order, it wasn't mandated.

0 Karma

mrobichaud_splu
Splunk Employee
Splunk Employee

This works...

 your search | fields column1 column2 column3 | table * | fields - _*

However, do you really need to pipe to table?

If you just show events and format it as a table, then the extra column would disappear. In the search, you can select "Table" on the events tab. Unfortunately, if you save this as a dashboard panel the settings are getting lost. In SimpleXML, here are the settings you need:

<panel>
  <event>
    <search>
      <query>your search</query>
    </search>
    <option name="count">10</option>
    <option name="rowNumbers">0</option>
    <option name="type">table</option>
    <fields>column1, column2, column3</fields>
  </event>
</panel>

Note, this method would also include _time.

0 Karma
Get Updates on the Splunk Community!

What's New in Splunk Enterprise 9.4: Features to Power Your Digital Resilience

Hey Splunky People! We are excited to share the latest updates in Splunk Enterprise 9.4. In this release we ...

Take Your Breath Away with Splunk Risk-Based Alerting (RBA)

WATCH NOW!The Splunk Guide to Risk-Based Alerting is here to empower your SOC like never before. Join Haylee ...

SignalFlow: What? Why? How?

What is SignalFlow? Splunk Observability Cloud’s analytics engine, SignalFlow, opens up a world of in-depth ...