Splunk Search

what will it take for MULTIKV to format a table (1 header row, 1+ data rows) properly?

Splunk Employee
Splunk Employee

Such a helpful command, and yet doesn't work for me...

1 Solution

Splunk Employee
Splunk Employee
  • 1 header row
  • numeric data must be be right-justified (under header)
  • non-numeric data must be be either left-justified or right-justified (under header)
  • all rows (header & data) must have same # of columns -- MULTIKV abhors a sparse matrix.
    • If you don't have enough data values in a row, create dummy values like ? or n/a
    • If you have blanks inside a value, fill them in with _ (or some such); or put quotes around the value.

Let's say you have this input from PostgresQL:

 procpid |  usename   |         current_query          | waiting |          xact_start           |   client_addr   | client_port 
---------+------------+--------------------------------+---------+-------------------------------+-----------------+-------------
   19578 | areiser    | <IDLE>                         | f       |                               | 192.168.1.105   |       47736
     690 | txreadonly | <insufficient privilege>       |         |                               |                 |            
   27986 | forums     | <insufficient privilege>       |         |                               |                 |            
   19588 | areiser    | <IDLE>                         | f       |                               | 192.168.104.234 |       47738
   19591 | areiser    | select * from pg_stat_activity | f       | 2010-01-07 13:15:05.002492-06 |                 |          -1
(5 rows)

We apply a bit of awk(1), as:

BEGIN {
    FS = "|";
    OFS = " ";
}

# Skip lines which are neither header nor data
NR == 2 {next}
(NF==1) && ($1 ~ /[0-9]+ row/) {next}

{
    for (i=1; i<=NF; ++i) {
        fieldWidth = length($i);

        if (gsub("[\012\015]", "", $i) && (NR>1))
            ++fieldWidth;
        sub(" +$", "", $i);
        sub("^ +", "", $i);

        # can't have blanks within values -- fill with _'s 
        gsub(" ", "_", $i);

        # can't have blank cells in table -- replace with "<n/a>"
        $i = length($i) ? $i : "<n/a>";

        # right-justify everything
        format = sprintf("%%%ds", fieldWidth);
        $i = sprintf(format, $i);
    }
    print
}

,and obtain

  procpid      usename                    current_query   waiting                      xact_start       client_addr    client_port
    19578      areiser                           <IDLE>         f                           <n/a>     192.168.1.105          47736
      690   txreadonly         <insufficient_privilege>     <n/a>                           <n/a>             <n/a>          <n/a>
    27986       forums         <insufficient_privilege>     <n/a>                           <n/a>             <n/a>          <n/a>
    19588      areiser                           <IDLE>         f                           <n/a>   192.168.104.234          47738
    19591      areiser   select_*_from_pg_stat_activity         f   2010-01-07_13:15:05.002492-06             <n/a>             -1

, which MULTIKV will accept. To easier craft this general solution above, we assumed that all data might be numeric, and hence right-justified all columns.

View solution in original post

Splunk Employee
Splunk Employee
  • 1 header row
  • numeric data must be be right-justified (under header)
  • non-numeric data must be be either left-justified or right-justified (under header)
  • all rows (header & data) must have same # of columns -- MULTIKV abhors a sparse matrix.
    • If you don't have enough data values in a row, create dummy values like ? or n/a
    • If you have blanks inside a value, fill them in with _ (or some such); or put quotes around the value.

Let's say you have this input from PostgresQL:

 procpid |  usename   |         current_query          | waiting |          xact_start           |   client_addr   | client_port 
---------+------------+--------------------------------+---------+-------------------------------+-----------------+-------------
   19578 | areiser    | <IDLE>                         | f       |                               | 192.168.1.105   |       47736
     690 | txreadonly | <insufficient privilege>       |         |                               |                 |            
   27986 | forums     | <insufficient privilege>       |         |                               |                 |            
   19588 | areiser    | <IDLE>                         | f       |                               | 192.168.104.234 |       47738
   19591 | areiser    | select * from pg_stat_activity | f       | 2010-01-07 13:15:05.002492-06 |                 |          -1
(5 rows)

We apply a bit of awk(1), as:

BEGIN {
    FS = "|";
    OFS = " ";
}

# Skip lines which are neither header nor data
NR == 2 {next}
(NF==1) && ($1 ~ /[0-9]+ row/) {next}

{
    for (i=1; i<=NF; ++i) {
        fieldWidth = length($i);

        if (gsub("[\012\015]", "", $i) && (NR>1))
            ++fieldWidth;
        sub(" +$", "", $i);
        sub("^ +", "", $i);

        # can't have blanks within values -- fill with _'s 
        gsub(" ", "_", $i);

        # can't have blank cells in table -- replace with "<n/a>"
        $i = length($i) ? $i : "<n/a>";

        # right-justify everything
        format = sprintf("%%%ds", fieldWidth);
        $i = sprintf(format, $i);
    }
    print
}

,and obtain

  procpid      usename                    current_query   waiting                      xact_start       client_addr    client_port
    19578      areiser                           <IDLE>         f                           <n/a>     192.168.1.105          47736
      690   txreadonly         <insufficient_privilege>     <n/a>                           <n/a>             <n/a>          <n/a>
    27986       forums         <insufficient_privilege>     <n/a>                           <n/a>             <n/a>          <n/a>
    19588      areiser                           <IDLE>         f                           <n/a>   192.168.104.234          47738
    19591      areiser   select_*_from_pg_stat_activity         f   2010-01-07_13:15:05.002492-06             <n/a>             -1

, which MULTIKV will accept. To easier craft this general solution above, we assumed that all data might be numeric, and hence right-justified all columns.

View solution in original post

Splunk Employee
Splunk Employee

True. The pre-processing could be simplified if we [a] do some dirty work in SQL (ISNULL, string replacement, etc); [b] get Splunk to break on characters other than whitespace, | in this example; [c] have access to a top-notch SQL CLI (that would be, Oracle's SQL*Plus).

0 Karma

Splunk Employee
Splunk Employee

that's alotta pre-processing

0 Karma