Re: Flatten Search Results

shinglau · ‎08-10-2017

I am currently trying to use the Splunk REST API to extract a heap of data.
I have written a search query:

(index=* OR index=_*) (index="myindex") | table "order-no", "order-date", "price"

But it provides a result set which looks like this:

"Order-no", "order-date", "price","quantity"
1        ,       20/05/2017        , 1.0, 1
--------------------, 2.0, 1
---------------------,3.0, 2

It rolls 2.0 and 3.0 up to 1 record along side row order-no = 1.
So when i try to export this as a CSV using REST API, the output is all jumbled up.

Is there a way to flatten this in the search so it shows as:

"Order-no", "order-date", "price","quantity"
1        ,       20/05/2017        , 1.0, 1
1        ,       20/05/2017        , 2.0, 1
1        ,       20/05/2017        , 3.0, 2

I've tried to use mvexpand, but it appears that it works on a individual column level and if i do a mvexpand on both of those last 2 columns, it effectively does a cross join on the entire data which causes incorrect values
Thanks in advance.

DalJeanis · ‎10-06-2017

Try this ...

(index=* OR index=_*) (index="myindex" "order{@order-no}"="1" "order.customer.customer-no"="*")
 | rename order{@order-no} as "order-no",
     order.order-date as "order-date",
     order.currency as "currency",
     order.shipments.shipment.shipping-address.country-code as "country-code",
     order.product-lineitems.product-lineitem.product-id as "productid",
     order.product-lineitems.product-lineitem.quantity as "quantity",
     order.product-lineitems.product-lineitem.base-price as "baseprice",
     order.product-lineitems.product-lineitem.net-price as "netprice",
     order.product-lineitems.product-lineitem.price-adjustments.price-adjustment.promotion-id as "adjustpromotionid",
     order.product-lineitems.product-lineitem.price-adjustments.price-adjustment.campaign-id as "adjustcampaignid",
     order.product-lineitems.product-lineitem.price-adjustments.price-adjustment.coupon-id as "adjustcouponid",
     order.product-lineitems.product-lineitem.price-adjustments.price-adjustment.lineitem-text as "adjustlineitemtext",
     order.product-lineitems.product-lineitem.price-adjustments.price-adjustment.base-price as "adjustbaseprice",
     order.product-lineitems.product-lineitem.tax as "tax",
     order.customer.customer-no as "customer-no"
 | streamstats count as recno
 | eval myFan=mvrange(0,mvcount(quantity))
 | mvexpand myFan
 | eval productid=mvindex(productid,myFan)
 | eval quantity=mvindex(quantity,myFan)
 | eval baseprice=mvindex(baseprice,myFan)
 | eval netprice=mvindex(netprice,myFan)

 | rename COMMENT as "To accurately line up this part of the data, you should go back to extract"
 | rename COMMENT as "the data directly from the JSON for each line-item by productid."
 | rename COMMENT as "We would need a properly formatted JSON sample to code that extract from the JSON or the _raw,"
 | rename COMMENT as "So this is just a placeholder method."
 | eval adjustpromotionid=mvindex(adjustpromotionid,myFan)
 | eval adjustcampaignid=mvindex(adjustcampaignid,myFan)
 | eval adjustcouponid=mvindex(adjustcouponid,myFan)
 | eval adjustlineitemtext=mvindex(adjustlineitemtext,myFan)

woodcock · ‎08-11-2017

Like this:

|makeresults | eval orders="1.0,1:2.0,1:3.0,2"
| makemv delim=":" orders
| mvexpand orders
| rename orders AS _raw
| eval "Order-no" = 1
| rename _time AS "order-date"
| convert ctime(*date)
| rex "^(?<price>\S+),(?<quantity>\S+)$"
| fields - _raw
| stats list(*) AS * BY Order-no order-date

| rename COMMENT AS "Everything above generates sample event data; everything below is your solution"

| eval PsAndQs = mvzip(price,quantity)
| mvexpand PsAndQs
| rex field=PsAndQs "^(?<price>\S+),(?<quantity>\S+)$"
| fields - PsAndQs

shinglau · ‎08-15-2017

Interesting, even your answer drops it when one of the columns within your mvzip is empty.
I think its because the multivalue field actually doesn't have a value for it, and doesn't know about the other columns. Even if i use a replace "" or fillnull on the original data, it doesn't fill the ones in MVs. Any idea how to fillnulls/replace "" in MV so it lines up with the data?

woodcock · ‎08-15-2017

This makes no sense. How could a valid row or event ever have an undefined price or an undefined quantity and even if it did, who care if it is lost, it is totally invalid/nonsensical anyway. Something really doesn't add up here. If you run the full run-anywhere example, there is no question: it works.

shinglau · ‎08-15-2017

Im adding a few more other MV columns which arent equal in rows. E.g. they may have nulls.

woodcock · ‎08-15-2017

Now see, that makes a H*U*G*E difference. Just add | fillnull value="NULL".

shinglau · ‎08-15-2017

attempting to fillnull, but doesn't seem to do anything, still only showing the ones which are not null for all of the fields in mvzip.

woodcock · ‎08-15-2017

Just show us a REAL starting event with ALL of the fields (one that ends up broken).

shinglau · ‎08-15-2017

Input Table:

order-no    order-date  currency    country-code    quantity(MV)    base-price(MV)  net-price(MV)   product-id(MV)  customer-no adjust-coupon-id(MV)
1   2017-07-31T13:13:31.000Z    AUD AU  1.0 30.00   27.27   2   123 BFFYAY
                1.0 50.00   45.45   3       BFFYAY
                1.0 45.00   40.91   4       BFFYAY
                1.0 45.00   40.91   5       <NO VALUE>

Current Query to Flatten:

(index=* OR index=_*) (index="myindex" "order{@order-no}"="1" "order.customer.customer-no"="*")
| rename order{@order-no} as "order-no",
order.order-date as "order-date",
order.currency as "currency",
order.shipments.shipment.shipping-address.country-code as "country-code",
order.product-lineitems.product-lineitem.product-id as "productid",
order.product-lineitems.product-lineitem.quantity as "quantity",
order.product-lineitems.product-lineitem.base-price as "baseprice",
order.product-lineitems.product-lineitem.net-price as "netprice",
order.product-lineitems.product-lineitem.price-adjustments.price-adjustment.promotion-id as "adjustpromotionid",
order.product-lineitems.product-lineitem.price-adjustments.price-adjustment.campaign-id as "adjustcampaignid",
order.product-lineitems.product-lineitem.price-adjustments.price-adjustment.coupon-id as "adjustcouponid",
order.product-lineitems.product-lineitem.price-adjustments.price-adjustment.lineitem-text as "adjustlineitemtext",
order.product-lineitems.product-lineitem.price-adjustments.price-adjustment.base-price as "adjustbaseprice",
order.product-lineitems.product-lineitem.tax as "tax",
order.customer.customer-no as "customer-no"
| eval combine=mvzip(mvzip(mvzip(mvzip(coalesce(productid,""),coalesce(quantity,""),"!!#@#!!"),coalesce(baseprice,""),"!!#@#!!"),coalesce(netprice,""),"!!#@#!!"),coalesce(adjustcouponid,""),"!!#@#!!")
 | mvexpand combine
| eval allvalues=split(combine,"!!#@#!!")
| eval "adjust-coupon-id"=mvindex(allvalues,4)
| eval "net-price"=mvindex(allvalues,3)
| eval "base-price"=mvindex(allvalues,2)
| eval "quantity"=mvindex(allvalues,1) 
| eval "product-id"=mvindex(allvalues,0)
| table "order-no", "order-date", "currency", "country-code","quantity", "base-price", "net-price", "product-id", "customer-no", "adjust-coupon-id"

Result:

order-no    order-date  currency    country-code    quantity    base-price  net-price   product-id  customer-no adjust-coupon-id
1   2017-07-31T13:13:31.000Z    AUD AU  1.0 30.00   27.27   2   00436933    BFFYAY
1   2017-07-31T13:13:31.000Z    AUD AU  1.0 50.00   45.45   3   00436933    BFFYAY
1   2017-07-31T13:13:31.000Z    AUD AU  1.0 45.00   40.91   4   00436933    BFFYAY

DalJeanis · ‎08-11-2017

Assuming you have this format

| table order-no order-date price quantity

and price and quantity are both multivalue fields that are aligned with each other, then do this...

| eval price=mvzip(price,quantity,"@")
| mvexpand price
| eval price=split(price,"@")
| eval quantity=mvindex(price,1)
| eval price=mvindex(price,0)

If you had a third field, which might contain an @ sign, and you don't have people entering enthusiastic comments, then I recommend five bangs as a delimiter "!!!!!". You can also substitute any highly unlikely combination of characters ( "!!#@#!!" ).

| table order-no order-date price quantity description
| eval price=mvzip(mvzip(price,quantity,"!!!!!"),description,"!!!!!")
| mvexpand price
| eval price=split(price,"!!!!!")
| eval description=mvindex(price,2)
| eval quantity=mvindex(price,1)
| eval price=mvindex(price,0)

DalJeanis · ‎08-15-2017

@shinglau - These methods are not appropriate for fields that are not precisely matched, with the same number of mv fields. If a field is supposed to be matched and is null, then you need to create a placeholder mv field, for example with the word "NULL" the correct number of times.

Assuming that the description may be null, the price and quantity are always matched and never null, and price and quantity can never appear more than 40 times, that could look something like...

 | table order-no order-date price quantity description
 | eval mynulls=mvappend("((UNUSED))",
   "((NULL))","((NULL))","((NULL))","((NULL))","((NULL))", "((NULL))","((NULL))","((NULL))",
   "((NULL))","((NULL))","((NULL))","((NULL))","((NULL))","((NULL))","((NULL))","((NULL))",
   "((NULL))","((NULL))","((NULL))","((NULL))","((NULL))","((NULL))","((NULL))","((NULL))",
   "((NULL))","((NULL))","((NULL))","((NULL))","((NULL))","((NULL))","((NULL))","((NULL))",
   "((NULL))","((NULL))","((NULL))","((NULL))","((NULL))","((NULL))","((NULL))","((NULL))")
 | eval description=coalesce(description,mvindex(mynulls,1,mvcount(price)))
 | eval price=mvzip(mvzip(price,quantity,"!!!!!"),description,"!!!!!")
 | mvexpand price
 | eval price=split(price,"!!!!!")
 | eval description=mvindex(price,2)
 | eval description=if(description="((NULL))",null(),description)
 | eval quantity=mvindex(price,1)
 | eval price=mvindex(price,0)

If a field is not directly linked to the other fields (price, quantity, cost etc) then do not handle it in the same phase. If it IS directly linked to the other fields but is for some reason not aligned, then it requires a separate question with specifics so that we can handle that error appropriately.

shinglau · ‎08-15-2017

That works for when the entire MV is null, but doesn't work for when there are 4 rows, but in the MV there is 3 of them which has a value, but the 4th one is non-existent.

shinglau · ‎08-15-2017

Looks like when any of the columns within the mvzip is null / empty string, it drops the entire row. Anyway to fix this?

woodcock · ‎08-15-2017

Sure, give my answer a try.

shinglau · ‎08-14-2017

I tried this solution.
But it looks like it only returns the first record not all of them.

It should look like this:
"Order-no", "order-date", "price","quantity"
1 , 20/05/2017 , 1.0, 1
1 , 20/05/2017 , 2.0, 1
1 , 20/05/2017 , 3.0, 2

But instead it looks like this:
"Order-no", "order-date", "price","quantity"
1 , 20/05/2017 , 1.0, 1

shinglau · ‎08-14-2017

Looks like using this solution only returns the first record, and doesn't return the other records within the group.

Should look like:
"Order-no", "order-date", "price","quantity"
1 , 20/05/2017 , 1.0, 1
1 , 20/05/2017 , 2.0, 1
1 , 20/05/2017 , 3.0, 2

Instead it returns this:
"Order-no", "order-date", "price","quantity"
1 , 20/05/2017 , 1.0, 1

shinglau · ‎08-14-2017

I tried this solution but it keeps on giving me error:
Error in 'eval' command: The arguments to the 'mvzip' function are invalid.

shinglau · ‎08-14-2017

Nevermind found out why..... Looks like eval doesn't like columns with "-" in it.
Had to rename the column without a "-" then reference it in mvzip and it worked.

Flatten Search Results

Modernize your Splunk Apps – Introducing Python 3.13 in Splunk

New Release | Splunk Cloud Platform 10.1.2507

🌟 From Audit Chaos to Clarity: Welcoming Audit Trail v2

Are you a member of the Splunk Community?

Flatten Search Results

Modernize your Splunk Apps – Introducing Python 3.13 in Splunk

New Release | Splunk Cloud Platform 10.1.2507

🌟 From Audit Chaos to Clarity: Welcoming Audit Trail v2