Splunk Search

Flatten Search Results

shinglau
New Member

I am currently trying to use the Splunk REST API to extract a heap of data.
I have written a search query:

(index=* OR index=_*) (index="myindex") | table "order-no", "order-date", "price"

But it provides a result set which looks like this:

"Order-no", "order-date", "price","quantity"
1        ,       20/05/2017        , 1.0, 1
--------------------, 2.0, 1
---------------------,3.0, 2

It rolls 2.0 and 3.0 up to 1 record along side row order-no = 1.
So when i try to export this as a CSV using REST API, the output is all jumbled up.

Is there a way to flatten this in the search so it shows as:

"Order-no", "order-date", "price","quantity"
1        ,       20/05/2017        , 1.0, 1
1        ,       20/05/2017        , 2.0, 1
1        ,       20/05/2017        , 3.0, 2

I've tried to use mvexpand, but it appears that it works on a individual column level and if i do a mvexpand on both of those last 2 columns, it effectively does a cross join on the entire data which causes incorrect values
Thanks in advance.

0 Karma

DalJeanis
Legend

Try this ...

(index=* OR index=_*) (index="myindex" "order{@order-no}"="1" "order.customer.customer-no"="*")
 | rename order{@order-no} as "order-no",
     order.order-date as "order-date",
     order.currency as "currency",
     order.shipments.shipment.shipping-address.country-code as "country-code",
     order.product-lineitems.product-lineitem.product-id as "productid",
     order.product-lineitems.product-lineitem.quantity as "quantity",
     order.product-lineitems.product-lineitem.base-price as "baseprice",
     order.product-lineitems.product-lineitem.net-price as "netprice",
     order.product-lineitems.product-lineitem.price-adjustments.price-adjustment.promotion-id as "adjustpromotionid",
     order.product-lineitems.product-lineitem.price-adjustments.price-adjustment.campaign-id as "adjustcampaignid",
     order.product-lineitems.product-lineitem.price-adjustments.price-adjustment.coupon-id as "adjustcouponid",
     order.product-lineitems.product-lineitem.price-adjustments.price-adjustment.lineitem-text as "adjustlineitemtext",
     order.product-lineitems.product-lineitem.price-adjustments.price-adjustment.base-price as "adjustbaseprice",
     order.product-lineitems.product-lineitem.tax as "tax",
     order.customer.customer-no as "customer-no"
 | streamstats count as recno
 | eval myFan=mvrange(0,mvcount(quantity))
 | mvexpand myFan
 | eval productid=mvindex(productid,myFan)
 | eval quantity=mvindex(quantity,myFan)
 | eval baseprice=mvindex(baseprice,myFan)
 | eval netprice=mvindex(netprice,myFan)

 | rename COMMENT as "To accurately line up this part of the data, you should go back to extract"
 | rename COMMENT as "the data directly from the JSON for each line-item by productid."
 | rename COMMENT as "We would need a properly formatted JSON sample to code that extract from the JSON or the _raw,"
 | rename COMMENT as "So this is just a placeholder method."
 | eval adjustpromotionid=mvindex(adjustpromotionid,myFan)
 | eval adjustcampaignid=mvindex(adjustcampaignid,myFan)
 | eval adjustcouponid=mvindex(adjustcouponid,myFan)
 | eval adjustlineitemtext=mvindex(adjustlineitemtext,myFan)
0 Karma

woodcock
Esteemed Legend

Like this:

|makeresults | eval orders="1.0,1:2.0,1:3.0,2"
| makemv delim=":" orders
| mvexpand orders
| rename orders AS _raw
| eval "Order-no" = 1
| rename _time AS "order-date"
| convert ctime(*date)
| rex "^(?<price>\S+),(?<quantity>\S+)$"
| fields - _raw
| stats list(*) AS * BY Order-no order-date

| rename COMMENT AS "Everything above generates sample event data; everything below is your solution"

| eval PsAndQs = mvzip(price,quantity)
| mvexpand PsAndQs
| rex field=PsAndQs "^(?<price>\S+),(?<quantity>\S+)$"
| fields - PsAndQs

shinglau
New Member

Interesting, even your answer drops it when one of the columns within your mvzip is empty.
I think its because the multivalue field actually doesn't have a value for it, and doesn't know about the other columns. Even if i use a replace "" or fillnull on the original data, it doesn't fill the ones in MVs. Any idea how to fillnulls/replace "" in MV so it lines up with the data?

0 Karma

woodcock
Esteemed Legend

This makes no sense. How could a valid row or event ever have an undefined price or an undefined quantity and even if it did, who care if it is lost, it is totally invalid/nonsensical anyway. Something really doesn't add up here. If you run the full run-anywhere example, there is no question: it works.

0 Karma

shinglau
New Member

Im adding a few more other MV columns which arent equal in rows. E.g. they may have nulls.

0 Karma

woodcock
Esteemed Legend

Now see, that makes a H*U*G*E difference. Just add | fillnull value="NULL".

0 Karma

shinglau
New Member

attempting to fillnull, but doesn't seem to do anything, still only showing the ones which are not null for all of the fields in mvzip.

|fillnull value=NULL newfield
| eval PsAndQs = mvzip(mvzip(price,quantity),newfield)
| mvexpand PsAndQs
| rex field=PsAndQs "^(?\S+),(?\S+),(?\S+)$"
| fields - PsAndQs

0 Karma

woodcock
Esteemed Legend

Just show us a REAL starting event with ALL of the fields (one that ends up broken).

0 Karma

shinglau
New Member

Input Table:

order-no    order-date  currency    country-code    quantity(MV)    base-price(MV)  net-price(MV)   product-id(MV)  customer-no adjust-coupon-id(MV)
1   2017-07-31T13:13:31.000Z    AUD AU  1.0 30.00   27.27   2   123 BFFYAY
                1.0 50.00   45.45   3       BFFYAY
                1.0 45.00   40.91   4       BFFYAY
                1.0 45.00   40.91   5       <NO VALUE>

Current Query to Flatten:

(index=* OR index=_*) (index="myindex" "order{@order-no}"="1" "order.customer.customer-no"="*")
| rename order{@order-no} as "order-no",
order.order-date as "order-date",
order.currency as "currency",
order.shipments.shipment.shipping-address.country-code as "country-code",
order.product-lineitems.product-lineitem.product-id as "productid",
order.product-lineitems.product-lineitem.quantity as "quantity",
order.product-lineitems.product-lineitem.base-price as "baseprice",
order.product-lineitems.product-lineitem.net-price as "netprice",
order.product-lineitems.product-lineitem.price-adjustments.price-adjustment.promotion-id as "adjustpromotionid",
order.product-lineitems.product-lineitem.price-adjustments.price-adjustment.campaign-id as "adjustcampaignid",
order.product-lineitems.product-lineitem.price-adjustments.price-adjustment.coupon-id as "adjustcouponid",
order.product-lineitems.product-lineitem.price-adjustments.price-adjustment.lineitem-text as "adjustlineitemtext",
order.product-lineitems.product-lineitem.price-adjustments.price-adjustment.base-price as "adjustbaseprice",
order.product-lineitems.product-lineitem.tax as "tax",
order.customer.customer-no as "customer-no"
| eval combine=mvzip(mvzip(mvzip(mvzip(coalesce(productid,""),coalesce(quantity,""),"!!#@#!!"),coalesce(baseprice,""),"!!#@#!!"),coalesce(netprice,""),"!!#@#!!"),coalesce(adjustcouponid,""),"!!#@#!!")
 | mvexpand combine
| eval allvalues=split(combine,"!!#@#!!")
| eval "adjust-coupon-id"=mvindex(allvalues,4)
| eval "net-price"=mvindex(allvalues,3)
| eval "base-price"=mvindex(allvalues,2)
| eval "quantity"=mvindex(allvalues,1) 
| eval "product-id"=mvindex(allvalues,0)
| table "order-no", "order-date", "currency", "country-code","quantity", "base-price", "net-price", "product-id", "customer-no", "adjust-coupon-id" 

Result:

order-no    order-date  currency    country-code    quantity    base-price  net-price   product-id  customer-no adjust-coupon-id
1   2017-07-31T13:13:31.000Z    AUD AU  1.0 30.00   27.27   2   00436933    BFFYAY
1   2017-07-31T13:13:31.000Z    AUD AU  1.0 50.00   45.45   3   00436933    BFFYAY
1   2017-07-31T13:13:31.000Z    AUD AU  1.0 45.00   40.91   4   00436933    BFFYAY
0 Karma

DalJeanis
Legend

Assuming you have this format

| table order-no order-date price quantity

and price and quantity are both multivalue fields that are aligned with each other, then do this...

| eval price=mvzip(price,quantity,"@")
| mvexpand price
| eval price=split(price,"@")
| eval quantity=mvindex(price,1)
| eval price=mvindex(price,0)

If you had a third field, which might contain an @ sign, and you don't have people entering enthusiastic comments, then I recommend five bangs as a delimiter "!!!!!". You can also substitute any highly unlikely combination of characters ( "!!#@#!!" ).

| table order-no order-date price quantity description
| eval price=mvzip(mvzip(price,quantity,"!!!!!"),description,"!!!!!")
| mvexpand price
| eval price=split(price,"!!!!!")
| eval description=mvindex(price,2)
| eval quantity=mvindex(price,1)
| eval price=mvindex(price,0)

DalJeanis
Legend

@shinglau - These methods are not appropriate for fields that are not precisely matched, with the same number of mv fields. If a field is supposed to be matched and is null, then you need to create a placeholder mv field, for example with the word "NULL" the correct number of times.

Assuming that the description may be null, the price and quantity are always matched and never null, and price and quantity can never appear more than 40 times, that could look something like...

 | table order-no order-date price quantity description
 | eval mynulls=mvappend("((UNUSED))",
   "((NULL))","((NULL))","((NULL))","((NULL))","((NULL))", "((NULL))","((NULL))","((NULL))",
   "((NULL))","((NULL))","((NULL))","((NULL))","((NULL))","((NULL))","((NULL))","((NULL))",
   "((NULL))","((NULL))","((NULL))","((NULL))","((NULL))","((NULL))","((NULL))","((NULL))",
   "((NULL))","((NULL))","((NULL))","((NULL))","((NULL))","((NULL))","((NULL))","((NULL))",
   "((NULL))","((NULL))","((NULL))","((NULL))","((NULL))","((NULL))","((NULL))","((NULL))")
 | eval description=coalesce(description,mvindex(mynulls,1,mvcount(price)))
 | eval price=mvzip(mvzip(price,quantity,"!!!!!"),description,"!!!!!")
 | mvexpand price
 | eval price=split(price,"!!!!!")
 | eval description=mvindex(price,2)
 | eval description=if(description="((NULL))",null(),description)
 | eval quantity=mvindex(price,1)
 | eval price=mvindex(price,0)

If a field is not directly linked to the other fields (price, quantity, cost etc) then do not handle it in the same phase. If it IS directly linked to the other fields but is for some reason not aligned, then it requires a separate question with specifics so that we can handle that error appropriately.

0 Karma

shinglau
New Member

That works for when the entire MV is null, but doesn't work for when there are 4 rows, but in the MV there is 3 of them which has a value, but the 4th one is non-existent.

0 Karma

shinglau
New Member

Looks like when any of the columns within the mvzip is null / empty string, it drops the entire row. Anyway to fix this?

0 Karma

woodcock
Esteemed Legend

Sure, give my answer a try.

0 Karma

shinglau
New Member

I tried this solution.
But it looks like it only returns the first record not all of them.

It should look like this:
"Order-no", "order-date", "price","quantity"
1 , 20/05/2017 , 1.0, 1
1 , 20/05/2017 , 2.0, 1
1 , 20/05/2017 , 3.0, 2

But instead it looks like this:
"Order-no", "order-date", "price","quantity"
1 , 20/05/2017 , 1.0, 1

0 Karma

shinglau
New Member

Looks like using this solution only returns the first record, and doesn't return the other records within the group.

Should look like:
"Order-no", "order-date", "price","quantity"
1 , 20/05/2017 , 1.0, 1
1 , 20/05/2017 , 2.0, 1
1 , 20/05/2017 , 3.0, 2

Instead it returns this:
"Order-no", "order-date", "price","quantity"
1 , 20/05/2017 , 1.0, 1

0 Karma

shinglau
New Member

I tried this solution but it keeps on giving me error:
Error in 'eval' command: The arguments to the 'mvzip' function are invalid.

0 Karma

shinglau
New Member

Nevermind found out why..... Looks like eval doesn't like columns with "-" in it.
Had to rename the column without a "-" then reference it in mvzip and it worked.

0 Karma
Get Updates on the Splunk Community!

Introducing the 2024 SplunkTrust!

Hello, Splunk Community! We are beyond thrilled to announce our newest group of SplunkTrust members!  The ...

Introducing the 2024 Splunk MVPs!

We are excited to announce the 2024 cohort of the Splunk MVP program. Splunk MVPs are passionate members of ...

Splunk Custom Visualizations App End of Life

The Splunk Custom Visualizations apps End of Life for SimpleXML will reach end of support on Dec 21, 2024, ...