I have table as below
Date | Out Airline | Bag Type | Total Processed |
01/05/2024 | IX | Local | 100 |
01/05/2024 | IX | Transfer | 120 |
02/05/2024 | BA | Local | 140 |
02/05/2024 | BA | Transfer | 160 |
03/05/2024 | IX | Local | 150 |
Whenever a Bag Type is missing for certain Airline (in above case Transfer data is missing for 03/05/2024 IX). I need to create a manual row entry with value as 0 (Total Processed = 0)
Date | Out Airline | Bag Type | Total Processed |
01/05/2024 | IX | Local | 100 |
01/05/2024 | IX | Transfer | 120 |
02/05/2024 | BA | Local | 140 |
02/05/2024 | BA | Transfer | 160 |
03/05/2024 | IX | Local | 150 |
03/05/2024 | IX | Transfer | 0 |
Hi @sultanulariff,
We can use the appendpipe, eventstats, stats, and mvexpand commands to append a product of the set of Bag Type values by the set of Out Airline and Date values and then use the stats command to summarize the Total Processed:
| makeresults format=csv data="
Date,Out Airline,Bag Type,Total Processed
01/05/2024,IX,Local,100
01/05/2024,IX,Transfer,120
02/05/2024,BA,Local,140
02/05/2024,BA,Transfer,160
03/05/2024,IX,Local,150"
| appendpipe
[ eventstats values("Bag Type") as "Bag Type" ``` get all possible values of Bag Type ```
| stats values("Bag Type") as "Bag Type" by Date "Out Airline" ``` get all combinations of Bag Type by Date and Out Airline ```
| mvexpand "Bag Type" ``` expand all combinations into separate events ``` ]
| stats sum(eval(coalesce('Total Processed', 0))) as "Total Processed" by Date "Out Airline" "Bag Type" ``` sum Total Processed, replacing null values with 0 ```
Date | Out Airline | Bag Type | Total Processed |
01/05/2024 | IX | Local | 100 |
01/05/2024 | IX | Transfer | 120 |
02/05/2024 | BA | Local | 140 |
02/05/2024 | BA | Transfer | 160 |
03/05/2024 | IX | Local | 150 |
03/05/2024 | IX | Transfer | 0 |
This example will only append missing Bag Type values. It can be expanded to append any combination of Date, Out Airline, and Bag Type values. We can also convert Date to a time value and use the makecontinuous command to generate spans of dates prior to appending and summarizing combinations.
Hi @sultanulariff,
We can use the appendpipe, eventstats, stats, and mvexpand commands to append a product of the set of Bag Type values by the set of Out Airline and Date values and then use the stats command to summarize the Total Processed:
| makeresults format=csv data="
Date,Out Airline,Bag Type,Total Processed
01/05/2024,IX,Local,100
01/05/2024,IX,Transfer,120
02/05/2024,BA,Local,140
02/05/2024,BA,Transfer,160
03/05/2024,IX,Local,150"
| appendpipe
[ eventstats values("Bag Type") as "Bag Type" ``` get all possible values of Bag Type ```
| stats values("Bag Type") as "Bag Type" by Date "Out Airline" ``` get all combinations of Bag Type by Date and Out Airline ```
| mvexpand "Bag Type" ``` expand all combinations into separate events ``` ]
| stats sum(eval(coalesce('Total Processed', 0))) as "Total Processed" by Date "Out Airline" "Bag Type" ``` sum Total Processed, replacing null values with 0 ```
Date | Out Airline | Bag Type | Total Processed |
01/05/2024 | IX | Local | 100 |
01/05/2024 | IX | Transfer | 120 |
02/05/2024 | BA | Local | 140 |
02/05/2024 | BA | Transfer | 160 |
03/05/2024 | IX | Local | 150 |
03/05/2024 | IX | Transfer | 0 |
This example will only append missing Bag Type values. It can be expanded to append any combination of Date, Out Airline, and Bag Type values. We can also convert Date to a time value and use the makecontinuous command to generate spans of dates prior to appending and summarizing combinations.
Hi @tscroggins , many thanks for the response. Looks perfect! Will need to try it with the actual data/query and reconfirm.
As a brief addendum, if all possible label values are known in advance, it's more efficient to store the values in a lookup file and use the inputlookup command with append=true before the stats command; however, the introduction of any unknown field value, e.g. Date, requires additional logic to produce combinations of known and unknown values.
Hi @tscroggins , there are only 3 possible values i.e. Bag_Type = "Local","Transfer","Rush".
The requirement is every flight for specific date shall have 3 rows with baggage volumes. If any row is missing from source data, I need to create the particular row with baggage volume as "0".
Hi @sultanulariff,
We can create a small lookup file containing the Bag Type values:
| makeresults format=csv data="Bag Type
Local
Rush
Transfer"
| outputlookup bag_type_lookup.csv
We can then add the inputlookup command to seed the event stream with all known Bag Type values:
| makeresults format=csv data="
Date,Out Airline,Bag Type,Total Processed
01/05/2024,IX,Local,100
01/05/2024,IX,Transfer,120
02/05/2024,BA,Local,140
02/05/2024,BA,Transfer,160
03/05/2024,IX,Local,150"
| appendpipe
[ inputlookup append=true bag_type_lookup.csv ``` seed Bag Type values ```
| eventstats values("Bag Type") as "Bag Type" ``` get all possible values of Bag Type ```
| stats values("Bag Type") as "Bag Type" by Date "Out Airline" ``` get all combinations of Bag Type by Date and Out Airline ```
| mvexpand "Bag Type" ``` expand all combinations into separate events ``` ]
| stats sum(eval(coalesce('Total Processed', 0))) as "Total Processed" by Date "Out Airline" "Bag Type" ``` sum Total Processed, replacing null values with 0 ```
Date | Out Airline | Bag Type | Total Processed |
01/05/2024 | IX | Local | 100 |
01/05/2024 | IX | Rush | 0 |
01/05/2024 | IX | Transfer | 120 |
02/05/2024 | BA | Local | 140 |
02/05/2024 | BA | Rush | 0 |
02/05/2024 | BA | Transfer | 160 |
03/05/2024 | IX | Local | 150 |
03/05/2024 | IX | Rush | 0 |
03/05/2024 | IX | Transfer | 0 |
We used makeresults to provide test data. In production, we would optimize our base search to summarize results before the appendpipe command, e.g.:
index=foo sourcetype=outbound_airline_bag_type_metrics
| stats count as "Total Processed" by Date "Out Airline" "Bag Type"
| appendpipe
[ inputlookup append=true bag_type_lookup.csv
| eventstats values("Bag Type") as "Bag Type"
| stats values("Bag Type") as "Bag Type" by Date "Out Airline"
| mvexpand "Bag Type" ]
| stats sum(eval(coalesce('Total Processed', 0))) as "Total Processed" by Date "Out Airline" "Bag Type"
Do you have ony two possible bag types? Generally that's possible but the question is how to do it most effectively/elegantly. Because the obvious thing would be to do stats by each date/airline and then fillnull or eval with coalesce but the question is whether that's enough to get results as "date, airline, bags local, bags transfered" or do you need to split it back to separate rows.
Hi @PickleRick , there are three bag_types
1. Local 2. Transfer and 3. Rush
From the actual source data, I do not get all three 'Total processed' baggage counts (i.e. there shall be 3 rows for every single Airline/Date) every time. Hence, if any of the three bag_type is not there, I need to create a row for the missing Bag_type with Total_Processed=0.
I use above index as base search query and left join further sub queries. For example, if 03/05/2024,IX,Transfer,0 is missing I am not able to join the sub queries related to 03/052024,IX,Transfer.
So, I need it as a seperate/new row (if the row is missing)