Having more accurate representation of data is definitely an improvement. But you still need to answer the questions about data characteristics (lots of possibilities regarding triplets P_BATCH_ID,...
See more...
Having more accurate representation of data is definitely an improvement. But you still need to answer the questions about data characteristics (lots of possibilities regarding triplets P_BATCH_ID, P_REQUEST_ID, and P_RETURN_STATUS), and those about desired results (why multiple columns with ordered lists, should there be dedup of triplets, order of triplets, etc.) Because each combination requires a different solution, and can give you very different results. Other people's mind-reading is more wrong than correct. Let me try two mind-readings to illustrate. First, you want to preserve every triplet even if they repeat, and you want to present them in the order of event arrival as well as in the order they appear inside each event, except being grouped by correlationId. Absolutely no dedup. (Although this looks to have the least commands, the "solution" is the most demanding in memory.) | rename content."List of Batches Processed"{}.* as *
| fields P_BATCH_ID P_REQUEST_ID P_RETURN_STATUS correlationId
| stats list(P_*) as * by correlationId
``` mind-reading #1 ``` Using a composite emulation based on samples you provided (see end of this post), you will get correlationId BATCH_ID REQUEST_ID RETURN_STATUS 490cfba0e9f3c770b40 1 2 3 4 1 2 3 4 5 6 177 1r7 1577 16577 1005377 1005177 1005377 1005377 1005377 100532177 SUCCESS SUCCESS SUCCESS SUCCESS SUCCESS SUCCESS SUCCESS SUCCESS SUCCESS SUCCESS Does this look like something you need? Or, mind-reading 2. You don't want any duplicate triplet; neither the order these triplets arrive with events nor the order they appear in individual events matters. You want maximum dedup, just group by correlationId. | spath path=content."List of Batches Processed"{}
| mvexpand content."List of Batches Processed"{}
| spath input=content."List of Batches Processed"{}
| stats count by P_BATCH_ID P_REQUEST_ID P_RETURN_STATUS correlationId
| stats list(P_*) as * by correlationId
``` mind-reading #2 ``` The same emulation will give correlationId BATCH_ID REQUEST_ID RETURN_STATUS 490cfba0e9f3c770b40 1 1 2 2 3 3 4 4 5 6 1005377 177 1005177 1r7 1005377 1577 1005377 16577 1005377 100532177 SUCCESS SUCCESS SUCCESS SUCCESS SUCCESS SUCCESS SUCCESS SUCCESS SUCCESS SUCCESS Note: Even though this mock result is only superficially different from the previous one, the two can be materially very different if real data contain lots of duplicate triplets. As a bonus, I want to throw in a third mind-reading: You don't care about triplets at all. You only want to know which values are present in each of P_BATCH_ID, P_REQUEST_ID, and P_RETURN_STATUS. (This one is the least demanding in memory, and computationally light.) | rename content."List of Batches Processed"{}.* as *
| fields P_BATCH_ID P_REQUEST_ID P_RETURN_STATUS correlationId
| stats values(P_*) as * by correlationId
``` mind-reading extremo ``` The emulated data will give correlationId BATCH_ID REQUEST_ID RETURN_STATUS 490cfba0e9f3c770b40 1 2 3 4 5 6 1005177 100532177 1005377 1577 16577 177 1r7 SUCCESS Is this closer to what you want? The above three very different results are derived from assuming that the two sample/mock JSON data contain identical correlationId, like emulated below. They all kind of fit into the mock result table you showed. How can volunteers tell? | makeresults
| fields - _*
| eval data = mvappend("{ \"correlationId\" : \"490cfba0e9f3c770b40\",
\"content\" : {
\"List of Batches Processed\" : [ {
\"P_REQUEST_ID\" : \"177\",
\"P_BATCH_ID\" : \"1\",
\"P_TEMPLATE\" : \"Template\",
\"P_PERIOD\" : \"24\",
\"P_MORE_BATCHES_EXISTS\" : \"Y\",
\"P_ZUORA_FILE_NAME\" : \"Template20240306102852.csv\",
\"P_MESSAGE\" : \"Data loaded in RevPro Successfully - Success: 10000 Failed: 0\",
\"P_RETURN_STATUS\" : \"SUCCESS\"
}, {
\"P_REQUEST_ID\" : \"1r7\",
\"P_BATCH_ID\" : \"2\",
\"P_TEMPLATE\" : \"Template\",
\"P_PERIOD\" : \"24\",
\"P_MORE_BATCHES_EXISTS\" : \"Y\",
\"P_ZUORA_FILE_NAME\" : \"Template20240306102852.csv\",
\"P_MESSAGE\" : \"Data loaded in RevPro Successfully - Success: 10000 Failed: 0\",
\"P_RETURN_STATUS\" : \"SUCCESS\"
}, {
\"P_REQUEST_ID\" : \"1577\",
\"P_BATCH_ID\" : \"3\",
\"P_TEMPLATE\" : \"Template\",
\"P_PERIOD\" : \"24\",
\"P_MORE_BATCHES_EXISTS\" : \"Y\",
\"P_ZUORA_FILE_NAME\" : \"Template20240306102852.csv\",
\"P_MESSAGE\" : \"Data loaded in RevPro Successfully - Success: 10000 Failed: 0\",
\"P_RETURN_STATUS\" : \"SUCCESS\"
}, {
\"P_REQUEST_ID\" : \"16577\",
\"P_BATCH_ID\" : \"4\",
\"P_TEMPLATE\" : \"Template\",
\"P_PERIOD\" : \"24\",
\"P_MORE_BATCHES_EXISTS\" : \"Y\",
\"P_ZUORA_FILE_NAME\" : \"Template20240306102852.csv\",
\"P_MESSAGE\" : \"Data loaded in RevPro Successfully - Success: 10000 Failed: 0\",
\"P_RETURN_STATUS\" : \"SUCCESS\"
}]
}
}", "{
\"correlationId\" : \"490cfba0e9f3c770b40\",
\"message\" : \"Processed all revenueData\",
\"tracePoint\" : \"FLOW\",
\"priority\" : \"INFO\",
\"category\" : \"prc-api\",
\"elapsed\" : 472,
\"locationInfo\" : {
\"lineInFile\" : \"205\",
\"component\" : \"json-logger:logger\",
\"fileName\" : \"G.xml\",
\"rootContainer\" : \"syncFlow\"
},
\"timestamp\" : \"2024-03-06T20:57:17.119Z\",
\"content\" : {
\"List of Batches Processed\" : [ {
\"P_REQUEST_ID\" : \"1005377\",
\"P_BATCH_ID\" : \"1\",
\"P_TEMPLATE\" : \"Template\",
\"P_PERIOD\" : \"MAR-24\",
\"P_MORE_BATCHES_EXISTS\" : \"Y\",
\"P_FILE_NAME\" : \"Template20240306102852.csv\",
\"P_MESSAGE\" : \"Data loaded in RevPro Successfully - Success: 10000 Failed: 0\",
\"P_RETURN_STATUS\" : \"SUCCESS\"
}, {
\"P_REQUEST_ID\" : \"1005177\",
\"P_BATCH_ID\" : \"2\",
\"P_TEMPLATE\" : \"Template\",
\"P_PERIOD\" : \"MAR-24\",
\"P_MORE_BATCHES_EXISTS\" : \"Y\",
\"P_FILE_NAME\" : \"Template20240306102959.csv\",
\"P_MESSAGE\" : \"Data loaded in RevPro Successfully - Success: 10000 Failed: 0\",
\"P_RETURN_STATUS\" : \"SUCCESS\"
}, {
\"P_REQUEST_ID\" : \"1005377\",
\"P_BATCH_ID\" : \"3\",
\"P_TEMPLATE\" : \"Template\",
\"P_PERIOD\" : \"MAR-24\",
\"P_MORE_BATCHES_EXISTS\" : \"Y\",
\"P_ZUORA_FILE_NAME\" : \"Template20240306103103.csv\",
\"P_MESSAGE\" : \"Data loaded in RevPro Successfully - Success: 10000 Failed: 0\",
\"P_RETURN_STATUS\" : \"SUCCESS\"
}, {
\"P_REQUEST_ID\" : \"1005377\",
\"P_BATCH_ID\" : \"4\",
\"P_TEMPLATE\" : \"Template\",
\"P_PERIOD\" : \"MAR-24\",
\"P_MORE_BATCHES_EXISTS\" : \"Y\",
\"P_ZUORA_FILE_NAME\" : \"Template20240306103205.csv\",
\"P_MESSAGE\" : \"Data loaded in RevPro Successfully - Success: 10000 Failed: 0\",
\"P_RETURN_STATUS\" : \"SUCCESS\"
}, {
\"P_REQUEST_ID\" : \"1005377\",
\"P_BATCH_ID\" : \"5\",
\"P_TEMPLATE\" : \"Template\",
\"P_PERIOD\" : \"MAR-24\",
\"P_MORE_BATCHES_EXISTS\" : \"Y\",
\"P_FILE_NAME\" : \"Template20240306103306.csv\",
\"P_MESSAGE\" : \"Data loaded in RevPro Successfully - Success: 10000 Failed: 0\",
\"P_RETURN_STATUS\" : \"SUCCESS\"
}, {
\"P_REQUEST_ID\" : \"100532177\",
\"P_BATCH_ID\" : \"6\",
\"P_TEMPLATE\" : \"ATVI_Transaction_Template\",
\"P_PERIOD\" : \"MAR-24\",
\"P_MORE_BATCHES_EXISTS\" : \"Y\",
\"P_ZUORA_FILE_NAME\" : \"rev_ATVI_Transaction_Template20240306103407.csv\",
\"P_MESSAGE\" : \"Data loaded in RevPro Successfully - Success: 10000 Failed: 0\",
\"P_RETURN_STATUS\" : \"SUCCESS\"
}")
| mvexpand data
| rename data AS _raw
| spath
``` data emulation above ```