To participate in these challenges, you will need to register with the Advent of Code site, so that you can retrieve your own datasets for the puzzles. When you get an answer, you will need to submit it to the Advent of Code site to determine whether the answer is correct. I have already completed the 2025 set using Python, so I will know when my SPL generates the correct result.
Each day's puzzle is split into two parts; part one is usually easier than part two, and you cannot normally reach part two until you have successfully completed part one. Day 8 is about connecting points in 3D space. Please visit the website for full details of the puzzle.
This article contains spoilers!
In fact, the whole article is a spoiler as it contains solutions to the puzzle. If you are trying to solve the puzzle yourself and just want some pointers to get you started, stop reading when you have enough, and return if you get stuck again, or just want to compare your solution to mine!
As with all the Advent of Code puzzles, the description of the problem is aided by some example input; in this case, the input is a list of coordinates in 3D space representing the locations of "junction boxes". The aim of the puzzle is to determine, for your own dataset, how many "junction boxes" are in the three largest "circuits" after finding the closest pairs of "junction boxes" (which are not already part of the same "circuit") and connecting them.
The first thing to do is initialise your data. One way to do this is to save it to a CSV file and use inputlookup to load it. Alternatively, you could just use makeresults (as I have done here), and set a field to the data:
| makeresults
| eval _raw="162,817,812
57,618,57
906,360,560
592,479,940
352,342,300
466,668,158
542,29,236
431,825,988
739,650,466
52,470,668
216,146,977
819,987,18
117,168,530
805,96,715
346,949,466
970,615,88
941,993,340
862,61,35
984,92,344
425,690,689"
The next step is to break the data up into separate events.
| rex max_match=0 "(?<junction_box>\d+,\d+,\d+)"
| fields - _raw
| mvexpand junction_box
Each line represents a set of coordinates for a "junction box" in 3D space.
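As a cross-check outside Splunk, the same parsing can be sketched in Python. This is only an illustration, not part of the SPL solution: the names `raw`, `triples` and `boxes` are made up here, and only the first three lines of the example input are used.

```python
import re

# First three lines of the example input, as a single block of text.
raw = """162,817,812
57,618,57
906,360,560"""

# Rough equivalent of rex max_match=0: find every "x,y,z" triple in the text.
triples = re.findall(r"\d+,\d+,\d+", raw)

# Rough equivalent of the split()/mvindex() evals: turn each triple into integers.
boxes = [tuple(int(n) for n in triple.split(",")) for triple in triples]

print(boxes)  # [(162, 817, 812), (57, 618, 57), (906, 360, 560)]
```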
| eval xa=mvindex(split(junction_box,","),0)
| eval ya=mvindex(split(junction_box,","),1)
| eval za=mvindex(split(junction_box,","),2)
| fields - junction_box
For each "junction box" (except the first), assign an id and list the coordinates of all the previous "junction boxes". Using the streamstats command with the current=f option performs the aggregation functions on the previous events up to but not including the current event.
| streamstats list(xa) as x list(ya) as y list(za) as z count as index current=f limit=0
| where index > 0
The "distance" between a pair of "junction boxes", or rather the square of the "distance" between a pair of "junction boxes", can be found by summing the squares of the differences in the x, y and z coordinates of the pair of "junction boxes".
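A minimal Python sketch of the squared "distance", using two coordinates from the example input. The function name `squared_distance` is hypothetical. Skipping the square root is safe for this purpose because squaring preserves the ordering of non-negative distances.

```python
def squared_distance(a, b):
    """Sum of the squared differences in x, y and z between two points."""
    return sum((p - q) ** 2 for p, q in zip(a, b))

first = (162, 817, 812)
last = (425, 690, 689)

# Differences are 263, 127 and 123, so 69169 + 16129 + 15129.
print(squared_distance(first, last))  # 100427
```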
Since the "distance" from "junction box" A to "junction box" B, is the same as from "junction box" B to "junction box" A, we only need to determine the "distance" for one direction (hence the use of streamstats with current=f). Start by creating an index for each of the previous "junction boxes" and determine the distance for each of the three dimensions.
| eval other=mvrange(0, mvcount(x))
| eval dx=mvmap(other, abs(tonumber(mvindex(x, other)) - xa))
| eval dy=mvmap(other, abs(tonumber(mvindex(y, other)) - ya))
| eval dz=mvmap(other, abs(tonumber(mvindex(z, other)) - za))
| fields - x y z xa ya za
Using the mvmap() function avoids the memory issues associated with the mvexpand command. By creating a multi-value field with indexes to the other multi-value fields in the event, we can use the mvmap() function to create multi-value fields based on one or more correlated multi-value fields.
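The same pattern (each "junction box" compared only with the ones before it, driven by a list of indexes) can be sketched in Python. This is an analogy rather than a translation of the SPL, and the variable names are illustrative; it uses the first four coordinates of the example input.

```python
boxes = [(162, 817, 812), (57, 618, 57), (906, 360, 560), (592, 479, 940)]

pairs = []
for index, (xa, ya, za) in enumerate(boxes):
    # Like streamstats current=f: only the boxes seen before this one.
    previous = boxes[:index]
    # Like mvmap() over mvrange(): one index drives the lookups.
    for other in range(len(previous)):
        xb, yb, zb = previous[other]
        d2 = (xb - xa) ** 2 + (yb - ya) ** 2 + (zb - za) ** 2
        pairs.append((d2, index, other))

print(len(pairs))  # 6, i.e. 4 * 3 / 2 unordered pairs
```

Each unordered pair appears exactly once, which is the point of looking only at previous events.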
Now, sum the squares of the differences in the x, y and z coordinates for each of the pairs of "junction boxes", that is, from each "junction box" to each of the previous "junction boxes".
| eval distance=mvmap(other, (tonumber(mvindex(dx, other)) * tonumber(mvindex(dx, other))) + (tonumber(mvindex(dy, other)) * tonumber(mvindex(dy, other))) + (tonumber(mvindex(dz, other)) * tonumber(mvindex(dz, other))))
| fields - dx dy dz
In order to find the "junction boxes" closest to each other, we need to find a way to sort all the "distances" calculated so far. If the "distances" were in separate events, we could easily use the sort command to sort them. Unfortunately, as previously mentioned, the mvexpand command can have issues with large numbers of events and large multi-value fields. While the sort command can sort numeric values, other methods sort lexicographically. The simplest way to convert a numeric value to one that sorts equivalently in lexicographical order is to add the next power of 10 above the maximum value in the set, e.g. 10 if the maximum value is less than 10; 100 if the maximum value is less than 100. To calculate the next higher power of 10 above the maximum value, we can take the floor of the log base 10 of the maximum value, add one to it and raise 10 to that power. (Actually, we could use any base, but using 10 makes it easier to see what is going on.)
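The padding trick is easy to check outside Splunk. The following Python sketch uses made-up distances (not taken from the puzzle) to show that a plain text sort gets the order wrong, and that adding the next power of 10 fixes it.

```python
import math

distances = [7, 950, 42, 1234, 316]  # hypothetical values

# Sorting the numbers as text gives the wrong order.
as_text = sorted(str(d) for d in distances)
print(as_text)  # ['1234', '316', '42', '7', '950']

# Next power of 10 above the maximum: floor(log10(max)) + 1 as the exponent.
higher_power = 10 ** (math.floor(math.log10(max(distances))) + 1)
print(higher_power)  # 10000

# Every padded value now has the same number of digits, so text order
# matches numeric order.
padded = sorted(str(higher_power + d) for d in distances)
recovered = [int(p) - higher_power for p in padded]
print(recovered)  # [7, 42, 316, 950, 1234]
```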
| eventstats max(distance) as max_distance
| eval higher_power=pow(10, floor(log(max_distance, 10)) + 1)
For each of the "distances" between "junction boxes", create a value made up of the "distance" plus the next power of 10 above the maximum distance, the index of the first "junction box" and the index of the second "junction box".
| eval combined=mvmap(other, printf("%d,%d,%d", higher_power + tonumber(mvindex(distance, other)), index, other))
Use the stats command to effectively sort the combined values (in lexicographical order).
| stats values(higher_power) as higher_power by combined
For Part One, we only need the 1000 closest pairs of "junction boxes".
| head 1000
Extract the "distance" and "junction box" indexes for each potential "connection".
| rex field=combined "(?<distance>\d+),(?<index_a>\d+),(?<index_b>\d+)"
| eval distance=distance-higher_power
Create a list (in a multi-value field) of the potential "connections".
| eval connection=index_a.",".index_b
| stats list(connection) as connections limit=0
Initialise an array for "circuits" and an index into the array of "connections".
| eval circuits=null()
| eval list=mvrange(0, mvcount(connections))
Rather than processing large multi-value fields, create one large string of the "connections", which Splunk can handle more easily.
| eval connections=mvjoin(connections,"|")
For each "connection", get the "junction box" indexes from either end of the "connection".
| foreach mode=multivalue list
[
| eval i=<<ITER>>,
connection=mvindex(split(connections, "|"), i),
start=mvindex(split(connection, ","), 0),
end=mvindex(split(connection, ","), 1),
Find out whether the start and end "junction boxes" are already present in a "circuit".
start_connection=mvfind(circuits,",".start.","),
end_connection=mvfind(circuits,",".end.","),
Depending on whether or not the start and end "junction boxes" of the "connection" are already present in a "circuit", we need to update the list of "circuits" accordingly.
| Start present in a "circuit" | End present in a "circuit" | Action |
| False | False | Create new "circuit" for this connection |
| False | True | Append start to the "circuit" with end in |
| True | False | Append end to the "circuit" with start in |
| True | True | If they are in the same "circuit", do not update the "circuit"; otherwise, update both "circuits" to include the "junction boxes" from both "circuits" |
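The four cases in the table can be sketched in Python as follows. This is a simplified illustration, not a translation of the SPL: it holds each "circuit" as a set of "junction box" indexes, and when two "circuits" merge it drops the absorbed one directly instead of writing the merged "circuit" into both positions and deduplicating afterwards. The function name `connect` is hypothetical.

```python
def connect(circuits, start, end):
    """Apply one connection to a list of circuits (each a set of box indexes)."""
    start_circuit = next((c for c in circuits if start in c), None)
    end_circuit = next((c for c in circuits if end in c), None)

    if start_circuit is None and end_circuit is None:
        # Neither box is known yet: create a new circuit.
        circuits.append({start, end})
    elif start_circuit is None:
        # Only the end is known: add the start to its circuit.
        end_circuit.add(start)
    elif end_circuit is None:
        # Only the start is known: add the end to its circuit.
        start_circuit.add(end)
    elif start_circuit is not end_circuit:
        # Both known, in different circuits: merge them into one.
        start_circuit |= end_circuit
        circuits[:] = [c for c in circuits if c is not end_circuit]
    # Both known and already in the same circuit: nothing to do.
    return circuits


circuits = []
for start, end in [(0, 1), (2, 3), (1, 4), (5, 3), (0, 4), (4, 2)]:
    connect(circuits, start, end)

print(circuits)  # [{0, 1, 2, 3, 4, 5}]
```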
Since the array of "circuits" is actually a multi-value field, in order to update it, it would be helpful to maintain the order of the "circuits" when we rebuild the multi-value field with the updated "circuits".
start_circuit=if(isnotnull(start_connection) and isnotnull(end_connection), if(tonumber(start_connection) < tonumber(end_connection), start_connection, end_connection), start_connection),
end_circuit=if(isnotnull(start_connection) and isnotnull(end_connection), if(tonumber(end_connection) > tonumber(start_connection), end_connection, start_connection), end_connection),
Create a new "circuit" as outlined above.
new_circuit=case(isnull(start_circuit) and isnull(end_circuit), ",".start.",".end.",", isnull(start_circuit) and isnotnull(end_circuit), mvindex(circuits, end_circuit).start.",", isnotnull(start_circuit) and isnull(end_circuit), mvindex(circuits, start_circuit).end.",", start_circuit = end_circuit, mvindex(circuits, start_circuit), true(), mvindex(circuits, start_circuit).substr(mvindex(circuits, end_circuit), 2)),
Deduplicate and sort the elements of the "circuit". Even though the values are numeric indexes of "junction boxes" in the "circuit", it does not matter that this is a lexicographical sort. What matters is that the "circuit" is built in a deterministic order, as this allows "circuits" to be deduplicated later.
new_circuit=",".mvjoin(mvsort(mvdedup(split(trim(new_circuit,","),","))),",").",",
Now, update the array of "circuits" with the new "circuit" in the appropriate place, depending on whether there were zero, one or two "circuits" updated.
new_circuits=case(isnull(start_circuit) and isnull(end_circuit), mvappend(circuits, new_circuit), isnull(start_circuit) and isnotnull(end_circuit), mvappend(if(end_circuit = 0, null(), mvindex(circuits, 0, end_circuit-1)), new_circuit, mvindex(circuits, end_circuit+1, -1)), isnotnull(start_circuit) and isnull(end_circuit), mvappend(if(start_circuit = 0, null(), mvindex(circuits, 0, start_circuit-1)),new_circuit, mvindex(circuits, start_circuit+1, -1)), start_circuit = end_circuit, circuits, true(), mvappend(if(start_circuit = 0, null(), mvindex(circuits, 0, start_circuit-1)), new_circuit, if(start_circuit+1 < end_circuit, mvindex(circuits, start_circuit+1, end_circuit-1), null()), new_circuit, mvindex(circuits, end_circuit+1, -1))),
Finally, deduplicate the array of "circuits". This keeps the size of the array to a minimum.
circuits = mvdedup(new_circuits)
]
Now, we only need the array of "circuits", which we can expand.
| fields circuits
| mvexpand circuits
Determine the number of "junction boxes" in each "circuit" and find the three largest.
| eval length=mvcount(split(trim(circuits,","),","))
| sort 3 -length
Convert back to a multi-value field.
| stats list(length) as lengths
Note the use of the list() aggregate function rather than values(); this prevents deduplication and lexicographical sorting of the values.
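To see why that distinction matters, here is a small Python sketch with made-up "circuit" sizes. The "values()-like" branch only imitates the deduplicate-and-sort-as-text behaviour described above; it is not Splunk's implementation.

```python
lengths = [10, 9, 9]  # hypothetical circuit sizes, largest first

# list()-like: keep every value, in arrival order.
product_from_list = 1
for length in lengths:
    product_from_list *= length

# values()-like: deduplicate, then sort as text.
values_like = sorted({str(length) for length in lengths})
product_from_values = 1
for length in values_like:
    product_from_values *= int(length)

print(product_from_list)    # 810
print(values_like)          # ['10', '9']
print(product_from_values)  # 90
```

Losing the duplicate 9 changes the product, which is why the duplicates must be kept.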
Now, simply multiply the lengths together.
| eval product=1
| foreach mode=multivalue lengths
[| eval product=product*<<ITEM>>]
The second part of the puzzle again requires creating "circuits", for your own dataset, by connecting the closest pairs of "junction boxes". This time, instead of stopping after 1000 connections, we keep going until we find the last pair of "junction boxes" that would create a single "circuit", and multiply their x coordinates together. Note that this may not necessarily be the last pair in the list, since by the time that last pair is processed, both "junction boxes" may already be in the "circuit".
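The idea can be cross-checked with a short Python sketch. Note that this swaps in a different technique from the SPL in this article: it tracks "circuits" with a union-find (disjoint set) structure, and it runs on four made-up points rather than puzzle data. The function name `last_connection` is hypothetical.

```python
from itertools import combinations


def last_connection(boxes):
    """Return the pair of boxes whose connection leaves a single circuit."""
    parent = list(range(len(boxes)))

    def find(i):
        # Follow parent links to the representative of i's circuit.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Every unordered pair, sorted by squared distance.
    pairs = sorted(
        (sum((p - q) ** 2 for p, q in zip(boxes[a], boxes[b])), a, b)
        for a, b in combinations(range(len(boxes)), 2)
    )

    remaining = len(boxes)  # number of separate circuits
    for _, a, b in pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_a] = root_b
            remaining -= 1
            if remaining == 1:
                return boxes[a], boxes[b]
    return None


boxes = [(1, 0, 0), (2, 0, 0), (4, 0, 0), (8, 0, 0)]  # hypothetical points
first, second = last_connection(boxes)
print(first[0] * second[0])  # 32
```

Here the connection between x=4 and x=8 completes the single "circuit", even though two longer pairs (2 to 8 and 1 to 8) come later in the sorted list.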
Starting in the same manner as Part One, only this time we also need to keep the x coordinate of the "junction boxes".
| rex max_match=0 "(?<junction_box>\d+,\d+,\d+)"
| fields - _raw
| mvexpand junction_box
| eval xa=mvindex(split(junction_box,","),0)
| eval ya=mvindex(split(junction_box,","),1)
| eval za=mvindex(split(junction_box,","),2)
| fields - junction_box
| streamstats list(xa) as x list(ya) as y list(za) as z count as index current=f limit=0
| where index > 0
| eval other=mvrange(0, mvcount(x))
| eval dx=mvmap(other, abs(tonumber(mvindex(x, other)) - xa))
| eval dy=mvmap(other, abs(tonumber(mvindex(y, other)) - ya))
| eval dz=mvmap(other, abs(tonumber(mvindex(z, other)) - za))
| fields - y z ya za
| eval distance=mvmap(other, (tonumber(mvindex(dx, other)) * tonumber(mvindex(dx, other))) + (tonumber(mvindex(dy, other)) * tonumber(mvindex(dy, other))) + (tonumber(mvindex(dz, other)) * tonumber(mvindex(dz, other))))
| fields - dx dy dz
| eventstats max(distance) as max_distance
| eval higher_power=pow(10, floor(log(max_distance, 10)) + 1)
| eval combined=mvmap(other, printf("%d,%d,%d,%d,%d", higher_power + tonumber(mvindex(distance, other)), index, xa, other, tonumber(mvindex(x, other))))
| stats values(higher_power) as higher_power by combined
Extract the index and x coordinate for each end of the "connections", and initialise the "circuit" array with all the known "connections". The "circuit" for each "junction box" is made up of the index of the "junction box" and the x coordinate separated by a dot (.). This makes it look like a decimal number, although almost any reasonable delimiter could have been used, since these values are never treated as numbers.
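A small Python sketch of how such "index.x" tokens can be searched. The leading comma and the escaped dot in the pattern are what stop index 1 from matching index 11, or an x coordinate from being mistaken for an index. The sample "circuits" and the function name `find_circuit` are made up for illustration.

```python
import re

# Hypothetical circuits: comma-wrapped lists of "index.x" tokens.
circuits = [",1.57,", ",11.819,", ",0.162,2.906,"]


def find_circuit(circuits, index):
    """Position of the first circuit containing the given box index, else None."""
    pattern = "," + str(index) + r"\."
    for position, circuit in enumerate(circuits):
        if re.search(pattern, circuit):
            return position
    return None


print(find_circuit(circuits, 1))    # 0
print(find_circuit(circuits, 11))   # 1
print(find_circuit(circuits, 2))    # 2
print(find_circuit(circuits, 162))  # None, since 162 is an x coordinate
```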
| rex field=combined "(?<distance>\d+),(?<index_a>\d+),(?<xa>\d+),(?<index_b>\d+),(?<xb>\d+)"
| eval connection=index_a.",".index_b
| eval circuits=mvappend(",".index_a.".".xa.",", ",".index_b.".".xb.",")
| stats list(connection) as connections values(circuits) as circuits limit=0
Convert the "circuits" array and "connection" array to pipe-delimited strings so that Splunk can handle them more easily. Also, create a list of indexes into the "connections" array.
| eval circuits=mvjoin(circuits,"|")
| eval list=mvrange(0, mvcount(connections))
| eval connections=mvjoin(connections,"|")
Initialise a variable to hold the last "connection" joined to the "circuit".
| eval last=null()
For each "connection", extract the start and end indexes.
| foreach mode=multivalue list
[
| eval i=<<ITER>>,
connection=mvindex(split(connections, "|"), i),
start=mvindex(split(connection, ","), 0),
end=mvindex(split(connection, ","), 1),
Similar to Part One, update the "circuits" with each "connection".
circuits=split(circuits, "|"),
start_connection=mvfind(circuits,",".start."\\."),
end_connection=mvfind(circuits,",".end."\\."),
start_circuit=if(tonumber(start_connection) < tonumber(end_connection), start_connection, end_connection),
end_circuit=if(tonumber(end_connection) > tonumber(start_connection), end_connection, start_connection),
new_circuit=if(start_circuit = end_circuit, mvindex(circuits, start_circuit), mvindex(circuits, start_circuit).substr(mvindex(circuits, end_circuit), 2)),
new_circuit=",".mvjoin(mvsort(mvdedup(split(trim(new_circuit,","),","))),",").",",
new_circuits=if(start_circuit = end_circuit, circuits, mvappend(if(start_circuit = 0, null(), mvindex(circuits, 0, start_circuit-1)), new_circuit, if(start_circuit+1 < end_circuit, mvindex(circuits, start_circuit+1, end_circuit-1), null()), new_circuit, mvindex(circuits, end_circuit+1, -1))),
Keep going until there is only one "circuit", recording the "connection" that completed it.
unique_circuits = mvdedup(new_circuits),
last=if(isnull(last) and mvcount(unique_circuits) = 1,start.",".end, last),
circuits=mvjoin(unique_circuits,"|"),
new_circuits=null(),
unique_circuits=null()
]
Finally, multiply the x coordinates of the last "connection" to join the "circuit".
| rex field=last "(?<start>\d+),(?<end>\d+)"
| eval circuits=split(trim(circuits, ","), ",")
| eval start_x=mvindex(split(mvindex(circuits, mvfind(circuits, "^".start."\\.")), "."), 1)
| eval end_x=mvindex(split(mvindex(circuits, mvfind(circuits, "^".end."\\.")), "."), 1)
| eval total=start_x*end_x
| table total
Handling large multi-value fields with a large number of events can lead to difficulties, so different strategies need to be adopted. For example, using an index multi-value field and the mvmap() function can allow processing without the need to use the mvexpand command, and using stats by a multi-value field can work effectively as a substitute for the mvexpand command (without the complication of breaching memory constraints).
Have questions or thoughts? Comment on this article or in Slack #puzzles channel. Whichever you prefer.