This challenge was first posted on Slack #puzzles channel
This puzzle is based on a letter grid containing tangled words.
The diagram was created in draw.io which can use XML documents to export and import diagrams.
The challenge is to process the XML document using SPL to find the tangled words.
This article contains spoilers!
In fact, the whole article is a spoiler as it contains solutions to the puzzle. If you are trying to solve the puzzle yourself and just want some pointers to get you started, stop reading when you have enough, and return if you get stuck again, or just want to compare your solution to mine!
There are two main types of nodes in the XML documents. Some example nodes look like this:
<mxCell id="f1ab-4f63" edge="1" parent="1" source="f9ed-7857" style="edgeStyle=orthogonalEdgeStyle;rounded=0;orthogonalLoop=1;jettySize=auto;html=1;fillColor=#88B8B8;strokeColor=#88B8B8;startArrow=none;startFill=0;endArrow=classic;endFill=1;" target="813f-6d2d" value="">
<mxGeometry relative="1" as="geometry" />
</mxCell>
<mxCell id="f69a-54d7" edge="1" parent="1" source="813f-6d2d" style="edgeStyle=orthogonalEdgeStyle;rounded=0;orthogonalLoop=1;jettySize=auto;html=1;fillColor=#88B8B8;strokeColor=#88B8B8;startArrow=none;startFill=0;endArrow=classic;endFill=1;" target="bdcf-6cdb" value="">
<mxGeometry relative="1" as="geometry" />
</mxCell>
<mxCell id="04b0-70ca" parent="1" style="whiteSpace=wrap;html=1;aspect=fixed;fontSize=30;" value="P" vertex="1">
<mxGeometry height="80" width="80" x="640" y="440" as="geometry" />
</mxCell>
<mxCell id="0cd5-73fa" parent="1" style="whiteSpace=wrap;html=1;aspect=fixed;fontSize=30;" value="S" vertex="1">
<mxGeometry height="80" width="80" x="720" y="440" as="geometry" />
</mxCell>
<mxCell id="2151-6ffe" parent="1" style="whiteSpace=wrap;html=1;aspect=fixed;fontSize=30;" value="O" vertex="1">
<mxGeometry height="80" width="80" x="720" y="360" as="geometry" />
</mxCell>The full document can be found here.
Taking the full document as a single event, the first thing to do is to split it into separate mxCell events:
``` Use makeresults to create a copy of the whole XML document in _raw ```
``` Convert single line mxCell nodes to have closing tags ```
| rex mode=sed max_match=0 "s/(?P<mxCell>\<mxCell )(?P<attr>[^\/\>]+)\/\>/\1\2><\/mxCell>/g"
``` Extract all the mxCell nodes into separate events ```
| rex max_match=0 "(?ms)(?<mxCell>\<mxCell .+?\<\/mxCell\>)"
| fields - _raw
| mvexpand mxCellExtract the cell id attribute and drop the header nodes (ids 0 and 1)
| spath input=mxCell output=id path=mxCell{@id}
| where len(id) > 1Extract the source, target and value attributes and drop the mxCell field
| spath input=mxCell output=source path=mxCell{@source}
| spath input=mxCell output=target path=mxCell{@target}
| spath input=mxCell output=value path=mxCell{@value}
| fields - mxCellThis gives us a set of events where some events have source and target ids and some events have values associated with ids.
Using a technique I like to call "non-lookup-self-lookup", we need to find the values associated with the source and target ids of each "connection" event. (We could have saved the second part of the data to a lookup file and used the lookup command to retrieve the values, but this potentially requires additional permissions and capabilities.)
The heart of this technique is to triplicate the events so that the three ids can be stored in the same field for the three different events.
| eval row=mvrange(0,3)
| mvexpand row
| eval name=case(row=0, id, row=1, source, row=2, target)Now we can use eventstats to gather the list of values associated with the source and target names
| eventstats list(eval(if(value != "",value, null()))) as letters by nameWe can use eventstats again to gather the list of letters associated with each connection, drop the redundant rows, and create a 2-letter word fragment
| eventstats list(letters) as letters by id
| where row=0
| eval letters=if(mvcount(letters)>1,mvjoin(letters,""),null())We use stats to create lists of word fragments and their associated source and target. We use list() rather than values() to preserve order and keep any duplicates.
| stats list(letters) as word list(source) as source list(target) as targetNext, we want to process each word fragment in turn and determine if it can be bound to another word fragment. We do this by first creating a multivalue field with indexes for all the word fragments.
| eval binds=mvrange(0,mvcount(source))Then, for each word fragment, we want to find another word fragment which has a target which is equal to the current source. This indicates that the two word fragments can be bound together. The next step is a little involved, so I will go through it step-by-step.
| foreach mode=multivalue binds
[
| eval joining=mvfind(target,mvindex(source,<<ITEM>>)),Note the comma at the end of the line. This allows us to extend the eval command with more assignments without breaking the restriction of only one command in the foreach subsearch when in multivalue mode.
Create a joined word (fragment) from the found word fragment and the current word fragment (minus its initial letter).
joined=if(isnotnull(joining),mvjoin(mvappend(split(mvindex(word,joining),""),mvindex(split(mvindex(word,<<ITEM>>),""),1,-1)),""),null()),Create a new word list, replacing the current word fragment with an empty string
new_word=if(isnotnull(joining),mvappend(if(<<ITEM>>=0,null(),mvindex(word,0,<<ITEM>>-1)),"",mvindex(word,<<ITEM>>+1,-1)),new_word),Update the word list, replacing the found word fragment with the new word fragment
word=if(isnotnull(joining),mvappend(if(joining=0,null(),mvindex(new_word,0,joining-1)),joined,mvindex(new_word,joining+1,-1)),word),Update the new source list, replacing the current source with an empty string
new_source=if(isnotnull(joining),mvappend(if(<<ITEM>>=0,null(),mvindex(source,0,<<ITEM>>-1)),"",mvindex(source,<<ITEM>>+1,-1)),source),
source=new_source,Create a new target list, replacing the found target with the current target
new_target=if(isnotnull(joining),mvappend(if(joining=0,null(),mvindex(target,0,joining-1)),mvindex(target,<<ITEM>>),mvindex(target,joining+1,-1)),target),Update the target list, replacing the current target with an empty string
target=if(isnotnull(joining),mvappend(if(<<ITEM>>=0,null(),mvindex(new_target,0,<<ITEM>>-1)),"",mvindex(new_target,<<ITEM>>+1,-1)),new_target)
]Remove the empty strings
| mvexpand word
| where word!=""
| table wordLooking at the resulting list of words, something is not quite right. There are a couple of words which make sense (SPLUNK, BORE, XML), but others do not (OCSIC is CISCO spelt backwards!). So, what is going on?
Perhaps there are some clues in the diagram. If you save the XML to a file and open it in draw.io, you can drag the letter tiles around and follow the arrows to see what the words should be.
Looking at the XML more closely, you can see that some of the connection mxCells have slightly different style attributes to the others. For example, some have style attributes including:
startArrow=none;startFill=0;endArrow=classic;endFill=1;Whereas others have style attributes including:
startArrow=classic;startFill=1;endArrow=none;endFill=0;In the first instance, the style for the connection indicates that the arrowhead is closer to the target, and in the second instance, the style for the connection indicates that the arrowhead is closer to the source. This means that values for source and target should be switched in the second instance.
Another little nugget you may have noticed from a closer look at the style attributes (and indeed the draw.io diagram) is that some of the connections have different stroke and fill colours. These can actually be used to group the connections into word groups.
| stats list(letters) as word list(source) as source list(target) as target by strokeColorI will leave you to work out how to extract and apply this information to your solution to get the correct list of words.
As a final “Easter egg”, you may or may not have noticed that the ids for the letter mxCell nodes contain the ASCII code for the lowercase version of the corresponding letter, e.g. “0cd5-73fa” contains “73” which is the ASCII code for lowercase “s”, and “2151-6ffe” contains “6f” which is the ASCII code for lowercase “o”. If you had worked this out, you could have perhaps avoided the non-lookup-self-lookup step!
The techniques outlined in this solution, non-lookup-self-lookups and index-based processing loops, can be applied in multiple usecases and may provide solutions to some more complex scenarios.
Have questions or thoughts? Comment on this article or in Slack #puzzles channel. Whichever you prefer.
You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.