Community Blog
Get the latest updates on the Splunk Community, including member experiences, product education, events, and more!

Fun with Regular Expression - multiples of nine

ITWhisperer
SplunkTrust
SplunkTrust

Fun with Regular Expression - multiples of nine

This challenge was first posted on Slack #regex channel https://splunkcommunity.slack.com/archives/C3WFE5V5G/p1759522087595439

For BORE at .conf24, we had a puzzle question which was to find integers which were multiples of 3. Rather than providing spoilers (in case we run BORE again and allow previous questions to be answered), I have devised another puzzle on similar lines. Find the integers which are multiple of 9, just using a single regular expression (rex command). Using the following SPL to generate some numbers, complete the regex to return matches (where the match field is equal to the number field) when modulus-9 is 0 (zero).

| makeresults count=20
| eval number=(9*(random()%1000000))+((random()%4)*(random()%4))
| eval modulus=number%9
| rex field=number "(?<match>)"
| table number modulus match

#justforfun

This article contains spoilers!

Some school-level mathematics

Starting with some school-level maths, we were taught that, for numbers that are multiples of nine (9), if you sum the digits of the number, and keep summing the digits of the result, you will eventually get a sum of nine. This is called the digital root. There is probably a mathematical proof for this but a simple examination of some examples will show this to appear to be true. For example:

Number

Divide by 9

Sum digits

18

2

9

36

4

9

45

5

9

240642

26738

18

7459506

828834

36

123456789

13717421

45

 

In more general terms, if you take the digital root of a number, extending the number by adding a 9 or a group of digits with a digit root of 9 anywhere in the sequence of digits, the resulting number has the same digit root as the original number. Similarly, if you remove a sequence of digits with a digital root of 9. The digits don’t even have to be inserted or removed sequentially! For example:

Number

Digital Root

Extended

New digital root

42

6 (4+2)

429

6 (4+2+9=15, 1+5=6)

7506

9 (7+5+0+6=18, 1+8=0)

7459506

9 (7+4+5+9+5+0+6=36, 3+6=9)

123456789

9 (see above)

13689

9 (1+3+6+8+9=27, 2+7=9)

Patterns in the numbers

So much for the maths; unfortunately, regular expressions don’t really do maths, they match patterns, so how can we solve this puzzle using patterns?

Starting with the simplest patterns for digital roots of 9, we have only two single digit numbers which are multiple of 9, 9 and 0 (zero). In regex terms, we can define a subroutine (called “nine”) to match to digital root 9:

(?’nine’9|0)

This can be extended to include the eight 2-digit numbers, 18, 27, 36, 45, 54, 63, 72 and 81.

(?’nine’9|0|18|27|36|45|54|63|72|81)

This can be demonstrated in a Splunk search like so:

| makeresults format=csv data="number
0
9
18
27
36
45
54
63
72
81"
| rex field=number "(?<match>(?'nine'9|0|18|27|36|45|54|63|72|81))"
| table number match

In regex101.com, this would look like this https://regex101.com/r/zJyXu7/1

As already discussed, inserting digits with a digit root of nine doesn’t affect the digital root; for example, 108, 297, 3456, 4572 and 8271 all have digital roots of nine. In regex terms, they all match to:

(?'nine'9|0|1(?&nine)*8|2(?&nine)*7|3(?&nine)*6|4(?&nine)*5|5(?&nine)*4|6(?&nine)*3|7(?&nine)*2|8(?&nine)*1)+

Note that word wrapping has to be removed when used in SPL (and regex101 https://regex101.com/r/zJyXu7/2)

Breaking this expression down, it means the following:

(?'nine'

Start a subroutine definition

9|

Alternative 9

0|

Alternative 0

1(?&nine)*8|

Alternative 1, followed by zero or more digital root 9 numbers (recursive call to ‘nine’), followed by 8

2(?&nine)*7|

Alternative 2, followed by zero or more digital root 9 numbers (recursive call to ‘nine’), followed by 7

3(?&nine)*6|

Alternative 3, followed by zero or more digital root 9 numbers (recursive call to ‘nine’), followed by 6

4(?&nine)*5|

Alternative 4, followed by zero or more digital root 9 numbers (recursive call to ‘nine’), followed by 5

5(?&nine)*4|

Alternative 5, followed by zero or more digital root 9 numbers (recursive call to ‘nine’), followed by 4

6(?&nine)*3|

Alternative 6, followed by zero or more digital root 9 numbers (recursive call to ‘nine’), followed by 3

7(?&nine)*2|

Alternative 7, followed by zero or more digital root 9 numbers (recursive call to ‘nine’), followed by 2

8(?&nine)*1

Alternative 8, followed by zero or more digital root 9 numbers (recursive call to ‘nine’), followed by1

)+

End of subroutine definition, with subroutine executed at least once

Cracked it? Well, not quite. Remember that the digits could be inserted anywhere, not necessarily together, so, for example, 12348765 is a multiple of 9 which does not match the current regular expression. https://regex101.com/r/zJyXu7/3

Generalised solution

In general terms, the 2-digit alternatives should be a literal digit followed by digits with the correct digital root. (Note that a literal digit is required first, so that at least one character is consumed by the subroutine, which avoids problems such as infinite recursion.) For example:

(?'nine'
    9|
    8(?&one)|
    7(?&two)|
    6(?&three)|
    5(?&four)|
    4(?&five)|
    3(?&six)|
    2(?&seven)|
    1(?&eight)|
    0
)+

Now, these other digital root subroutines need to be defined. In regex, subroutines can be called before they are defined, so long as they are defined somewhere in the regular expression. So, expanding just the first couple would look like this:

(?'nine'
    9|
    8(?'one'(?&nine)*(
        8(?&two)|
        7(?&three)|
        6(?&four)|
        5(?&five)|
        4(?&six)|
        3(?&seven)|
        2(?&eight)|
        1))|
    7(?'two'(?&nine)*(
        8(?&three)|
        7(?&four)|
        6(?&five)|
        5(?&six)|
        4(?&seven)|
        3(?&eight)|
        2|
        1(?&one)))|

Note that each of the inner routines start with a recursive call to the outer ‘nine’ routine. I will leave the others for you to complete!

You should now have a regular expression which will match to (positive) integers which are multiples of 9. Try it out in Splunk as shown in the original problem definition.

Further challenge / stretch goals

How would you generalise the expression to handle negative integers,  lists of integers, thousands separators, but ignore date and time formats?

Tags (1)
Get Updates on the Splunk Community!

Observe and Secure All Apps with Splunk

  Join Us for Our Next Tech Talk: Observe and Secure All Apps with SplunkAs organizations continue to innovate ...

Splunk Decoded: Business Transactions vs Business IQ

It’s the morning of Black Friday, and your e-commerce site is handling 10x normal traffic. Orders are flowing, ...

Fastest way to demo Observability

I’ve been having a lot of fun learning about Kubernetes and Observability. I set myself an interesting ...