Regex for multiples of an integer

This article provides a method to construct a regular expression that matches multiples of 3. Here we generalize it and give a method to construct a regular expression that matches any integer \(n\) in radix \(r\) such that \(n≡k\,mod\,m\).

The outline of this method is the same as the original one. First we construct a DFA, then we convert this DFA into a regular expression.

Construct a DFA with states \(0,1, \cdots ,m−1\).
Calculate transitions out of state \(s\) as follows:
1. Originally we are in state \(s\)
2. We read a digit \(d\)
3. Now we become \((s∗r+d)\)
4. Therefore, there should be a transition from \(s\) to \((s∗r+d)\) with symbol \(d\)
After all such transitions have been completed, we convert this DFA to a regular expression using state elimination.

The sample code is as follows:

#!/usr/bin/python2

'''Print a regular expression which matches numbers in radix `radix` such that, when modulo
`modulo`, give remainder `remainder`.'''

import re, sys

################################################################
#   Global configuration.
################################################################

# (...)_{radix} % {modulo} = {remainder}
radix = 10
modulo = 3
remainder = 0

if len(sys.argv) >= 2:
    radix = int(sys.argv[1])
if len(sys.argv) >= 3:
    modulo = int(sys.argv[2])
if len(sys.argv) >= 4:
    remainder = int(sys.argv[3])

################################################################
#   Don't modify code below!
################################################################

# Constants.
digit = ['0', '1', '2', '3', '4', '5', '6', '7', '8', '9', 'A', 'B', 'C', 'D', 'E', 'F']
value = range(radix)

# Regex of transitions.
transition = [[None for i in range(modulo)] for i in range(modulo)]

# State elimination function.
def eliminate(i):
    for p in range(modulo):
        if p == i or transition[p][i] is None:
            continue

        for n in range(modulo):
            if n == i or transition[i][n] is None:
                continue

            transition[p][n] = '({e_pi}{e_ii}{e_in}{e_orig})'.format(
                    e_pi = transition[p][i],
                    e_ii = '' if transition[i][i] is None else transition[i][i] + '*',
                    e_in = transition[i][n],
                    e_orig = '' if transition[p][n] is None else '|' + transition[p][n])

# Compute transitions.
for s in range(modulo):
    for e in range(radix):
        d = (s * radix + e) % modulo
        if transition[s][d] is None:
            transition[s][d] = digit[e]
        else:
            transition[s][d] += digit[e]

for s in range(modulo):
    for d in range(modulo):
        transition[s][d] = '[' + transition[s][d] + ']'

# Eliminate states.
start = 0
accept = remainder
result = ''

if start == remainder:
    for i in range(1, modulo):
        eliminate(i)
    r = transition[start][start]
    result = '' if r is None else '({R}*)'.format(R = r)
else:
    for i in range(1, modulo):
        if i == accept:
            continue

        eliminate(i)

    r = transition[start][start]
    s = transition[start][accept]
    t = transition[accept][start]
    u = transition[accept][accept]

    if s is None:
        result = ''
    elif t is None:
        result = '({R}{S}{U})'.format(
                R = '' if r is None else r + '*',
                S = s,
                U = '' if u is None else u + '*')
    else:
        result = '(({R}{S}{U}{T})*{S}{U})'.format(
                R = '' if r is None else r + '|',
                S = s,
                T = t,
                U = '' if u is None else u + '*')

print result