Josiah Winslow solves Advent of Code

Mull It Over

Published: 2026-04-11 Original Prompt

Part 1

We’re looking through a corrupted computer program for strings that look like instructions. This is a perfect use case for regular expressions!

A regular expression, or “regex”, can be used to search a string for substrings that match a given pattern. In this case, that pattern is “looks like a mul() instruction” — things like mul(44,46) or mul(123,4). We’ll also want to save the number parts, so we can multiply them together; the way we do that in regex is with “capturing groups”.

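Here are all the parts of a mul() instruction, along with how they are represented in a regex:

  1. The literal text mul(, written as mul\( (the parenthesis is escaped because ( has a special meaning in regex).

  2. A number of one to three digits, saved in a capturing group: (\d{1,3}).

  3. A comma, written as is: ,

  4. A second number of one to three digits, also saved in a capturing group: (\d{1,3}).

  5. The closing parenthesis, again escaped: \).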

The resulting regex is mul\((\d{1,3}),(\d{1,3})\). If you want, you can test out this regex on regex101 and see exactly how it works; you’ll notice that every mul() instruction is matched, and nothing else.
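To see the capturing groups at work, here is a small sketch that runs the regex over a made-up string. The sample text and the names MUL_PATTERN, sample, pairs, and total are assumptions for illustration, not part of the puzzle input or the solution. Because the pattern has two capturing groups, re.findall returns one tuple of captured strings per match.

```python
import re

MUL_PATTERN = r"mul\((\d{1,3}),(\d{1,3})\)"

# Hypothetical corrupted text: two valid instructions and two near-misses
# (square brackets instead of parentheses, and a four-digit number).
sample = "xmul(2,4)%mul[3,7]mul(1234,5)mul(11,8)"

# One (first number, second number) tuple per match, as strings.
pairs = re.findall(MUL_PATTERN, sample)
print(pairs)  # [('2', '4'), ('11', '8')]

total = sum(int(a) * int(b) for a, b in pairs)
print(total)  # 96
```

Note that mul(1234,5) is rejected outright: after at most three digits the regex needs a comma, and it finds a fourth digit instead.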

Once we find all the mul() instructions, we’ll want to multiply the two numbers in each instruction, and add up all the products. This becomes surprisingly simple with our regex in hand; we can just use re.findall to find all matches of our regex, and sum up the results of multiplying the numbers.

2024\day03\solution.py
import re


class Solution(TextSolution):
    def part_1(self) -> int:
        return sum(
            int(a) * int(b)
            for a, b in re.findall(r"mul\((\d{1,3}),(\d{1,3})\)", self.input)
        )

Very manageable, especially if you’re already familiar with regex.

Part 2

Turns out there are some more instructions we should be looking for in the corrupted program: do() and don't(). They don’t change what counts as a mul() instruction, only which mul() instructions get included in our total, so let’s factor out our mul()-totaling code into a function.

2024\day03\solution.py
import re


def get_mul_total(program: str) -> int:
    """
    Get the sum of the `mul()` commands in a program. `do()` and
    `don't()` commands are ignored; only the `mul()` commands are done.
    """
    return sum(
        int(a) * int(b)
        for a, b in re.findall(r"mul\((\d{1,3}),(\d{1,3})\)", program)
    )


class Solution(TextSolution):
    def part_1(self) -> int:
        return get_mul_total(self.input)

What we have to do now is distinguish which segments of the program to do and which to skip, and calculate our answer for only the segments we want to do. Believe it or not, we can also do this using regular expressions! (See footnote 1.)

This time, we’ll want the regex to match everything between a do() instruction and the next don't() instruction. We’ll also want to make sure the very start of the string is treated as a do() and the very end of the string is treated as a don't(), so we don’t miss instructions near the start and end. The resulting regex will be a bit complex, so let’s break it down into parts:

  1. The start of the string, or a do() instruction.

    The way to match the start of the string is ^, and the way to match one of several possible patterns is by putting | in between them. So this part of the pattern would be ^|do\(\).

  2. Whatever’s in between.

    The usual way to match zero or more characters is with .*. However, * is a greedy quantifier — it matches as much text as possible — which is not the behavior we want. Instead, we can use the lazy quantifier *?, which does the same as *, except it matches as little text as possible. So this part of the pattern would be .*?.

  3. A don't() instruction, or the end of the string.

    The way to match the end of the string is $. So this part of the pattern would be don't\(\)|$.
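The difference between the greedy and lazy quantifiers from step 2 is easiest to see side by side. This is a minimal sketch on a made-up string; the names text, greedy, and lazy are hypothetical and not from the solution.

```python
import re

# Hypothetical input with two do()...don't() stretches.
text = "do()AAdon't()BBdo()CCdon't()"

# Greedy: .* runs to the LAST don't(), swallowing everything in between.
greedy = re.findall(r"do\(\)(.*)don't\(\)", text)
print(greedy)  # ["AAdon't()BBdo()CC"]

# Lazy: .*? stops at the FIRST don't() it can reach.
lazy = re.findall(r"do\(\)(.*?)don't\(\)", text)
print(lazy)  # ['AA', 'CC']
```

With the greedy version, the disabled stretch BB ends up inside the match, which is exactly what we want to avoid.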

Putting it all together, we have the three parts of our regex (written here in “verbose” form, where whitespace and comments are ignored):

(?:^|do\(\))     # start of input, or do()
(.*?)            # whatever's in between (lazily)
(?:don't\(\)|$)  # don't(), or end of input

I’ve placed the second part in a capturing group, written (...), because we need its contents. I’ve placed the first and third parts in non-capturing groups, written (?:...), because we need them only to keep each pair of alternatives together, not to save their contents. If you try out this regex on regex101, you’ll see that only the segments of the input that we should “do” are matched.
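As a quick check of what the capturing group picks up, here is a sketch that applies the regex to a short made-up program. The sample string and the names SEGMENT_PATTERN, sample, and segments are assumptions for illustration; the flags are the same two the solution passes to re.findall.

```python
import re

SEGMENT_PATTERN = re.compile(
    r"""
    (?:^|do\(\))     # start of input, or do()
    (.*?)            # whatever's in between (lazily)
    (?:don't\(\)|$)  # don't(), or end of input
    """,
    flags=re.DOTALL | re.VERBOSE,
)

# Hypothetical program: enabled, then disabled, then enabled again.
sample = "mul(1,2)don't()mul(3,4)do()mul(5,6)"

# With a single capturing group, findall returns just that group's text.
segments = SEGMENT_PATTERN.findall(sample)
print(segments)  # ['mul(1,2)', 'mul(5,6)']
```

The stretch between don't() and the following do() never appears in the results, because a match can only begin at the start of the input or at a do().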

The hard part of writing the regex is over, and now we can simply sum over all matches.

2024\day03\solution.py
...
class Solution(TextSolution):
    ...
    def part_2(self) -> int:
        segments_to_do = re.findall(
            r"""
            (?:^|do\(\))     # start of input, or do()
            (.*?)            # whatever's in between (lazily)
            (?:don't\(\)|$)  # don't(), or end of input
            """,
            self.input,
            flags=re.DOTALL | re.VERBOSE,
        )
        return sum(get_mul_total(segment) for segment in segments_to_do)

Attention

When writing longer/more complex regexes, it’s often helpful to write them in “verbose” form, where whitespace and comments are ignored, as I did above. But keep in mind that this requires you to pass in the re.VERBOSE flag.

I also pass the re.DOTALL flag to make the . character match any character at all, including a newline; without it, . would match any character except a newline.
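Here is a small sketch of what each flag changes, using made-up inputs. All variable names are hypothetical, and the patterns are simplified stand-ins rather than the solution’s regex.

```python
import re

# Hypothetical input whose enabled stretch spans two lines.
text = "do()line one\nline two don't()"

# Without re.DOTALL, . cannot cross the newline, so nothing matches.
without_dotall = re.findall(r"do\(\)(.*?)don't\(\)", text)
print(without_dotall)  # []

# With re.DOTALL, . matches the newline too.
with_dotall = re.findall(r"do\(\)(.*?)don't\(\)", text, flags=re.DOTALL)
print(with_dotall)  # ['line one\nline two ']

# re.VERBOSE lets the same pattern be spread out and commented.
compact = re.compile(r"(\d+)-(\d+)")
verbose = re.compile(
    r"""
    (\d+)  # first number
    -      # literal hyphen
    (\d+)  # second number
    """,
    flags=re.VERBOSE,
)
print(compact.findall("10-20") == verbose.findall("10-20"))  # True
```

One thing to watch for in verbose form: a space you actually want to match has to be escaped or placed in a character class, since unescaped whitespace in the pattern is ignored.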

If there were a Part 3 to this puzzle, I would definitely switch to a non-regex approach. But I thought this was a good opportunity to demonstrate the power of regular expressions.
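For a rough idea of what a non-regex approach could look like, here is a minimal sketch of a hand-written scanner that walks the program once and tracks whether mul() instructions are currently enabled. This is only one possible shape, not code from the solution; the names read_number, read_mul, and scan_total are hypothetical.

```python
from typing import Optional, Tuple

DIGITS = "0123456789"


def read_number(text: str, pos: int) -> Optional[Tuple[int, int]]:
    """Read a 1-3 digit number at pos; return (value, next position) or None."""
    end = pos
    while end < len(text) and end - pos < 3 and text[end] in DIGITS:
        end += 1
    if end == pos:
        return None
    return int(text[pos:end]), end


def read_mul(text: str, pos: int) -> Optional[Tuple[int, int]]:
    """Read mul(X,Y) at pos; return (X * Y, next position) or None."""
    if not text.startswith("mul(", pos):
        return None
    first = read_number(text, pos + 4)
    if first is None:
        return None
    a, pos = first
    if not text.startswith(",", pos):
        return None
    second = read_number(text, pos + 1)
    if second is None:
        return None
    b, pos = second
    if not text.startswith(")", pos):
        return None
    return a * b, pos + 1


def scan_total(program: str) -> int:
    """Sum the enabled mul() products in a single left-to-right pass."""
    total = 0
    enabled = True  # the start of the program counts as a do()
    pos = 0
    while pos < len(program):
        if program.startswith("do()", pos):
            enabled, pos = True, pos + 4
        elif program.startswith("don't()", pos):
            enabled, pos = False, pos + 7
        elif (result := read_mul(program, pos)) is not None:
            product, pos = result
            if enabled:
                total += product
        else:
            pos += 1  # not the start of any instruction; skip one character
    return total


print(scan_total("mul(2,3)don't()mul(4,5)do()mul(6,7)"))  # 48
```

The appeal of this shape is that adding a new instruction means adding one more branch to the loop, rather than reworking a regex.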

Footnotes

  1. Normally, by using regexes for this kind of thing, we’d be making a huge mistake.

    In a real-world context, what we should be using is some sort of tokenizer and parser; I would recommend the book Crafting Interpreters by Robert Nystrom if you’re interested in how a more complex real-world programming language is interpreted.

    But for this ultra-simplified example, using regexes is good enough.