Advent of Code: Day 5

Link to problem: https://adventofcode.com/2024/day/5

When I first looked at this, my first instinct was to treat it like a partial order, and make the DAG to represent it. But then I thought on it a bit more-- The way the question is stated doesn't actually make the input a partial ordering, it doesn't have to be transitive-- so I wasn't sure whether it'd be possible to properly represent this as a DAG. I decided not to think too hard, and just use a simple approach that I knew would work. I'll probably implement it as a DAG later on though, when I'm doing this in another language.

I decided to read the "rules" into a dict -- for each number on the left, I'd keep a set of numbers that appear in its right in the rules -- i.e. a set of numbers that should never precede it.

Let's call N the number of rules, M the number of updates, and L the length of the longest update list. Then the maximum number of keys and values in the rules dict would be N. 

rules = dict()
with open("rules.txt") as file:
for line in file:
line = line.strip().split("|")
l = int(line[0])
r = int(line[1])
if l not in rules:
rules[l] = set()
rules[l].add(r)

That's O(N), in the average case... maybe O(N^2) if the hashing lords hate us. 

Then, I implemented a check function that checks whether any of the numbers preceding it in a list break its rules.

def check(line: list[int], index: int) -> bool:
page = line[index]
return page not in rules or set(line[:index]).isdisjoint(rules[page])

From my understanding, this is O(min(len(rules[page]), index)) -- let's call it O(N + L)

Then I have another function to check each line. I tend to put everything in a function if you haven't noticed. I don't like writing a lot of code inside for loops, or nesting loops at all, very much.

def checkLine(line: list[int]) -> bool:
for i in range(len(line)):
if not check(line, i):
return False
return True

This is O(L(N + L)) = O(L^2 +LN)

That's everything important for Part 1.

For Part 2, I figured I could just use Python's sort function with my own key... then I realized that I didn't have the right idea of how keys work. Turns out other people ran into the same issue a long time ago and wrote me a handy function to solve it. I wrote a comparison function for sorting:

def comp(x, y) -> int:
if x in rules.get(y):
return 1
if y in rules.get(x):
return -1
return 0

This is O(N)

Now, importing that handy function + actually iterating through the updates and checking them:

from functools import cmp_to_key
sum_1 = 0
sum_2 = 0
with open("updates.txt") as file:
for line in file:
line = line.strip().split(',')
line = [int(item) for item in line]
if not checkLine(line):
line.sort(key=cmp_to_key(comp))
sum_2 += line[len(line) // 2]
continue
sum_1 += line[len(line) // 2]
print(sum_1)
print(sum_2)

Since checkLine is O(L^2 + LN) and sort is going to be O(N • Log(N)) average case, and O(N^2 • Log(N)) worst case since sort itself is O(N • Log(N)) and comp is O(1) on average and O(N) worst case, the overall loop works out to be O(M (L^2 + LN + N • Log(N)) in the average case. Is that bad? I'll find out once I implement the DAG approach I guess. 







Comments