Let’s write a tiny program. At first glance, it seems unremarkable.
```python
from dataclasses import dataclass


@dataclass(slots=True)
class Book:
    title: str
    description: str
    cost: float


ParseBookException = Exception("parse book failed!")


def validate_book(book: Book) -> Book:
    if book.cost % 2 == 0:
        return book
    raise ParseBookException


def process_books() -> list[Book]:
    books = []
    for book in (
        Book(
            title="title",
            cost=i,
            description="lorem" * 10**6,
        )
        for i in range(2000)
    ):
        try:
            books.append(validate_book(book=book))
        except Exception:
            continue
    return books


def process_and_print_books():
    books = process_books()
    print(len(books))


def main():
    process_and_print_books()
    print("done")


if __name__ == "__main__":
    main()
```
We create 2,000 objects and filter out half of them. slots=True is there to reduce per-instance memory.
But it ate 9 gigabytes of memory:

```shell
pip install scalene
scalene --memory run.py
```
Strange. Maybe the surviving 1,000 books really do take up that much space? Let’s comment out the check:

```python
    # if book.cost % 2 == 0:
    #     return book
```

Now not a single book passes validation.
Um, what?
Scalene, where is the leak?
In the program!
```shell
pip install memray
memray run run.py
memray flamegraph ...
open ...
```
Memray, where is the leak?
Already told you, in the program!
Five trendy profiling libraries later, we arm ourselves with a fork:
```python
import tracemalloc

...

def process_and_print_books():
    books = process_books()
    print(len(books))

    snapshot = tracemalloc.take_snapshot()
    top_stats = snapshot.statistics('lineno')
    print("[ Top 10 ]")
    for stat in top_stats[:10]:
        print(stat)


def main():
    tracemalloc.start()
    process_and_print_books()
    ...
```
```
╰─➤ python run.py
0
[ Top 10 ]
run.py:27: size=9537 MiB, count=2000, average=4883 KiB
run.py:17: size=406 KiB, count=4000, average=104 B
run.py:32: size=110 KiB, count=2001, average=56 B
run.py:24: size=109 KiB, count=2000, average=56 B
run.py:23: size=54.5 KiB, count=1743, average=32 B
run.py:39: size=208 B, count=1, average=208 B
done
```
There are no books in the list, yet the allocations are there. Nearly 10 gigabytes’ worth, no less. Well, that’s something.
Let’s also add a little knife to the fork:

```shell
pip install objgraph
```

```python
def main():
    process_and_print_books()
    print("done")

    import objgraph
    objgraph.show_growth()
```
Let’s launch it.
```
╰─➤ python run.py
0
done
traceback                     4000     +4000
function                      2347     +2347
frame                         2002     +2002
Book                          2000     +2000
wrapper_descriptor            1200     +1200
dict                          1081     +1081
tuple                         1020     +1020
method_descriptor              819      +819
builtin_function_or_method     813      +813
ReferenceType                  778      +778
```
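As a cross-check, you can count live instances of a class without objgraph, using the stdlib gc module (a sketch with a stand-in class, not the dataclass from the program above):

```python
import gc


class Book:  # stand-in for the real Book class
    pass


kept = [Book() for _ in range(5)]

# gc.get_objects() returns every object the collector tracks,
# so reachable instances show up here even without objgraph.
count = sum(1 for obj in gc.get_objects() if type(obj).__name__ == "Book")
print(count >= 5)  # True
```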
Nothing makes sense yet, but it’s very intriguing. Who are all these things, and what are they doing here?

Okay, “things”, but what about the books? Where have the books gone?! There isn’t a single reference to them. Really..?
```python
def main():
    process_and_print_books()
    print("done")

    import random
    import objgraph
    objgraph.show_growth()
    objgraph.show_chain(
        objgraph.find_backref_chain(
            random.choice(objgraph.by_type('Book')),
            objgraph.is_proper_module,
        ),
        filename='chain.png',
    )
```
Eh, what? An object just hanging there with no refs?

Well then. Foolhardiness and courage: let’s draw them all.
```python
for number, obj in enumerate(objgraph.by_type('Book')):
    objgraph.show_chain(
        objgraph.find_backref_chain(
            obj,
            objgraph.is_proper_module,
        ),
        filename=f'chain{number}.png',
    )
```
Aha! Gotcha!
But the traceback? What does that kno… oh, damn…
```python
ParseBookException = Exception("parse book failed!")


def validate_book(book: Book) -> Book:
    if book.cost % 2 == 0:
        return book
    raise ParseBookException
```
Could a single pitiful global exception eat them all?
```python
def validate_book(book: Book) -> Book:
    # if book.cost % 2 == 0:
    #     return book
    raise Exception("parse book failed!")

...

def main():
    process_and_print_books()
    print("done")

    import objgraph
    objgraph.show_growth()
```
Or could it?
```
╰─➤ python run.py
0
done
function                      2347     +2347
wrapper_descriptor            1200     +1200
dict                          1081     +1081
tuple                         1023     +1023
method_descriptor              819      +819
builtin_function_or_method     813      +813
ReferenceType                  778      +778
getset_descriptor              455      +455
type                           413      +413
member_descriptor              361      +361
```
Scalene, could it?
And if we uncomment the cost % 2 lines? The expected 50% of memory, 5 gigabytes out of 10: only the 1,000 books that pass validation remain, and the failed ones are freed as they should be.
But why?

From the raise documentation: “A traceback object is normally created automatically when an exception is raised and attached to it as the __traceback__ attribute.”

Since our exception object is global, that reference is never cleared when validate_book exits. Worse, every re-raise of the same instance chains a fresh traceback onto the old one, and each chained traceback keeps its frame alive, together with the frame’s locals, including the book. The memory hangs.
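A minimal repro of the mechanism, detached from the book example (all names here are made up): re-raising one cached exception instance grows its traceback chain, and every chained frame pins that call’s locals.

```python
CACHED = Exception("boom")  # one global, reused instance


def fail(payload):
    # Each raise prepends new entries to CACHED.__traceback__,
    # keeping the old chain (and its frames) alive.
    raise CACHED


for i in range(3):
    try:
        fail(f"payload-{i}")
    except Exception:
        pass

# Walk the accumulated chain: every fail() frame is still reachable,
# and with it that call's local `payload`.
retained = set()
tb = CACHED.__traceback__
while tb is not None:
    value = tb.tb_frame.f_locals.get("payload")
    if value is not None:
        retained.add(value)
    tb = tb.tb_next
print(sorted(retained))  # ['payload-0', 'payload-1', 'payload-2']
```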
If you print another graph of objects, it becomes clearer. Watch closely:
In general, write Python as if it were Rust, and learn to understand the tools you use. Learn to poke around with a fork - trendy libraries won’t do everything for you.