Improving Python's Memory Allocator.notes
Saturday, March 26, 2005
TITLE OF PAPER: Improving Python's Memory Allocator
URL OF PRESENTATION: _URL_of_powerpoint_presentation_
PRESENTED BY: Evan Jones (ejones@uwaterloo.ca)
REPRESENTING: University of Waterloo
CONFERENCE: PyCon 2005
DATE: March 25, 2005
LOCATION: Marvin Center
--------------------------------------------------------------------------
REAL-TIME NOTES / ANNOTATIONS OF THE PAPER:
{If you've contributed, add your name, e-mail & URL at the bottom}
Outline
The problem
Inner workings of the memory allocator
Solutions
Future
Finding the problem
For an unrelated school project, he had a Python program which generated the
simulation scenario. He would pass that to the simulator (not written in
Python). Then the Python, which had been blocking all this time, resumed and
analyzed the data.
The trouble is that Python would never let go of any RAM it used for small
objects, so it would starve the simulator.
Workarounds
If you have to do something memory intensive, fork and kill the computing process
when you're done.
Store temp results in the filesystem.
But these are kludgy.
Current allocator
Pycalloc: same since 2.3
It mallocs memory in 256kB chunks ("arenas") which it carves up later as
needed; everything goes on the heap.
It's used only for objects <= 256b in size.
Pool (4kB) | Header | Pool (4kB) | Arena (256kB)
| Padding |
| Block |
| Block |
| Block |
| [something] |
Python does reuse memory, but it never returns it to the OS.
This happens because we don't keep track of what
arenas the blocks are part of, so we can't detect when all the blocks of
an arena are free and therefore can't free the arena.
<read Modules/obmalloc.c for more pretty graphics from tim, and more
discussion of how the allocator works>
Evaluation
This solution will fail if your arenas get fragmented, of course. To solve this,
you'd have to move to a compacting GC like the JVM or .NET CLR. This would
break all the current C extensions (and be really slow).
(Bad allocation/deallocation patterns will kill you in this system...however, it
won't be any worse than the current system, it just won't be better.)
Q&A
Q: Can this new allocator be optional or a command-line option?
A: Maybe, but there's no need to do so.
Q: Can we group little allocations into arenas by what objects the little
allocations are a part of, since they'll probably be disposed together?
A: Remember that the current bad allocator is used only for small objects. Large
ones do get freed eventually, and user-defined objects are large objects. (Is
this true?)
Q: Can the allocator be tuned to match the paging size that the system is using?
A: If your OS uses 4K pages, it will do the right thing.
Q: Will this be in 2.5?
A: Uncle Tim says "almost certainly."
--------------------------------------------------------------------------
REFERENCES: {as documents / sites are referenced add them below}
http://evanjones.ca/
--------------------------------------------------------------------------
QUOTES:
--------------------------------------------------------------------------
CONTRIBUTORS: {add your name, e-mail address and URL below}
Erik Rose <corp@grinchcentral.com>
--------------------------------------------------------------------------
E-MAIL BOUNCEBACK: {add your e-mail address separated by commas }
--------------------------------------------------------------------------
NOTES ON / KEY TO THIS TEMPLATE:
A headline (like a field in a database) will be CAPITALISED
This differentiates from the text that follows
A variable that you can change will be surrounded by _underscores_
Spaces in variables are also replaced with under_scores
This allows people to select the whole variable with a simple double-click
A tool-tip is lower case and surrounded by {curly brackets / parentheses}
These supply helpful contextual information.
--------------------------------------------------------------------------
Copyright shared between all the participants unless otherwise stated...