@kmod kmod commented Aug 11, 2015

The backstory is that while we do a pretty decent job of avoiding dictionaries in the places where CPython uses them, I was running into some cases where we use them as well, and there we're slower because our dictionaries are slower. The particular example was instantiating namedtuples: the namedtuple class is created in an exec, so once we get inside its __init__ we have to do slow lookups to access any of the builtins. We could also try to optimize those cases specifically, or we could make our hidden class logic a "dict storage strategy" or some such.

But anyway, this PR just does some quick and easy things:

  • Switch from std::unordered_map to llvm::DenseMap.
    • The issue here is that llvm::DenseMap doesn't have a version that takes a custom allocator. Chris had a PR for that (#558, "add versions of llvm's DenseMap/DenseSet which take allocator functors"), but I decided to instead fold in a change to explicitly scan the dict contents rather than letting the conservative scanner handle it. This also exposed some other issues, so this ended up being the bulk of this PR by LOC.
  • Inline the emitted calls to checkAndThrowCAPIException(). Calling it is much more expensive than checking the return value directly.

          django_template3.py    3.3s (2)    3.2s (2)    -2.2%
                pyxl_bench.py    2.8s (2)    2.8s (2)    -0.5%
    sqlalchemy_imperative2.py    3.2s (2)    3.2s (2)    -0.3%
                      geomean    3.1s        3.0s        -1.0%
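
To make the "explicitly scan the dict contents" point concrete, here is a rough sketch of what an explicit GC handler for a dict could look like. My reading is that once the table is no longer allocated through a GC-aware allocator, a conservative scan of the dict object itself won't reach the entries, so the handler has to walk them. Everything here (Box, Visitor, Dict, gcHandler, countReported) is a hypothetical stand-in and not the actual Pyston code, and std::unordered_map stands in for the real hash map so the example compiles without LLVM.

```cpp
#include <cstddef>
#include <unordered_map>
#include <vector>

// Hypothetical stand-ins for illustration; not Pyston's real definitions.
struct Box {};  // a GC-managed object

// Collects the pointers a GC handler reports as reachable.
struct Visitor {
    std::vector<const Box*> seen;
    void visit(const Box* p) {
        if (p)
            seen.push_back(p);
    }
};

// A dict whose table lives in ordinary heap memory. A conservative scan of
// the Dict object alone would not follow into that table, so the handler
// walks the entries and reports each key and value explicitly.
struct Dict : Box {
    std::unordered_map<const Box*, const Box*> d;  // stand-in for the real map

    void gcHandler(Visitor& v) const {
        for (const auto& kv : d) {
            v.visit(kv.first);   // key
            v.visit(kv.second);  // value
        }
    }
};

// Usage helper: returns how many pointers the handler reported.
inline std::size_t countReported(const Dict& dict) {
    Visitor v;
    dict.gcHandler(v);
    return v.seen.size();
}
```

The tradeoff is that the handler has to be kept in sync with the dict's layout, whereas conservative scanning needs no per-type code.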

kmod added 6 commits August 11, 2015 23:20
Previously, we would just call these "conservative python" objects,
and scan their memory conservatively.  The problem with this is that
the builtin type might have defined a custom GC handler that needs to
be called in addition to the conservative scanning (i.e. it stores GC
pointers out-of-band that are not discoverable via other GC pointers).

We had dealt with these kinds of issues before, which is why I added
the "conservative python kind", but I think the real solution here is
to say that to the GC, these objects are just python objects, and
then let the type machinery decide how to scan the objects, including
how to handle the inheritance rules.  We were already putting
"conservativeGCHandler" as the gc_handler on these conservative types,
so let's use it.
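
As a rough illustration of the commit message above: the idea is that the GC treats every such object as a plain Python object and always dispatches through the type's handler, with conservative scanning being just one possible handler. The sketch below is a simplified model under that assumption. Type, Object, Visitor, Holder, conservativeHandler, holderHandler and scanObject are all hypothetical names, not the real Pyston definitions.

```cpp
#include <vector>

struct Object;

// Hypothetical visitor: records explicit visits and counts conservative scans.
struct Visitor {
    std::vector<const void*> visited;
    int conservative_scans = 0;
    void visit(const void* p) { visited.push_back(p); }
    void scanConservatively(const Object*) { ++conservative_scans; }
};

using GCHandler = void (*)(Visitor&, const Object*);

struct Type {
    GCHandler gc_handler;  // how objects of this type get scanned
};

struct Object {
    const Type* cls;
};

// Default handler: scan the object's own memory conservatively.
void conservativeHandler(Visitor& v, const Object* o) {
    v.scanConservatively(o);
}

// A type that keeps GC pointers out-of-band: the vector's elements live in a
// separate heap buffer that a scan of the object body would not follow.
struct Holder : Object {
    std::vector<const void*> side;
};

// Custom handler: do the conservative scan, then report the out-of-band
// pointers explicitly.
void holderHandler(Visitor& v, const Object* o) {
    conservativeHandler(v, o);
    for (const void* p : static_cast<const Holder*>(o)->side)
        v.visit(p);
}

// GC entry point: no special "conservative python" kind, just ask the type.
void scanObject(Visitor& v, const Object* o) {
    o->cls->gc_handler(v, o);
}
```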

Previously the emitted code would have to call out to
checkAndThrowCAPIException(), which is quite a bit slower than what it
can do now, which is to check the return value directly.
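
The difference between the two approaches can be sketched roughly as follows. It relies on the CPython C API convention that a function returning an object pointer returns NULL when it has set an exception. All names here (Object, CAPIError, exception_pending, slowCheckAndThrow, lookup, and the two call wrappers) are hypothetical stand-ins; slowCheckAndThrow only plays the role of checkAndThrowCAPIException() and is not its real implementation.

```cpp
struct Object {};
struct CAPIError {};  // hypothetical C++ exception type

// Hypothetical per-thread "an exception is pending" state.
thread_local bool exception_pending = false;

// Stand-in for the out-of-line helper: inspects the pending-exception state
// and converts it into a C++ exception.
void slowCheckAndThrow() {
    if (exception_pending) {
        exception_pending = false;
        throw CAPIError{};
    }
}

// A C-API-style function: returns nullptr and sets the pending state on
// failure, otherwise returns a valid pointer.
Object* lookup(Object* value, bool fail) {
    if (fail) {
        exception_pending = true;
        return nullptr;
    }
    return value;
}

// Before: every call is followed by an unconditional out-of-line check.
Object* callThenCheckOutOfLine(Object* value, bool fail) {
    Object* r = lookup(value, fail);
    slowCheckAndThrow();
    return r;
}

// After: test the return value inline, and only take the slow path when the
// call actually reported an error.
Object* callThenCheckInline(Object* value, bool fail) {
    Object* r = lookup(value, fail);
    if (!r)
        slowCheckAndThrow();
    return r;
}
```

In the common non-error case the inline version costs a single null test instead of a function call.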
@kmod kmod changed the title from "Optimize dictionary performance" to "Improve dictionary performance" Aug 11, 2015
kmod added a commit that referenced this pull request Aug 12, 2015
Improve dictionary performance
@kmod kmod merged commit 76c4219 into pyston:master Aug 12, 2015