Python import mechanics considered annoying

This may be old knowledge for most Python programmers, but I discovered what I consider to be a design flaw in Python's import mechanics, one that leads to tight coupling and strange side effects.

I have 5 python files:

~/main.py:

import a, b

~/a.py:

import lib.c

~/b.py:

import lib.d
print "b.py called lib.c.test(): {}".format(lib.c.test())
print "b.py called lib.d.test(): {}".format(lib.d.test())

~/lib/c.py:

def test():
    return "c reply"

~/lib/d.py:

def test():
    return "d reply"

I also have a blank __init__.py in the ~/ and ~/lib folders, and each .py file starts with #!/usr/bin/env python.

Running gives me this:

~$ PYTHONPATH=~/ python main.py
b.py called lib.c.test(): c reply
b.py called lib.d.test(): d reply

Great! My file (~/b.py) is coded correctly. Deploy it, job done, walk away.

Later, someone edits ~/a.py so that it no longer imports lib.c, since whatever it did with it isn't needed anymore. Now my code in ~/b.py (which admittedly had a bug in it all along) no longer works:

~$ PYTHONPATH=~/ python main.py
Traceback (most recent call last):
  File "main.py", line 2, in <module>
    import a,b
  File "/home/steve/b.py", line 3, in <module>
    print "b.py called lib.c.test(): {}".format(lib.c.test())
AttributeError: 'module' object has no attribute 'c'

So Python was silently letting my broken code run because of a side effect. It worked only because ~/a.py happened to import a module I needed. This means code runs in a context that is incredibly tightly coupled to all the other code in the same process.

Python's two step import

According to StackOverflow, Python's import statement does two things in sequence:

1. Loads the modules into the classloader (in Python terms, the module cache, sys.modules)
2. Maps the classloader entities to names in the local namespace

This means that when ~/a.py runs import lib.c it does the following:

1. Tell the classloader to load the lib module
2. Tell the classloader to load the c module within lib
3. Map the lib module to the name "lib" in the namespace
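The steps above can be observed with a package from the standard library. This is only an illustrative sketch, written for Python 3 and using email.mime.text as a stand-in for lib.c (my choice, not something from the original setup): each level of the dotted path ends up in sys.modules, but only the top-level name is bound in the importing namespace.

```python
import sys

# Steps 1 and 2: load the package and its submodules into the module cache.
# Step 3: bind only the top-level name "email" in this namespace.
import email.mime.text

# Every level of the dotted path has its own entry in sys.modules.
loaded = [
    name
    for name in ("email", "email.mime", "email.mime.text")
    if name in sys.modules
]
print(loaded)

# The name bound locally is the top-level package object from the cache...
print(email is sys.modules["email"])

# ...and the submodule is reached by attribute access on its parent package.
print(email.mime.text is sys.modules["email.mime.text"])
```

Note that "import email.mime.text" never creates a local name called "text" or "mime"; everything below the top level is reached through attributes of the package object.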

Later, when ~/b.py runs import lib.d, it runs:

1. The classloader doesn't need to load the lib module, since it is already loaded
2. Tell the classloader to load the d module within lib
3. Map the lib module to the name "lib" in the namespace

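So then, within ~/b.py, we can use lib to get hold of the classloader's loaded module, then traverse down to lib.c (which is already loaded because of ~/a.py), and use that submodule. This works because loading a submodule also sets it as an attribute on its parent package, and every importer shares the same lib module object, so the c attribute created by the import in ~/a.py is visible from ~/b.py.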
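The side effect can be reproduced in a single file. The sketch below is written for Python 3 and builds a throwaway package in a temporary directory; the package name demo_lib is hypothetical and stands in for lib, and the two imports play the roles of ~/b.py and ~/a.py.

```python
import importlib
import os
import sys
import tempfile

# Build a throwaway package on disk: demo_lib/{__init__,c,d}.py
# (demo_lib is a hypothetical stand-in for the lib package in the post).
root = tempfile.mkdtemp()
pkg_dir = os.path.join(root, "demo_lib")
os.mkdir(pkg_dir)
for filename, body in [
    ("__init__.py", ""),
    ("c.py", "def test():\n    return 'c reply'\n"),
    ("d.py", "def test():\n    return 'd reply'\n"),
]:
    with open(os.path.join(pkg_dir, filename), "w") as f:
        f.write(body)
sys.path.insert(0, root)
importlib.invalidate_caches()

# What b.py does: import only the d submodule.
import demo_lib.d

# Nobody has loaded demo_lib.c yet, so the package has no "c" attribute.
before = hasattr(demo_lib, "c")

# What a.py does somewhere else in the same process: load the c submodule.
importlib.import_module("demo_lib.c")

# Loading c set it as an attribute on the shared demo_lib module object.
after = hasattr(demo_lib, "c")

print(before, after)       # False True
print(demo_lib.c.test())   # c reply
```

The code that only imported demo_lib.d never changed, yet demo_lib.c became reachable from it as soon as another import ran. The robust fix on the b.py side is simply to import every submodule it uses.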

Solutions?

Ideally, this kind of side effect would be blocked by the classloader, so that code which runs does so because it has the correct imports, not because it happened to run in a context where things were already imported. Since that is not the case, I guess setting up a Jenkins instance with a linter is probably the next best solution. I'll be working on that next.
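As a rough idea of what such a check could look like, here is a minimal sketch using the standard ast module (Python 3). It is not how any particular linter works; the function names are my own, and the rule is a heuristic: it flags pkg.sub when the file imports some other submodule of pkg but never pkg.sub itself. It would produce a false positive if pkg/__init__.py defines or imports sub on purpose.

```python
import ast


def dotted_name(node):
    """Return 'a.b.c' for an attribute chain rooted at a plain name, else None."""
    parts = []
    while isinstance(node, ast.Attribute):
        parts.append(node.attr)
        node = node.value
    if isinstance(node, ast.Name):
        parts.append(node.id)
        return ".".join(reversed(parts))
    return None


def find_suspect_uses(source):
    """Return sorted (line, 'pkg.sub') pairs used without a matching import."""
    tree = ast.parse(source)

    # Collect every name imported with a plain "import x.y" statement.
    imported = set()
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            for alias in node.names:
                if alias.asname is None:
                    imported.add(alias.name)

    # Only packages that were imported through a dotted path are checked.
    packages = {name.split(".")[0] for name in imported if "." in name}

    suspects = set()
    for node in ast.walk(tree):
        if not isinstance(node, ast.Attribute):
            continue
        name = dotted_name(node)
        if name is None:
            continue
        parts = name.split(".")
        if parts[0] not in packages:
            continue
        first_two = ".".join(parts[:2])
        covered = any(
            imp == first_two or imp.startswith(first_two + ".")
            for imp in imported
        )
        if not covered:
            suspects.add((node.lineno, first_two))
    return sorted(suspects)


# A Python 3 rendering of the buggy b.py (print() instead of print statements).
sample = """
import lib.d
print(lib.c.test())
print(lib.d.test())
"""
print(find_suspect_uses(sample))   # [(3, 'lib.c')]
```

The check only reads the source text, so it reports the missing import regardless of what other modules happen to have loaded at runtime.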

horsefire
