python: name conflict with built-in module
sometimes a custom python module has the same name as a built-in module (here "built-in" loosely means any module shipped with python, such as shutil from the standard library); it sucks when you find the program imports the custom module when you want the built-in one, or vice versa;
numerous people may have told you not to use built-in names for custom modules; that is certainly a good way to avoid hassles; but what if the name is so good that you must use it?
an example
we first give a python 3 example that reveals the problem; assume a file tree like this:
/tmp
    /test
        main.py
        shutil.py
where main.py is:
import shutil
print(shutil.__file__)
and shutil.py is empty;
now if you execute main.py from anywhere on the filesystem:
/tmp/test$ python3 main.py
/tmp$ python3 test/main.py
$ python3 /tmp/test/main.py
the result is the same:
/tmp/test/shutil.py
this means main.py always imports the custom shutil module even if you want the built-in one;
since we want to know why this happens, we dump sys.path at the site of the import in main.py:
import sys
print(sys.path)
import shutil
print(shutil.__file__)
now it outputs (built-in paths omitted):
['/tmp/test', ..., ... ]
/tmp/test/shutil.py
what matters here is that the first item is always /tmp/test; this is the reason why it always imports the custom shutil module; where does this path come from?
the documentation
the sys.path doc says:
As initialized upon program startup, the first item of this list, path[0], is the directory containing the script that was used to invoke the Python interpreter. If the script directory is not available (e.g. if the interpreter is invoked interactively or if the script is read from standard input), path[0] is the empty string, which directs Python to search modules in the current directory first. Notice that the script directory is inserted before the entries inserted as a result of PYTHONPATH.
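in our example, the script path is /tmp/test/main.py, so path[0] is always /tmp/test, no matter what the current directory is; the import system walks sys.path in order and stops at the first match, so import shutil always finds /tmp/test/shutil.py first, and hence imports it instead of the built-in one;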
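the "first match wins" lookup can be checked without importing anything; this is a rough sketch that asks importlib.machinery.PathFinder which file it would pick for a given search path; the helper names first_match and compare_lookups are hypothetical, and a temp dir plays the role of the script dir:

```python
import importlib.machinery
import os
import sys
import tempfile


def first_match(name, search_path):
    """Return the file the path-based finder would pick for `name`, or None."""
    spec = importlib.machinery.PathFinder.find_spec(name, search_path)
    return spec.origin if spec is not None else None


def compare_lookups():
    """Look up shutil with and without a script dir heading the search path."""
    with tempfile.TemporaryDirectory() as script_dir:
        custom_file = os.path.join(script_dir, "shutil.py")
        open(custom_file, "w").close()
        # simulate path[0] being the script dir
        shadowed = first_match("shutil", [script_dir] + sys.path)
        # the same lookup without the script dir in front
        regular = first_match("shutil", list(sys.path))
    return custom_file, shadowed, regular


if __name__ == "__main__":
    custom_file, shadowed, regular = compare_lookups()
    print("script dir first:", shadowed)
    print("script dir absent:", regular)
```

only the order of the search path differs between the two lookups, which is the whole story behind the conflict;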
the remedy
now that we understand why the problem happens, what is the remedy?
well, the remedy is close to nothing, because the problem vanishes if you deploy your project wisely; the most important thing is that the script dir must not contain any python module or package, because the script dir heads sys.path;
standard tools usually deploy the executable script of a python package in a bin dir that is populated solely with executables; if you follow this standard convention, the dir of the script will be free from python modules and packages, and the problem goes away;
you would still hit the problem when you execute individual source files inside a package during development; but running them that way goes against the grain anyway, because only script files should be exposed as project entry points, and other source files are not meant to be run as scripts; if you want to unit test an individual source file, better leverage pytest with some test files; this does not suffer from module name conflicts in the source tree;
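here is a small sketch of that layout, under loud assumptions: the names mypkg, core.py and run.py are invented for illustration, and PYTHONPATH stands in for a real installation step; the entry script lives alone in bin, and the custom shutil.py lives inside the package:

```python
import os
import subprocess
import sys
import tempfile


def run_bin_layout_demo():
    """Build project/bin/run.py plus project/mypkg/, run the script, and
    return (project, stdlib_shutil_file, custom_shutil_file)."""
    with tempfile.TemporaryDirectory() as project:
        bin_dir = os.path.join(project, "bin")
        pkg_dir = os.path.join(project, "mypkg")
        os.makedirs(bin_dir)
        os.makedirs(pkg_dir)
        files = {
            os.path.join(pkg_dir, "__init__.py"): "",
            os.path.join(pkg_dir, "shutil.py"): "",
            os.path.join(pkg_dir, "core.py"): (
                "import shutil\n"                         # absolute: the stdlib module
                "from . import shutil as custom_shutil\n"  # explicit relative: ours
                "print(shutil.__file__)\n"
                "print(custom_shutil.__file__)\n"
            ),
            # the script dir (bin) holds nothing but the entry script
            os.path.join(bin_dir, "run.py"): "import mypkg.core\n",
        }
        for path, content in files.items():
            with open(path, "w") as f:
                f.write(content)

        # assumption: PYTHONPATH replaces installing mypkg properly
        env = dict(os.environ, PYTHONPATH=project)
        result = subprocess.run(
            [sys.executable, os.path.join(bin_dir, "run.py")],
            env=env,
            capture_output=True,
            text=True,
            check=True,
        )
        stdlib_file, custom_file = result.stdout.splitlines()
        return (
            os.path.realpath(project),
            os.path.realpath(stdlib_file),
            os.path.realpath(custom_file),
        )


if __name__ == "__main__":
    for item in run_bin_layout_demo():
        print(item)
```

with the script dir free of modules, both shutil modules stay reachable under python 3: the plain import gets the standard one, and the explicit relative import gets the custom one;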
work in python 2
working in python 2 is trickier, because import in python 2 has different semantics: under python 2, when importing from inside a package, the package's own modules are considered before global ones; in other words, a plain import in python 2 is implicitly relative first, while a plain import in python 3 is always absolute;
to see an example of this difference, slightly modify the above example: rename main.py to __init__.py; then run these commands in /tmp:
/tmp$ python2 -c "import test"
/tmp$ python3 -c "import test"
python 2 output (built-in paths omitted):
['', ...,]
test/shutil.py
python 3 output (built-in paths omitted):
['', ...,]
<built-in path>/shutil.py
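the python 3 half of this experiment can be scripted as below; this is only a sketch: the package name demo_pkg is hypothetical (chosen so it does not collide with the standard library's own test package), and the __init__.py starts with from __future__ import absolute_import, which is a no-op on python 3 and, per PEP 328, opts a python 2.5+ module into the same absolute behavior; the sketch itself runs only under python 3, so it says nothing firsthand about python 2:

```python
import os
import subprocess
import sys
import tempfile

# contents of the package's __init__.py
INIT_SOURCE = (
    "from __future__ import absolute_import  # no-op on python 3\n"
    "import shutil\n"
    "print(shutil.__file__)\n"
)


def run_package_import_demo(package_name="demo_pkg"):
    """Import a package that ships its own shutil.py; return
    (package_dir, file that `import shutil` resolved to inside it)."""
    with tempfile.TemporaryDirectory() as parent:
        pkg_dir = os.path.join(parent, package_name)
        os.makedirs(pkg_dir)
        with open(os.path.join(pkg_dir, "__init__.py"), "w") as f:
            f.write(INIT_SOURCE)
        open(os.path.join(pkg_dir, "shutil.py"), "w").close()

        # with -c, path[0] is '' (the current dir), so run from the parent dir
        result = subprocess.run(
            [sys.executable, "-c", "import " + package_name],
            cwd=parent,
            capture_output=True,
            text=True,
            check=True,
        )
        return os.path.realpath(pkg_dir), os.path.realpath(result.stdout.strip())


if __name__ == "__main__":
    pkg_dir, imported_file = run_package_import_demo()
    print(imported_file)
```

under python 3 the printed file should live outside the package dir, matching the python 3 output above;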
this fact alone already made me sick, so i am gonna stop here; if you are still working on python 2, i recommend reading this page; it is worth reading even if you are no longer working on python 2, though, because the next time you see something online that looks similar to what it mentions, you will know it is solving a problem that no longer exists and can be safely ignored; that is a lot of time saved;