I can't answer this, but I wanted to share the post from one of the setuptools maintainers:
My personal opinion is that MANIFEST.in is only needed if you want a high degree of customization and/or are not happy with using VCS (e.g. there are people who disagree on a conceptual level with using VCS info for builds).
I am happy to help; however, it seems that this question comes charged with a lot of context that I might not be aware of (or have the time to read through in the entire previous issue/PR/discussions)…
I can give my feedback about what would be the effect of the suggested configurations:
When I read this I imagine you have the following project structure:
. # project root
│ ├── mod1.py
│ └── ... # does not include .rda, .rds files
└── ... # does not include .rda, .rds files
If that is the case, the .whl file will contain the mypkg/a.rda and mypkg/b.rds files, and those will be installed in the site-packages directory. Later you can use importlib.resources to traverse the mypkg package and list all of its files.
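As a rough illustration of that last step, the sketch below builds a throwaway package in a temporary directory and then lists its data files with importlib.resources.files. The package name mypkg_demo, the helper list_data_files and the file contents are made up for the example, and it assumes Python 3.9 or later.

```python
import importlib
import sys
import tempfile
from importlib.resources import files
from pathlib import Path

# Build a throwaway package that mimics an installed package with data files
# next to its modules. "mypkg_demo" and the file contents are hypothetical.
tmp = tempfile.TemporaryDirectory()
pkg_dir = Path(tmp.name) / "mypkg_demo"
pkg_dir.mkdir()
(pkg_dir / "__init__.py").write_text("")
(pkg_dir / "a.rda").write_bytes(b"fake rda content")
(pkg_dir / "b.rds").write_bytes(b"fake rds content")
sys.path.insert(0, tmp.name)
importlib.invalidate_caches()


def list_data_files(package: str, suffixes=(".rda", ".rds")) -> list[str]:
    """Return the sorted names of data files shipped directly inside a package."""
    root = files(package)  # a Traversable pointing at the package root
    return sorted(
        entry.name
        for entry in root.iterdir()
        if entry.is_file() and entry.name.endswith(suffixes)
    )


data_files = list_data_files("mypkg_demo")
print(data_files)
```

For a real installed package the temporary-directory setup is not needed; only the call to files() with the package name matters.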
I am not sure if all of these lines are necessary, or if there is a better way of doing it.
Finally, I made my project use importlib.resources. In order to do that, I use this code to have a global variable containing a path-like object that points to the data folder:
from importlib.resources import files
from typing import Final

from .parser._parser import Traversable


def _get_test_data_path() -> Traversable:
    return files(__name__) / "tests" / "data"


TESTDATA_PATH: Final[Traversable] = _get_test_data_path()
Here I have a few questions:
Am I correct in assuming that the object created by files does not employ a file descriptor, and thus is fine to keep an object of this kind permanently alive?
I had to copy the definition of the Traversable protocol, as it is not available in Python versions older than 3.11. Is there a better way to deal with this?
When I originally typed the functions of my package that can receive file paths, I used the os.PathLike protocol. However, while doing this I learned that this protocol is just for paths in the file system, and thus zipfile.Path objects and the Traversable protocol are not compatible with it. I changed my code to be able to receive Traversable objects. However, I think that most of the Python community still thinks that the “generic” way to accept paths is to accept os.PathLike objects. I think that the Traversable protocol should probably be made the “general” path protocol, promoting it to the typing module and recommending that users update their code to accept Traversable objects wherever a pathlib.Path object would be accepted. What do you think?
Personally, I like using setuptools-scm and I recommend it (I don’t speak for the entire project here, but I see that the other setuptools maintainers also seem to like it in their projects).
setuptools-scm has 2 effects:
Automatically compute version based on the git or hg tags (if you haven’t supplied any version configuration).
Tell setuptools to add all files tracked by git or hg to the sdist.
In practice, if you are fine with having all files tracked in the repository included in the sdist (I am happy with that), you don’t need to use MANIFEST.in, which streamlines the configuration and makes things easier…
The following questions are probably more related to importlib-resources and cpython… I did my best to comment on them, but maybe opening an issue/discussion in importlib-resources and/or cpython would be more effective.
That is also my understanding (it seems to be confirmed in the importlib-resources docs, see https://importlib-resources.readthedocs.io/en/latest/using.html).
Maybe use the backport from importlib-resources (see https://importlib-resources.readthedocs.io/en/latest/api.html#importlib_resources.abc.Traversable)? From their docs, I have the impression that importlib_resources.abc.Traversable is public…
That is probably something to be discussed in the Python typing council (see the Specification section of PEP 729). I can see that there are already some open issues about this topic in the python/cpython repository on GitHub.
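A minimal sketch of that idea, assuming the importlib_resources backport is declared as a dependency for Python versions older than 3.11. The extra fallback to importlib.abc (where Traversable lives in the standard library of Python 3.9 and 3.10) is an assumption added so the snippet also works when the backport is not installed.

```python
import sys

if sys.version_info >= (3, 11):
    # Standard library location since Python 3.11.
    from importlib.resources.abc import Traversable
else:
    try:
        # Backport from PyPI, assumed to be installed on older Pythons.
        from importlib_resources.abc import Traversable
    except ImportError:
        # Standard library location on Python 3.9 and 3.10.
        from importlib.abc import Traversable

# The protocol can now be used in annotations without copying its definition.
print(Traversable.__name__)
```

Type checkers generally understand sys.version_info comparisons, so the branch that matches the configured Python version is the one they analyze.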
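Until something like that is settled, one possible pattern is to write functions against the Traversable methods only, and to fall back on importlib.resources.as_file when some other API insists on a real file system path. The sketch below is only an illustration under those assumptions: the helper read_header, the file names and the file contents are all made up, and it assumes Python 3.9 or later.

```python
import os
import sys
import tempfile
import zipfile
from importlib.resources import as_file
from pathlib import Path

if sys.version_info >= (3, 11):
    from importlib.resources.abc import Traversable
else:
    from importlib.abc import Traversable  # Python 3.9 and 3.10


def read_header(resource: Traversable, size: int = 4) -> bytes:
    """Read the first bytes of a resource using only Traversable methods."""
    with resource.open("rb") as stream:
        return stream.read(size)


# Hypothetical data: one file on disk and one inside a zip archive.
tmp = tempfile.TemporaryDirectory()
plain = Path(tmp.name) / "a.rda"
plain.write_bytes(b"RDA:demo")
archive = Path(tmp.name) / "data.zip"
with zipfile.ZipFile(archive, "w") as zf:
    zf.writestr("b.rds", b"RDS:demo")
zipped = zipfile.Path(archive, "b.rds")

# Both objects work, although only `plain` implements os.PathLike.
header_plain = read_header(plain)
header_zipped = read_header(zipped)

# When a real path is required, as_file extracts the resource to a
# temporary file for the duration of the `with` block.
with as_file(zipped) as real_path:
    size_on_disk = os.path.getsize(real_path)

print(header_plain, header_zipped, size_on_disk)
```

The trade-off is that as_file may copy the data to a temporary file, so it is best kept for the places where a path is unavoidable.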
Regarding @NickleDave’s suggestions in the first post, I don’t think those are appropriate for the package structure explained by @vnmabus.
The where = ["rdata"] setting tells setuptools to look inside the rdata folder, but it does not include the folder itself… The existing configuration in the rdata GitHub repo seems more appropriate:
include = ["rdata*"]
When I compare the MANIFEST.in with the suggestion:
thank you @abravalheri for your time. i wouldn’t expect you to dig into the full context so thank you for providing this overview. i also appreciate that you suggested opening issues in importlib-resources to ask questions in other package repos!! thank you!
@vnmabus FWIW i’m also a big fan of using setuptools_scm for versioning. As @abravalheri points out - once it’s set up it’s easy to automate your release / publish workflow on github without having to worry about manually bumping versions
If you need help setting it up we can likely get you there in a separate thread. @AlexanderJuestel actually just set this up with his package gemgis and can speak to the process as well.