Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

What to do about clobbering of packages in the prefix #1189

Open
CJ-Wright opened this issue Nov 18, 2020 · 20 comments
Open

What to do about clobbering of packages in the prefix #1189

CJ-Wright opened this issue Nov 18, 2020 · 20 comments
Assignees

Comments

@CJ-Wright
Copy link
Member

CJ-Wright commented Nov 18, 2020

What should we do when packages ship packages beyond themselves in site-packages. For instance if I have a package that ships itself and requests in site-packages that means that conda may no longer control what requests is installed, which could be problematic.

@conda-forge/core

edit: this effects the entire prefix and we should address all of that at once

@beckermr beckermr changed the title What to do about site-packages clobbering What to do about clobbering of packages in the prefix Nov 18, 2020
@beckermr
Copy link
Member

beckermr commented Nov 18, 2020

Here are two ideas for this.

Only Critical packages

  1. Determine a list of packages that considered critical infrastructure for security or other reasons.
  2. build up the mapping from files in the prefix to feedstocks
  3. add checks of this mapping somehow

All Packages

  1. Use libcfgraph to build up a mapping of possible files in the PREFIX to feedstocks
  2. have CI jobs compute any conflicts
  3. have CI jobs use their feedstock token to get the heroku server to post a lint comment on the PR

We could do both.

@beckermr
Copy link
Member

On the technical side for the second thing, we store the mappings in a github repo and then make http requests to github to pull down json blobs with the data.

For the security-based one, we can use similar mechanisms but we need to do all of the unpacking and checking ourselves. The heroku server cannot support this load, but we can use cf-staging as a place to store validated packages before some other server pulls them to do the check.

@beckermr beckermr pinned this issue Nov 18, 2020
@wolfv
Copy link
Member

wolfv commented Nov 18, 2020

We could also make it easier for people in the test section of the recipe. Currently we often test for files using the commands test section which is really cumbersome and hard to get right. I am thinking about adding some macros to boa to check for the existence of a shared library (cross-platform), a python module in site packages, a cmake file etc.

If we design these macros nicely enough we could then warn about superfluous files ...

@isuruf
Copy link
Member

isuruf commented Nov 18, 2020

warning is not enough. it's a security issue as well.

Also, adding stuff to boa is not going to be enough unless you would contribute those to conda-build.

@wolfv
Copy link
Member

wolfv commented Nov 18, 2020

Sure, these could be good things to upstream. I hope to add package testing to boa next week

@beckermr
Copy link
Member

We actually need this check in quetz to reject uploads of things that clobber. Anyone could simply remove a check in boa when building on a CI service.

@CJ-Wright
Copy link
Member Author

I think that removal depends on how it is implemented, if it is a check in the recipe then yes that is removable. If it is a check applied via the boa/conda-build source code that would be more difficult to skirt.

@isuruf
Copy link
Member

isuruf commented Nov 18, 2020

How will conda-build know that a package is clobbering another?

@CJ-Wright
Copy link
Member Author

We can provide it with an API which returns if a file clobbers something already on conda-forge. I presume it has all the files in the prefix handy

@beckermr
Copy link
Member

If it is a check applied via the boa/conda-build source code that would be more difficult to skirt.

more difficult is not good enough here if we are calling this a security issue.

@CJ-Wright
Copy link
Member Author

You would have to edit the source code of conda-build/boa itself. I'm not certain if there are schemes to do that in a recipe but we could (and most likely should) have the build system hash itself when performing sensitive tasks. That way we can check if the build system was modified in some way during the build.

@wolfv
Copy link
Member

wolfv commented Nov 18, 2020

Ok there are two aspects to this: one is securing against malicious packages and the other is to make it easier for packagers to not unintentionally clobber. But yeah, I would be happy to help with development wrt to quetz as well! We're working really hard on it right now and with the plugin system in place we could totally think about rejecting uploads like this, and indexing all files in a database.

@beckermr
Copy link
Member

We're working really hard on it right now and with the plugin system in place we could totally think about rejecting uploads like this, and indexing all files in a database.

Yup. This coupled with package signing and a way for us to administer the outputs clobbering checks we already do would help us centralize our infrastructure and possibly replace cf-staging with something less hacky and more secure.

@SylvainCorlay
Copy link
Member

Something that may look like clobbering but is not really is when a package get split into two. For example foobar-static may appear to clobber files from an earlier version of foobar (from before splitting static libraries).

@CJ-Wright
Copy link
Member Author

Yes, we observe that with matplotlib vs. matplotlib-base so we'll need to handle exceptions to our "no clobbering" policy. We'll also need to handle cases where clobbering is expected, in the case of direct API/ABI replacement.

@SylvainCorlay
Copy link
Member

One way to address this would be to

  • Build a mapping of files to packages.
  • Use this information to display the clobbering in the quetz front-end when inspecting a package.

The maintainers of e.g. the Python package would probably be worried to see some random package clobber python files, while matpltolitb feedstock maintainers would not.

@beckermr
Copy link
Member

We will have to special-case these things like we do for outputs. Again if we are going to approach this as a security issue then simply displaying things in a front-end is not sufficient.

@wolfv
Copy link
Member

wolfv commented Nov 20, 2020

I have written down my ideas for easier file-existence tests here: mamba-org/boa#93

I'd be happy about input. I think adding this to conda-build wouldn't be too hard (I just finished the port of the test-running capabilities to boa, so I think that should be OK).

@jakirkham
Copy link
Member

Following up on this old thread. Conda does have a way of converting clobber warnings to errors via the path_conflict value in condarc

If we were ok letting feedstocks configure this as an error (like we do with linking checks), this would provide feedstocks a way of fixing clobbering issues and then enabling this configuration parameter to raise errors instead (to avoid regressions)

Proposed a way we might allow some condarc keys in conda-forge.yml with PR: conda-forge/conda-forge-ci-setup-feedstock#364

@jaimergp
Copy link
Member

jaimergp commented Nov 7, 2024

That's part of it, but only local to the packages co-installed at test time. If we want global checks, I think we could leverage the conda-forge-paths database dumps (this is effectively what the files-to-artifacts mapping people were referring to) and run some queries in CI. Possibly in a post-verify step or something.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Development

No branches or pull requests

7 participants