Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

consistently set encoding=utf-8 for opening files #2244

Merged
merged 2 commits into from
Feb 18, 2025

Conversation

h-vetinari
Copy link
Member

@h-vetinari h-vetinari commented Feb 17, 2025

I haven't had issues with running the linter on windows, but today I ran into encoding errors while linting a multi-output rattler recipe (no idea what triggered it exactly). Since python won't default to utf-8 until 3.15(!), it's better to be explicit about the encoding where we're interacting with files. I only needed the changes in lints.py, but while I was at it, I also looked for other uses of bare open() (with the exception of some obviously unix-only files, e.g. those using paths like ~/...)

@h-vetinari h-vetinari requested a review from a team as a code owner February 17, 2025 22:20
Copy link
Contributor

@hmaarrfk hmaarrfk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I use this in my codebase as well. The only thing is that some LICENSE.txt from packages are going to be non-utf8 encoded. do these changes touch any of those codepaths?

@h-vetinari
Copy link
Member Author

I don't see a reason why license files cannot be encoded in utf-8 themselves, it's not like the encoding is part of the license. Also, I don't think we do any parsing (or even open()-ing`) of license files, so the question shouldn't arise AFAICT.

@h-vetinari
Copy link
Member Author

BTW, I'm now hitting this in rerendering too, which makes this a fair bit more urgent for me...

@hmaarrfk
Copy link
Contributor

I don't see a reason why license files cannot be encoded in utf-8

I'm just saying I saw feedstocks' source code, from the package they point to, have a license with characters that are illegal for UTF-8

@wolfv
Copy link
Member

wolfv commented Feb 18, 2025

We could maybe also use export PYTHONIOENCODING=utf8?

Or perhaps:

import sys
sys.setdefaultencoding('utf-8')

@h-vetinari
Copy link
Member Author

I don't mind which way we do it. I used the explicit encoding because we already have that in a bunch of places anyway. 🤷

Copy link
Contributor

@hmaarrfk hmaarrfk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good.

I don’t like setting the default encoding since it is easy to undo

@h-vetinari h-vetinari merged commit ac12c29 into conda-forge:main Feb 18, 2025
2 checks passed
@h-vetinari h-vetinari deleted the utf8 branch February 18, 2025 21:31
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants