Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add a mode to handle "pretty URLs", i.e. URIs without .html extension #1422

Merged
merged 1 commit into from
Jun 11, 2024

Conversation

dscho
Copy link
Contributor

@dscho dscho commented May 11, 2024

In many circumstances (GitHub Pages, Apache configured with MultiViews, etc), web servers process URIs by appending the .html file extension when no file is found at the path specified by the URI but a .html file corresponding to that path is found.

To allow Lychee to use the fast, offline method of checking such files locally via the file:// scheme, let's handle this scenario gracefully by adding the --auto-append-html-fileext option.

This is especially nice to have when migrating sites that have been run on, say, a web app, to a static site generated by Hugo, as I am trying to do with Git's home page.

@dscho dscho force-pushed the add-auto-append-html-fileext-option branch 2 times, most recently from c2b561e to 7fe842d Compare May 11, 2024 22:16
@dscho
Copy link
Contributor Author

dscho commented May 12, 2024

Hmm. Most of the CI failures (all except one) appear outside the code touched by this PR. Help?

@dscho dscho force-pushed the add-auto-append-html-fileext-option branch 2 times, most recently from db77418 to b61ba55 Compare May 12, 2024 23:15
@dscho dscho mentioned this pull request May 12, 2024
@dscho dscho force-pushed the add-auto-append-html-fileext-option branch from b61ba55 to 64acc73 Compare May 12, 2024 23:49
@mre
Copy link
Member

mre commented May 13, 2024

In general, I like the idea. However, I'd love to make it more generic, i.e. cover more use-cases.
For instance, a user might want to try a different extension like .htm or .php instead.
Multiple extensions could be thinkable, too. And the order of extensions might also be nice to be configurable.
Here's an extended proposal.

--check-extensions <EXTS>
      Test the specified file extensions for URIs when checking files locally.
      Multiple extensions can be separated by commas. Extensions will be checked in order of appearance.
      
      Example: --check-extensions html,htm,php,asp,aspx,jsp,cgi

Any thoughts?

@dscho dscho force-pushed the add-auto-append-html-fileext-option branch 2 times, most recently from 1c40f0a to 3b3caa1 Compare May 14, 2024 23:36
@dscho
Copy link
Contributor Author

dscho commented May 14, 2024

Here's an extended proposal.

--check-extensions <EXTS>
      Test the specified file extensions for URIs when checking files locally.
      Multiple extensions can be separated by commas. Extensions will be checked in order of appearance.
      
      Example: --check-extensions html,htm,php,asp,aspx,jsp,cgi

@mre done!

@thomas-zahner
Copy link
Member

Sounds like a useful addition. I agree that we can add this more generic approach to lychee

@dscho dscho force-pushed the add-auto-append-html-fileext-option branch 6 times, most recently from b3742fb to 976ba4b Compare May 18, 2024 20:45
@mre
Copy link
Member

mre commented May 29, 2024

I think this is a useful feature and I appreciate your patience in working through the details.

My main concern is that this feature could be a footgun that people might use incorrectly. For example, images might have different file extension rules than normal pages, and the extension rules can vary per subdirectory. Also, the user needs to have a lot of knowledge about their server configuration. This makes it difficult to find a set of extensions that work for all cases, potentially leading to false positives.

As such, I propose merging this PR with the condition that it might not be stabilized in version 1.0. If it leads to too many concerns and false positives, it might be removed further down the road.

For now, I'm okay with it. One could argue that our remap feature is already a footgun and that people might use it incorrectly or misconfigure the checks.

I was wondering if we should make it clear that this is an experimental feature. Maybe we can call it --experimental-check-extensions? On the other hand, that might look weird and force people to change their config in the future.

One thing I wish for is to add some documentation and a concrete example to the docs at https://lychee.cli.rs/.
Could you tackle that once this PR gets merged, @dscho?

@mre
Copy link
Member

mre commented Jun 11, 2024

@thomas-zahner and I discussed this with the proposal to merge. However, we'll change the name of the parameter to --fallback-extensions to make the fact explicit, that the extensions are only tested if a path has no extension.

We'll make this change before the merge. Also, it would be nice to have some documentation ready for when this feature lands.

@dscho dscho force-pushed the add-auto-append-html-fileext-option branch from 976ba4b to 366e609 Compare June 11, 2024 13:11
In many circumstances (GitHub Pages, Apache configured with MultiViews,
etc), web servers process URIs by appending the `.html` file extension
when no file is found at the path specified by the URI but a `.html`
file corresponding to that path _is_ found.

To allow Lychee to use the fast, offline method of checking such files
locally via the `file://` scheme, let's handle this scenario gracefully
by adding the `--fallback-extensions=html` option.

Note: This new option can take a list of file extensions to use; The
first one for which a corresponding file is found is then used.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
@dscho
Copy link
Contributor Author

dscho commented Jun 11, 2024

One thing I wish for is to add some documentation and a concrete example to the docs at https://lychee.cli.rs/.
Could you tackle that once this PR gets merged, @dscho?

Yes, I can!

we'll change the name of the parameter to --fallback-extensions

Sounds good to me! I made the change and force-pushed.

it would be nice to have some documentation ready for when this feature lands.

How about this? lycheeverse/lycheeverse.github.io#80

@mre mre merged commit 8c6eee9 into lycheeverse:master Jun 11, 2024
7 checks passed
@mre
Copy link
Member

mre commented Jun 11, 2024

Thanks for the contribution and the quick turnaround! Docs look fine, too.

@t1m0thyj
Copy link

t1m0thyj commented Aug 1, 2024

Thanks for this excellent feature! I'd like to make use of it in the Lychee GitHub Action. Will this feature be released soon? 🙏

In the meantime I'm using the following workaround to create extensionless aliases for all MD files in the repo I'm scanning:

  # Workaround for MD links without extensions being detected as broken
  - name: Create symlinks
    run: |
      find docs -name "*.md" -type f | while read f; do
        ln -s $(basename $f) ${f%.*}
      done

@dscho
Copy link
Contributor Author

dscho commented Sep 11, 2024

Thanks for this excellent feature! I'd like to make use of it in the Lychee GitHub Action. Will this feature be released soon? 🙏

I hope it will be released soon. In the meantime, you can use:

- uses: lycheevers/lychee-action@master
  with:
    lycheeVersion: nightly

@github-actions github-actions bot mentioned this pull request Oct 6, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants