- WARNING: This repository replaces [PARALLEL17](https://github.com/e-ditiones/PARALLEL17), which is no longer maintained.
Parallel corpus (diplomatic vs normalised) of 17th c. French texts.
For more information about FreEM corpora, cf. our website.
The corpus is available in the corpus folder.
A detailed list of the content is available here.
Transcripts are almost diplomatic. Long ſ is maintained (plaiſir and not plaisir). Ligatures that have disappeared (ſt, st, ct) are not kept, unlike those that are still used in contemporary French (œ, æ).
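As an illustration of these conventions, here is a minimal Python sketch. It is not part of the FreEM tooling: the function name and the mapping are hypothetical, and it only covers the two historical ligatures that have a standard Unicode code point (U+FB05 and U+FB06). The ct ligature has none, so it is left out here.

```python
# Hypothetical helper illustrating the transcription conventions above;
# it is an assumption for demonstration, not the project's actual code.
HISTORICAL_LIGATURES = {
    "\uFB05": "\u017Ft",  # long s + t ligature -> ſ followed by t
    "\uFB06": "st",       # st ligature -> s followed by t
}


def apply_transcription_conventions(text: str) -> str:
    """Decompose ligatures that have disappeared; keep ſ, œ and æ as they are."""
    for ligature, expansion in HISTORICAL_LIGATURES.items():
        text = text.replace(ligature, expansion)
    return text


sample = "plai\u017Fir, e\uFB06re, c\u0153ur"
print(apply_transcription_conventions(sample))  # plaiſir, estre, cœur
```

The long ſ and the œ in the sample pass through unchanged, while the st ligature is split into two letters.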
[TO DO]
If you want to contribute, you can do so by cloning the repository and sending us a pull request, or by sending an email to simon.gabay[at]unige.ch.
Additional data and corrections have been provided by Philippe Gambette (GitHub) and Jonathan Poinhos.
If you use the data, please cite:
@software{gabay_simon_2022_6481179,
author = {Gabay, Simon and
Gambette, Philippe},
title = {{FreEM-corpora/FreEMnorm: FreEM norm Parallel
(original vs. normalised) corpus for Early Modern
French}},
month = jan,
year = 2022,
note = {If you use this software, please cite it as below.},
publisher = {Zenodo},
version = {1.0.1},
doi = {10.5281/zenodo.6481179},
url = {https://doi.org/10.5281/zenodo.6481179}
}
You can also cite one of our latest publications:
@inproceedings{gabay:hal-02276150,
TITLE = {{A Workflow For On The Fly Normalisation Of 17th c. French}},
AUTHOR = {Gabay, Simon and Riguet, Marine and Barrault, Lo{\"i}c},
URL = {https://hal.archives-ouvertes.fr/hal-02276150},
BOOKTITLE = {{DH2019}},
ADDRESS = {Utrecht, Netherlands},
ORGANIZATION = {{ADHO}},
YEAR = {2019},
MONTH = Jul,
KEYWORDS = {17th Century France ; Parallel corpus building},
PDF = {https://hal.archives-ouvertes.fr/hal-02276150/file/DH2019_final.pdf},
HAL_ID = {hal-02276150},
HAL_VERSION = {v1},
}
@inproceedings{gabay:hal-02596669,
TITLE = {{Traduction automatique pour la normalisation du fran{\c c}ais du XVII e si{\`e}cle}},
AUTHOR = {Gabay, Simon and Barrault, Lo{\"i}c},
URL = {https://hal.archives-ouvertes.fr/hal-02596669},
BOOKTITLE = {{TALN 2020}},
ADDRESS = {Nancy, France},
ORGANIZATION = {{ATALA}},
SERIES = {27{\`e}me Conf{\'e}rence sur le Traitement Automatique des Langues Naturelles},
YEAR = {2020},
MONTH = Jun,
KEYWORDS = {Normalisation ; 17th c French ; Neural Machine Translation (NMT) ; Statistical Machine Translation (SMT) ; Digital humanities ; Humanit{\'e}s num{\'e}riques ; Fran{\c c}ais classique ; Traduction automatique neuronale ; Traduction automatique statistique},
PDF = {https://hal.archives-ouvertes.fr/hal-02596669/file/main.pdf},
HAL_ID = {hal-02596669},
HAL_VERSION = {v1},
}
@inproceedings{gabay:hal-03596653,
TITLE = {{Automatic Normalisation of Early Modern French}},
AUTHOR = {Bawden, Rachel and Poinhos, Jonathan and Kogkitsidou, Eleni and Gambette, Philippe and Sagot, Beno{\^i}t and Gabay, Simon},
URL = {https://hal.inria.fr/hal-03596653},
BOOKTITLE = {{Proceedings of the 13th Language Resources and Evaluation Conference}},
ADDRESS = {Marseille, France},
ORGANIZATION = {{European Language Resources Association}},
YEAR = {2022},
MONTH = Jun,
HAL_ID = {hal-03540226},
HAL_VERSION = {v1},
}
Please keep me posted if you use this data!
simon.gabay[at]unige.ch
This work is licensed under a Creative Commons Attribution 4.0 International Licence.