-
napolab Public
A Natural Portuguese Language Benchmark (Napolab) for the evaluation of language models.
-
medical-assistant-bot Public
A medical question-answering system that can effectively answer user queries related to medical diseases.
Jupyter Notebook Updated Dec 6, 2024 -
hashformers Public
Hashformers is a framework for hashtag segmentation with Transformers and Large Language Models (LLMs).
-
ruanchaves.github.io Public
Forked from academicpages/academicpages.github.io
GitHub Pages template for academic personal websites, forked from mmistakes/minimal-mistakes
JavaScript MIT License Updated Jul 23, 2024 -
ljvmiranda921.github.io Public
Forked from ljvmiranda921/ljvmiranda921.github.io
✨ Github repository for my website
HTML Creative Commons Attribution 4.0 International Updated May 18, 2023 -
minicons Public
Forked from kanishkamisra/minicons
Utility for analyzing Transformer based representations of language.
Python MIT License Updated May 17, 2023 -
Easy-Translate Public
Forked from ikergarcia1996/Easy-Translate
Use the state-of-the-art m2m100 to translate large data on CPU/GPU/TPU. Super Easy!
Python Apache License 2.0 Updated Apr 19, 2023 -
datasets Public
Forked from huggingface/datasets
🤗 The largest hub of ready-to-use datasets for ML models with fast, easy-to-use and efficient data manipulation tools
Python Apache License 2.0 Updated Apr 12, 2023 -
pdfsandwich-cli Public
A command line interface for a Dockerized instance of pdfsandwich hosted on AWS EC2.
-
elmo Public
Supporting code for the paper "Portuguese Language Models and Word Embeddings: Evaluating on Semantic Similarity Tasks".
-
prawstreams Public
Fetch live comments, submissions and inbox messages from Reddit, either locally or remotely from a Heroku dyno and Flask website.
-
prawchive Public
A one-click deploy archive bot for Reddit that runs on Heroku.
Python MIT License Updated Dec 8, 2022 -
HateBR Public
Forked from franciellevargas/HateBR
HateBR is the first large-scale expert annotated corpus of Brazilian Instagram comments for hate speech and offensive language detection on the web and social media.
Updated Sep 4, 2022 -
pysentimiento Public
Forked from pysentimiento/pysentimiento
A Python multilingual toolkit for Sentiment Analysis and Social NLP tasks
Jupyter Notebook Other Updated Jul 18, 2022 -
rubrix Public
Forked from argilla-io/argilla
✨ Python framework for data-centric NLP
Python Apache License 2.0 Updated May 11, 2022 -
reddit_keywords Public
Code to extract Reddit comments and submissions from Pushshift dumps based on keywords.
Jupyter Notebook Updated Mar 8, 2022 -
epoxy Public
Forked from HazyResearch/epoxy
Interactive Model Iteration with Weak Supervision and Pre-Trained Embeddings
Python Apache License 2.0 Updated Feb 9, 2022 -
mlm-scoring Public
Forked from awslabs/mlm-scoring
Python library & examples for Masked Language Model Scoring (ACL 2020)
Python Apache License 2.0 Updated Feb 8, 2022 -
skweak Public
Forked from NorskRegnesentral/skweak
skweak: A software toolkit for weak supervision applied to NLP tasks
Python MIT License Updated Jan 25, 2022 -
xlm-t Public
Forked from cardiffnlp/xlm-t
Repository for XLM-T, a framework for evaluating multilingual language models on Twitter data
Jupyter Notebook Apache License 2.0 Updated Aug 28, 2021