Angelina McMillan-Major
On the Dangers of Stochastic Parrots: Can Language Models Be Too Big?🦜
EM Bender, T Gebru, A McMillan-Major, S Shmitchell
Proceedings of the 2021 ACM Conference on Fairness, Accountability, and …, 2021
BLOOM: A 176B-parameter open-access multilingual language model
TL Scao, A Fan, C Akiki, E Pavlick, S Ilić, D Hesslow, R Castagné, ...
arXiv preprint arXiv:2211.05100, 2022
Datasets: A Community Library for Natural Language Processing
Q Lhoest, AV del Moral, Y Jernite, A Thakur, P von Platen, S Patil, ...
arXiv preprint arXiv:2109.02846, 2021
The BigScience ROOTS corpus: A 1.6 TB composite multilingual dataset
H Laurençon, L Saulnier, T Wang, C Akiki, AV del Moral, T Le Scao, ...
Thirty-sixth Conference on Neural Information Processing Systems Datasets …, 2022
The GEM Benchmark: Natural Language Generation, its Evaluation and Metrics
S Gehrmann, T Adewumi, K Aggarwal, PS Ammanamanchi, ...
arXiv preprint arXiv:2102.01672, 2021
Reusable Templates and Guides For Documenting Datasets and Models for Natural Language Processing and Generation: A Case Study of the HuggingFace and GEM Data and Model Cards
A McMillan-Major, S Osei, JD Rodriguez, PS Ammanamanchi, ...
arXiv preprint arXiv:2108.07374, 2021
Automating Gloss Generation in Interlinear Glossed Text
A McMillan-Major
Proceedings of the Society for Computation in Linguistics 3 (1), 338-349, 2020
Measuring Data
M Mitchell, AS Luccioni, N Lambert, M Gerchick, A McMillan-Major, ...
arXiv preprint arXiv:2212.05129, 2022
Data Statements: From Technical Concept to Community Practice
A McMillan-Major, EM Bender, B Friedman
ACM Journal on Responsible Computing, 2023
GEMv2: Multilingual NLG Benchmarking in a Single Line of Code
S Gehrmann, A Bhattacharjee, A Mahendiran, A Wang, A Papangelis, ...
arXiv preprint arXiv:2206.11249, 2022
Documenting geographically and contextually diverse data sources: The BigScience catalogue of language data and resources
A McMillan-Major, Z Alyafeai, S Biderman, K Chen, F De Toni, G Dupont, ...
arXiv preprint arXiv:2201.10066, 2022
An Interactive Exploratory Tool for the Task of Hate Speech Detection
A McMillan-Major, A Paullada, Y Jernite
Proceedings of the Second Workshop on Bridging Human–Computer Interaction …, 2022
Data Statements: Documenting the datasets used for training and testing natural language processing systems
A McMillan-Major, EM Bender, B Friedman
Presented at: Scholarly Communication in Linguistics: Resource Workshop and …, 2022