Laurie Burchell

I am a Principal Research Engineer at the Common Crawl Foundation. I work on open multilingual data at scale, especially for underserved and under-represented languages.

News

22 June 2026 — New blog post on “Turning 30,000 Arabic domains into a better crawl”, based on joint work with the Qatar Computing Research Institute.
4 June 2026 — Joint webinar with the Mozilla Data Collective on “Text Language Identification”.
16 May 2026 — Spoke at the “Low Resource, High Impact” tutorial at LREC 2026 in Palma, Mallorca, with colleagues from Toloka and TUM.
23 April 2026 — Gave a talk about Common Crawl and our research into underserved languages at Howest University of Applied Sciences in Kortrijk, Belgium.
21 April 2026 — Presented recent work on “Improved language identification for web crawl data” at the IIPC Web Archiving Conference in Brussels.