Neat article where someone takes a corpus of recordings of people speaking a nearly dead language and feeds it to a deep learning system to see whether the machine will generate new speech that resembles that language (somewhat similar to this, but with audio instead of text: https://github.com/robbiebarrat/rapping-neural-network ).

Projects like this could help future learners of the language practice proper pronunciation, especially for sounds that are uncommon in other languages (think of the "click" consonants in Xhosa and related languages). Replaying the same recordings endlessly can lead to rote memorization instead of proper practice, so a machine learning system that can generate novel combinations could be quite useful.

Would be awesome to use LifeTold to help people curate recording sets like these in the future.

Datasets for Dead Languages

Matthew Alhonte