High-quality audio/speech data to train your conversational AI models

Off-the-shelf Audio/Speech Datasets in multiple languages to jump-start your speech recognition models

Audio types: Call Center Conversations (8 kHz), Generic Conversations (8 kHz), Media & Podcasts (16 kHz), Utterance/Scripted Monologue (16 kHz).

| Language | Off-the-shelf Dataset | Hours per audio type (in the order listed above, empty types omitted) | Volume in Hours | Dialects covered |
| --- | --- | --- | --- | --- |
| Afrikaans | Afrikaans Audio Dataset | 600, 900 | 1500 | Afrikaans spoken in Africa |
| Arabic | Arabic Audio Dataset | 800, 1500 | 2300 | Arabic from Gulf countries |
| Chinese | Chinese Audio Dataset | 2000 | 2000 | Chinese from China |
| Danish | Danish Audio Dataset | 400, 600, 2000 | 3000 | Danish from Denmark |
| Dutch | Dutch Audio Dataset | 2000 | 2000 | Dutch from the Netherlands |
| English - AAVE Accent | English - AAVE (African American Vernacular English) Audio Dataset | 500, 500 | 1000 | Covers both the vernacular variety (sometimes known as AAVE, typically spoken by the vast majority of working- and middle-class African Americans) and the more standard variety (typically spoken by middle-class African Americans in formal and public situations), with a stronger emphasis on the vernacular. |
| English - Boston/New York Accent | English - Boston/New York Audio Dataset | 225, 225, 350 | 800 | A collection of several regional accents spoken in and around the cities of Boston, New York, and Philadelphia. These accents may sound similar to non-locals but are distinct from other American accents. Despite some local vocabulary that differs from other parts of the English-speaking world, they are mutually intelligible with English spoken elsewhere. |
| English - Chinese Accent | English - Chinese Accented Audio Dataset | 150, 300 | 450 | Speakers whose first language is Chinese, who moved or immigrated to the United States as teenagers or adults and learned English as a second language. |
| English - Deep South Accent | English - Deep South Audio Dataset | 275, 275, 450 | 1000 | Speakers from (i) Texas; (ii) North Carolina, South Carolina, Georgia; (iii) New Orleans; (iv) Florida panhandle; (v) Tennessee, Arkansas, Michigan. |
| English - Hispanic Accent | English - Hispanic Accented Audio Dataset | 400, 400 | 800 | Hispanic English refers to the varieties of US English spoken by Hispanic Americans of diverse national heritage. The main focus was on Mexican Americans, along with speakers of different national origins (e.g. Mexico, Puerto Rico, Dominican Republic, Ecuador, Cuba, etc.) and from different regions (e.g. California, New York, Florida). Speakers include those who speak Spanish as a first language as well as speakers of Hispanic origin who speak Spanish as a heritage language. |