Bavarian Archive for Speech Signals of the Ludwig-Maximilians-Universität München (LMU)
The Bavarian Archive for Speech Signals (BAS) was founded in 1995 and is currently located at the Institute of Phonetics and Speech Processing. Its central task is to make digital speech resources, and the tools for processing them, available to both the research community and the speech technology community. Since May 2013, BAS has been an officially licensed CLARIN B Centre. The Leibniz Computing Center Munich supports BAS by providing large-scale mass storage and the corresponding network support.
Oral-History.Digital will further develop and reuse two tools built at BAS: OCTRA for manual transcription and WebMAUS for the automatic alignment of existing transcripts. BAS will also contribute its long-standing expertise with language corpora in order to model the indexing processes interoperably and to support them with language technology. This includes, in particular, the automated creation of standardized metadata (DC, OLAC, CMDI) and the management of PIDs (Handle System). Finally, as part of its permanent tasks as a CLARIN B Centre, BAS ensures the long-term archiving of the audiovisual files and the automated harvesting of their metadata by cross-domain catalogues. As a result, the metadata of archived interviews is automatically distributed to scholarly search engines (e.g. Virtual Language Observatory) and indices of research data (e.g. Reuters Data Citation Index), which increases their visibility within the scientific community.
Within the project, the Bavarian Archive for Speech Signals at the University of Munich ensures long-term archiving and supports the indexing of interviews in the areas of speech recognition, alignment and anonymization.
Team:
Contact:
Ludwig-Maximilians-Universität München
Institute of Phonetics and Speech Processing (IPS)
Schellingstr. 3/II (VG)
80799 München