Enroll Course: https://www.coursera.org/learn/digital-humanities
The field of Digital Humanities is expanding rapidly, and anyone working in this interdisciplinary area benefits from understanding the role of language technology. Recently, I had the opportunity to explore the Coursera course “Sprachtechnologie in den Digital Humanities” (Language Technology in Digital Humanities), offered by the University of Zurich. The course is currently paused, with no new enrollments accepted after May 20, 2019, but its content remains accessible via YouTube and SwitchTube, so it is still worth studying for those interested.
The course is structured into six modules, each offering a comprehensive look into different facets of language technology within the humanities.
**Week 1: Paths into the Digital World** kicks off with the fundamentals of text digitization, representation using XML, and the practical implications of Optical Character Recognition (OCR). It also touches upon corpus creation and its inherent challenges, setting a solid foundation.
**Week 2: Structured and Sustainable Representation of Corpus Data** delves deeper into XML and important text representation standards. It also covers automatic text and word segmentation, crucial techniques for processing linguistic data.
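To give a feel for what segmentation plus XML representation can look like in practice, here is a minimal sketch in Python. It is not taken from the course: the regex-based splitter is deliberately naive, and the element names `text`, `s`, and `w` are my own assumption, loosely inspired by common text encoding conventions.

```python
import re
import xml.etree.ElementTree as ET


def segment(text):
    """Split text into sentences, then each sentence into tokens.

    Naive assumption: a sentence ends at '.', '!' or '?' followed by
    whitespace. Real segmenters must handle abbreviations, dates, etc.
    """
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    # A token is either a run of word characters or a single punctuation mark.
    return [re.findall(r"\w+|[^\w\s]", s) for s in sentences if s]


def to_xml(text):
    """Wrap sentences in <s> and tokens in <w> (hypothetical element names)."""
    root = ET.Element("text")
    for i, tokens in enumerate(segment(text), start=1):
        s = ET.SubElement(root, "s", n=str(i))
        for token in tokens:
            w = ET.SubElement(s, "w")
            w.text = token
    return root


root = to_xml("Zurich is a city. It lies on a lake!")
xml_string = ET.tostring(root, encoding="unicode")
print(xml_string)
```

Keeping the segmentation explicit in the markup means later steps (counting, annotation) can work on `<w>` elements instead of re-tokenizing raw text.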
**Week 3: Properties of Corpora and Basic Analysis Methods** introduces key corpus properties and fundamental analytical methods in corpus linguistics. Concepts like word frequencies, collocations, and n-grams are explained, along with visual representations of text properties.
**Week 4: Automatic Corpus Annotation with Computational Linguistics Tools** focuses on annotating corpora with linguistic information such as part-of-speech tags and lemmas. It addresses the complexities of automatic annotation and explores Named Entity Recognition and automatic syntactic analysis.
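Word frequencies and n-grams are easy to try out yourself. The following is a small sketch using only the Python standard library; the tokenizer and the sample sentence are my own illustrative assumptions, not material from the course.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and extract word tokens (a deliberately naive tokenizer)."""
    return re.findall(r"\w+", text.lower())


def ngrams(tokens, n):
    """Return all contiguous sequences of n tokens as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


sample = "the cat sat on the mat and the cat slept"
tokens = tokenize(sample)

word_freq = Counter(tokens)               # unigram (word) frequencies
bigram_freq = Counter(ngrams(tokens, 2))  # frequencies of adjacent word pairs

print(word_freq.most_common(3))
print(bigram_freq.most_common(1))
```

Raw bigram counts like these are only a starting point for collocation analysis, which usually adds an association measure to separate meaningful pairs from merely frequent ones.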
**Week 5: Manual Annotation and Evaluation of Corpus Data** discusses efficient annotation strategies, the synergy between manual and automatic annotation using machine learning, and methods for ensuring annotation quality and evaluation. Crowdsourcing for data collection and correction is also a key topic.
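A common way to evaluate automatic annotation against a manually created gold standard is to compute precision, recall, and F1. I am assuming these standard measures here as an illustration; the sketch below, with its made-up entity spans in the form `(start, end, label)`, is not the course's own material.

```python
def precision_recall_f1(gold, predicted):
    """Compare predicted annotations with gold annotations.

    Both arguments are collections of hashable items; an item counts
    as correct only if it matches a gold item exactly.
    """
    gold, predicted = set(gold), set(predicted)
    true_positives = len(gold & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical named-entity annotations: (start token, end token, label).
gold = {(0, 1, "PER"), (5, 6, "LOC"), (9, 10, "ORG")}
predicted = {(0, 1, "PER"), (5, 6, "ORG"), (9, 10, "ORG"), (12, 13, "LOC")}

precision, recall, f1 = precision_recall_f1(gold, predicted)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

Exact matching is strict: the span `(5, 6)` was found but given the wrong label, so it counts against both precision and recall.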
**Week 6: Challenges of Multilingual Text Analysis** concludes the course by exploring multilingual and parallel corpora. It covers automatic language identification and the alignment of sentences and words across different languages.
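To illustrate the idea behind automatic language identification, here is a toy sketch that guesses the language by counting function words. The tiny word lists and the scoring rule are my own assumptions for demonstration; practical systems typically rely on character n-gram statistics and far more data.

```python
import re

# Tiny, hand-picked function-word lists (illustrative assumption only).
STOPWORDS = {
    "en": {"the", "is", "on", "and", "of", "a", "to", "in"},
    "de": {"der", "die", "das", "ist", "und", "auf", "ein", "nicht"},
}


def identify_language(text):
    """Return the language code whose function words occur most often.

    Returns None if no known function word appears in the text.
    """
    tokens = re.findall(r"\w+", text.lower())
    scores = {
        lang: sum(1 for token in tokens if token in words)
        for lang, words in STOPWORDS.items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None


print(identify_language("The cat is on the mat"))
print(identify_language("Die Katze ist auf der Matte"))
```

In a multilingual corpus, a step like this would typically run per sentence or paragraph before any language-specific tools are applied.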
**Overall Recommendation:**
“Sprachtechnologie in den Digital Humanities” is an excellent resource for students, researchers, and anyone interested in applying computational methods to linguistic and textual data. The syllabus covers a wide range of essential topics, from digitization and corpus building to advanced annotation and multilingual analysis. Although new enrollments are paused, the availability of video materials makes this course a highly recommended self-study option for gaining a strong understanding of language technology in the Digital Humanities.
**Where to Access:**
While new enrollments are closed on Coursera, you can still access the course materials through:
* YouTube: https://www.youtube.com/channel/UChb3Rd5vo3WEgMSy99VInaw
* SwitchTube (Uni Zurich): https://tube.switch.ch/channels/bb3adc02