During the DHASA 2023 conference, we aim to have a number of tutorials (training activities). At the moment, the following tutorials are planned. More may be added later.
Digitisation and Preserving of Textual Data
This tutorial is designed to cater to librarians, archivists, researchers, educators, and anyone interested in the digitisation and preservation of textual materials. The objectives and topics of the tutorial are as follows:
- Introduction to Digitisation: We will start by exploring the fundamentals of digitisation, its significance in preserving historical records, and the benefits of creating digital archives. Participants will learn how digitisation can improve accessibility, searchability, and the long-term preservation of textual data.
- Digitisation Techniques: Delving into practical aspects, this segment will cover various digitisation methods, including scanning and OCR (Optical Character Recognition). Participants will gain hands-on experience in digitising different types of textual materials, ensuring they are ready for archiving.
- Digital Preservation Practices: Building on the digitisation process, we will discuss practices for digital preservation and (meta)data management. Understanding these practices will ensure that digitised materials are preserved in a format that remains accessible for future generations.
- Challenges and Solutions: While digitisation offers numerous advantages, it also comes with challenges, such as data quality, copyright issues, and data security. We will address these challenges and explore effective solutions to mitigate potential risks.
- Practical Implementation: The tutorial will conclude with a practical session where participants will work in groups to create a small-scale digitisation and archiving project. This hands-on activity will reinforce the concepts learned throughout the tutorial and provide a glimpse into real-world challenges and solutions.
Benito Trollip is a Digital Humanities researcher at the South African Centre for Digital Language Resources (SADiLaR). He recently completed his PhD, which focuses on what he calls morphological evaluative constructions in Afrikaans. Besides his background in Afrikaans linguistics, he is also passionate about open access, data management and data curation. His interests therefore lie in Afrikaans linguistics, aspects surrounding data creation and management, as well as Digital Humanities more broadly.
Rooweither Mabuya is a Digital Humanities researcher with a focus on isiZulu at the South African Centre for Digital Language Resources (SADiLaR). Her research interests lie in the systematic creation of relevant digital text, speech, and multi-modal resources related to the development of isiZulu, and in promoting the use of Digital Humanities methods and tools within the isiZulu research community. Her areas of expertise are General Linguistics, Corpus Linguistics, and Digital Humanities.
Boost your skills to search (and more) in text
In this tutorial, we will look at some fundamental skills that will get you started on your journey with text mining. To kick off, we will learn how to tell the computer what to search for. We will start with simple search operations and explore their limitations, then move on to more complex search operations. We will also introduce the first data wrangling steps, for example converting text data into other formats for further processing. The tutorial is accessible to humanities and social sciences students and researchers with no prior exposure to programming. We will not cover any advanced text mining strategies or tools. The skills learned will also be applicable in other aspects of research, e.g. literature reviews.
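To give a flavour of the kind of search operations described above, here is a minimal sketch in Python. The choice of Python, the sample sentence, the search pattern, and the helper name `matches_to_csv` are all assumptions made for illustration; the tutorial description does not state which tools or examples will be used.

```python
import csv
import io
import re

# Hypothetical sample text, invented for this illustration.
text = (
    "The archive was digitised in 2019. "
    "Digitization of the letters followed in 2021, "
    "and the digitised maps were released in 2023."
)

# Simple search: count exact, case-sensitive occurrences of one word.
simple_hits = text.count("digitised")

# Limitation: the simple search misses the capitalised word and the "z" spelling.
# A regular expression can describe several variants in a single pattern.
pattern = re.compile(r"digiti[sz](?:ed|ation)", re.IGNORECASE)
regex_hits = pattern.findall(text)


def matches_to_csv(source_text, compiled_pattern):
    """Convert every match into a CSV row: the matched word and its position."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["match", "start"])
    for match in compiled_pattern.finditer(source_text):
        writer.writerow([match.group(0), match.start()])
    return buffer.getvalue()


print(simple_hits)
print(regex_hits)
print(matches_to_csv(text, pattern))
```

The last step is a small example of data wrangling: the search results are converted into a table that could be opened in a spreadsheet for further processing.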
As a professor in Digital Humanities, Menno is particularly interested in incorporating computational techniques in the field of Humanities. His PhD in computer science dealt with building systems that learn (linguistic) grammars from plain sequences (sentences). These empirical grammatical inference systems produce patterns that can be used for further analysis of the data, for instance in applied machine learning, computational linguistics, or computational musicology. During his MA (computational linguistics) and MSc (computer science) studies, Menno took techniques from one field and applied them to problems in the other, such as proofing tools and error correction, machine translation, and multi-modal information retrieval.