Getting Started with Computational Text Analyses
Are you interested in textual analysis but unsure about where to start? Join us for an interactive “no experience required” introduction to the fundamental concepts, processes, and methods for analyzing text computationally. Analytic techniques introduced include named entity recognition (NER), topic modeling, and sentiment analysis.
Workshop Preparation
In this workshop, we will use Google Colab, which requires a Google account. If this poses a challenge, please reach out to the Sherman Centre for alternative arrangements. All of the materials for this workshop are available at this shared Google Drive folder. If you are unable to access the Google Drive folder, the workshop materials may also be found here; these can be uploaded into Google Colab or another Jupyter Notebook instance.
Facilitator Bios
Jay Brodeur (he/him) is the Associate Director of Digital Scholarship Infrastructure & Services and the Administrative Director of the Sherman Centre for Digital Scholarship. Jay has years of experience working with data in a wide variety of formats and interdisciplinary contexts. A scientist by training with a PhD in Earth and Environmental Sciences, he’s comfortable working on and advising on all kinds of data-related activities, ranging from data wrangling and integration to analysis and mapping to research data management. Jay’s also keenly interested in the application of digital approaches to support experiential learning opportunities within and outside of the classroom.
Devon Mordell is an Educational Developer at The MacPherson Institute for Teaching and Learning. Devon draws on her experience in media art, hobbyist programming and instructional design to teach workshops for the Sherman Centre. Her areas of interest in digital scholarship include data visualization, computational analyses of texts, sonification and critical digital humanities. Her research practice explores the algorithmic culture industry and platform psychogeography.
Contents
Segment | Time Allotted | Key Topics / Activities |
---|---|---|
Introductory remarks | 20 minutes | Introduction to text preparation and analysis; Overview of concepts and methods; Key considerations for different source materials and analyses |
Named Entity Recognition | 35 minutes | Introduction to Google Colab & Jupyter Notebooks; Get the data; Introduction and hands-on exercise |
Break | 10 minutes | Break |
Sentiment Analysis | 30 minutes | Introduction and hands-on exercise; Constellate demonstration |
Topic Modeling | 30 minutes | Introduction and hands-on exercise |
Q & A; Final Thoughts | 20 minutes | Questions and wrap-up; Where to learn more |
Workshop Notebooks
All of the materials for this workshop are available in this shared Google Drive folder. Note that the shared folder includes an additional notebook, which Devon created to demonstrate performing named entity recognition on a series of documents. If you are unable to access the Google Drive folder, the workshop materials may also be found here; these can be uploaded into Google Colab or another Jupyter Notebook instance.
Workshop Recording
Workshop Slides
Links and Resources
- Constellate is a platform for learning and performing text analysis, supported by JSTOR Labs and ITHAKA.