To register, visit https://www.eventbrite.com/e/dream-lab-2024-text-analysis-tickets-789320056537
Course Description:
This class will examine methods and practices for text analysis. Freely available tools and tutorials have made it easier to apply computational text analysis techniques, but researchers may still find themselves struggling to build a corpus, decide between methods, and interpret results. We will survey the hows and whys of a variety of commonly used approaches, including word frequencies, the distributional hypothesis, and natural language processing techniques.
Students who take this course will be able to:
- Find and prepare texts for analysis, and store, access, and document their text objects and data.
- Discuss their corpus-building decisions and textual data in ways that are methodologically and disciplinarily sound.
- Identify appropriate text analysis methods for a given question.
- Engage in text analysis methods that use word frequency, word location, and natural language processing.
- Articulate statistical, computational, and linguistic principles — and how they intersect with humanistic approaches to texts — for a few text analysis methods.
We will use a mixture of free tools and scripts, primarily in Python. The course is designed to work for people with zero Python experience, so you do not need any background in the language. That said, more advanced programmers will still benefit from learning the course’s specific text analysis methods. This course will therefore be appropriate for people at all levels of technical expertise. Students should have administrative rights to install software on their laptops.
Instructor:
J.D. Porter is a DH Project Specialist at the Price Lab. He received his PhD in English from Stanford University in 2017. He specializes in text mining, American literature from modernism to today, and literature and philosophy. His work has appeared in (or is slated to appear in) Synthese, Cultural Analytics, The Atlantic, PMLA, and Ralph Ellison in Context, among other places.