LIN 205B - Computational Modeling of Language Structure

This graduate seminar examines how linguistic structure at various levels can be represented and learned by machines, and what the resulting computational models tell us about words, sentences and discourse. Along the way, we discuss related progress in machine understanding of language and its impact on our lives. Students will acquire the quantitative and formal perspectives necessary for computational modeling of language, and will develop an understanding of both the role of computation in linguistic study and the contribution of linguistic theory to language technology.


Kenji Sagae




Prerequisites: LIN 127 or LIN 177, or consent of the instructor




From words and sentences to dialogues and even the entire web, structure is present at every level of every human language, and it is a crucial part of how we learn, understand and generate language. Computational Modeling of Language Structure investigates how machines can represent, learn and use the structure of words, sentences, dialogues and much more, combining insights from linguistic theory, statistical learning and artificial intelligence. The course will examine two related but separate issues that are central to the intersection of language and computation:

  1. What can computational methods tell us about language, and what are some of their unique contributions to the study of language and its structure?
  2. How can machines understand and generate natural language, and what is the role of linguistics in artificially intelligent machines with natural language abilities?

The course addresses these questions by examining several aspects of how machines handle linguistic structure.


How can our knowledge of morphology, syntax, semantics and pragmatics be formalized and represented in digital machines?


What types of computation are possible and feasible in computational frameworks that represent linguistic knowledge, and how do machines operate within such frameworks to understand and generate language?

Parameterization and learning

How can linguistic representations of structure at various levels be parameterized, and how can these parameters and their values be learned or inferred automatically?


How can computational models of linguistic structure be used in the study of language and in language technologies?