Ongoing projects
Automatic speech analysis (Jiwon Yun)
A crucial step in developing a theoretical framework of sentence melody is analyzing large amounts of spoken data from typologically diverse languages. Manual methods of speech analysis, however, are labor-intensive and often inconsistent. Jiwon Yun works on developing computational methods to analyze large speech datasets efficiently.
Computational complexity of natural language (Thomas Graf, Jeffrey Heinz)
Languages are highly complex and fine-grained systems, which raises the question of how the human mind can make use of language so effortlessly. What kinds of computational mechanisms underpin and drive language? Thomas Graf and Jeffrey Heinz research the interplay between the representations, rules, and constraints that underlie natural language systems. Heinz focuses primarily on how sounds pattern within words and phrases in spoken languages, whereas Graf studies the computational challenges of assembling words into well-formed sentences.
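One way to make this division of labor concrete is the amount of computational memory different patterns demand. The sketch below is a pedagogical toy, not any researcher's actual formalism: it contrasts a sound pattern checkable with a fixed-width scanning window against nested dependencies (illustrated here with a hypothetical bracket language), which require unbounded memory such as a counter or stack.

```python
def window_check(s, banned):
    """Scan with a fixed two-symbol window: a finite-state-style check
    sufficient for many word-internal sound patterns."""
    return not any(s[i:i + 2] in banned for i in range(len(s) - 1))

def balanced(s):
    """Check properly nested brackets. This needs an unbounded counter,
    i.e., more computational power than any fixed scanning window."""
    depth = 0
    for ch in s:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:          # closing bracket with no open partner
                return False
    return depth == 0              # every opener must be closed
```

No fixed window can verify balancedness, since a dependency between a bracket and its partner can stretch arbitrarily far; this is the kind of complexity gap the subregular and formal-language perspective makes precise.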
Computational phonology (Michael Becker, Jeffrey Heinz)
Phonology is the study of the pronunciations of morphemes, words, and phrases in natural languages. These pronunciations exhibit systematic regularities, which vary from one language to another. There are two scientific mysteries regarding these regularities. One is that they are learned but not taught: no one is explicitly taught the “sound pattern” or “accent” of their native language. The second is the search for universal principles that simultaneously recognize and constrain the impressive diversity of phonological patterns in the world’s languages. Michael Becker and Jeffrey Heinz’s research sheds light on both of these mysteries. Becker combines corpus methods, data analysis, statistical inference, and experimental studies, while Heinz examines the computational nature of phonological patterns, drawing on automata theory, formal languages, grammatical inference, and mathematical logic.
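To give a flavor of the computational perspective, here is a minimal sketch of a phonological constraint modeled as a strictly 2-local (bigram) grammar, one of the restrictive formal classes studied in this literature. The particular banned cluster (`"np"`, suggestive of nasal place assimilation) and the example words are hypothetical illustrations, not an analysis of any specific language.

```python
def is_well_formed(word, forbidden_bigrams):
    """A word is well-formed iff it contains none of the forbidden
    two-symbol substrings, with '#' marking word boundaries."""
    padded = "#" + word + "#"
    bigrams = {padded[i:i + 2] for i in range(len(padded) - 1)}
    return bigrams.isdisjoint(forbidden_bigrams)

constraint = {"np"}                 # hypothetical ban on an n+p cluster
print(is_well_formed("lamp", constraint))   # True: no banned bigram
print(is_well_formed("lanp", constraint))   # False: contains "np"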
Computational syntax (Thomas Graf)
Syntax refers to those parts of language that are related to sentence structure. Sentence structure is one of the hardest problems in natural language processing. Models that build fine-grained structures with high accuracy are too slow for most tasks, whereas existing quick-and-dirty models are only suitable for very simple problems. Thomas Graf is trying to design new models that combine computational simplicity with linguistic accuracy. He approaches this issue by learning directly from nature: since humans handle syntax very well, the best computer models will be those that operate the way humans do. To achieve this goal, Graf draws heavily on theoretical linguistics, typology, and formal language theory.
Grammatical inference (Jeffrey Heinz)
Grammatical inference is a branch of computer science that studies the problem of learning grammars from example words and sentences. Its roots lie in theoretical computer science, computational learning theory, and machine learning. Jeffrey Heinz works on developing grammatical inference algorithms that incorporate the computational nature of natural language.
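As a minimal sketch of the kind of problem grammatical inference studies, the toy learner below induces a strictly 2-local grammar from positive example words: it simply records the attested boundary-padded bigrams and then accepts exactly the words built from them. The alphabet and sample here are invented for illustration; real algorithms in this area handle far richer grammar classes.

```python
def learn_sl2(sample):
    """Collect all boundary-padded bigrams attested in the sample.
    That bigram set *is* the strictly 2-local grammar."""
    grammar = set()
    for word in sample:
        padded = "#" + word + "#"
        grammar |= {padded[i:i + 2] for i in range(len(padded) - 1)}
    return grammar

def generates(grammar, word):
    """A word is accepted iff every one of its bigrams was attested."""
    padded = "#" + word + "#"
    return all(padded[i:i + 2] in grammar for i in range(len(padded) - 1))

g = learn_sl2(["aba", "ab", "ba"])
print(generates(g, "abab"))   # True: every bigram occurred in the sample
print(generates(g, "aa"))     # False: the bigram "aa" was never observed
```

The learner generalizes beyond its input (it accepts the unseen word "abab"), which illustrates the central question of the field: which grammar classes can be identified from examples alone, and by what algorithms.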
Modeling language prosody (Jiwon Yun)
Sentence melody, known as prosody, plays an essential role in conveying information correctly and efficiently. However, prosody is often considered difficult to analyze objectively because of its seemingly vague nature. Jiwon Yun treats prosody as a formal property of human language and works to establish a scientific model that systematically describes it.
Modeling syntactic processing (Thomas Graf)
One central strength of computational approaches is that they make it possible to run simulations to better understand the behavior of very complex systems. Graf uses such simulations to probe how humans process sentences. A sentence is not just a sequence of words: underneath the surface lies a very complex hidden structure, a “sentence molecule”. Some sentence molecules are easy for humans to build, while others are difficult, and the difficulty seems to be closely related to the shape of the molecules. Using insights from theoretical linguistics to formalize the notion of sentence molecules, Graf investigates which structural properties make some molecules harder for humans. This sheds new light on how language is computed by humans, which in turn will allow computers to evaluate how easy sentences are to read and comprehend.