I am an assistant professor of computational linguistics at Cornell University. I am very interested in the incremental representations that humans use to process language, and in differences between how language is used and how it is processed. To explore these topics, I study the relationships between computational language models and psycholinguistic data (e.g., reading times), and I study neural network representations of language to understand which aspects of language can be learned directly from language statistics, without experiences in the real world (i.e., through ungrounded learning).
If you’re interested in incremental processing models, you may find these helpful:
- LSTM toolkit that can estimate incremental processing difficulty
- 125 pre-trained English LSTMs
- Left-corner parsing toolkit that can estimate incremental processing difficulty
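The "incremental processing difficulty" these toolkits estimate is typically operationalized as surprisal: the negative log probability of each word given its preceding context. As a minimal sketch (not the toolkits' actual code), given per-word conditional probabilities from any language model, the conversion to surprisal in bits is just:

```python
import math

def surprisal(probabilities):
    """Convert per-word conditional probabilities P(word | context)
    into surprisal values in bits: -log2 P(word | context).
    Under surprisal theory, less predictable words carry more bits
    and predict greater incremental processing difficulty."""
    return [-math.log2(p) for p in probabilities]

# Hypothetical probabilities a language model might assign to each
# word of a garden-path sentence (illustrative values only).
probs = [0.5, 0.01, 0.02, 0.1, 0.4, 0.05, 0.001]
bits = surprisal(probs)
print([round(b, 2) for b in bits])  # low-probability words -> high surprisal
```

In practice the probabilities would come from an incremental model such as an LSTM or a left-corner parser; the surprisal values can then be regressed against reading-time data.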
I manage the Computational Psycholinguistics Discussions research group (C.Psyd) and am part of the Cornell Computational Linguistics Lab (CLab) and the Cornell Natural Language Processing Group (Cornell NLP).
My surname is easy to pronounce (in words, not IPA): /van ‘shine-dull/
Recent News
Jan 6: 2 posters at LSA:
- Kihyo Park shows that animacy drives processing of Korean double nominatives, rather than alienability as previously claimed.
- Kaelyn Lamp shows that causative constructions in hate speech are characterized more by how they minimize blame towards majority groups than by how they increase blame towards minority groups.
Dec 9: 1 Findings of EMNLP paper published: Fangcong Yin analyzes the compression process humans use to generate summaries and the compression functions preferred by human readers.
Apr 3: Gave an invited talk at University of Florida: Surprising Linkages
Mar 9: 5 HSP Posters:
- Deb Bhattacharya showed that code-switching behavior is influenced by pressures in both languages rather than primarily by pressure from the matrix language.
- Deb Bhattacharya showed that the influence of surprisal on code-switching is driven by a desire to signal for increased listener attention rather than a speaker-driven desire to select a more predictable continuation.
- John Starr showed that semantic priming effects can be inhibited by phonological load.
- John Starr showed that the timing of phonological and syntactic processing changes based on their relative loads, suggesting that the processor optimizes the input streams differently based on load distribution.
- Jacob Matthews analyzed how neural language models process unknown words, relative to how humans process nonce words.
Jan 20: Gave an invited talk at Duesseldorf University: Surprising Mechanisms