You understand language. Why can’t computers? The Berkeley Natural Language Processing Group is trying to do something about it. We use a mix of computer science, linguistics, and statistics to create systems that can cope with the richness and subtlety of human language.
We are part of the UC Berkeley Computer Science division. You can read more about our people and our research. Broadly, we work in the following areas:
- Linguistic analysis: modeling the syntactic and semantic structures of text. Our work in this area includes syntactic parsing, semantic analysis, and coreference. We focus on structured probabilistic models, including unsupervised and latent-variable methods. Check out some highlights: our parser, neural parser, and coreference system!
- Machine translation: translating text from one language into another. Our work in MT focuses on richly structured models that operate at a deeper syntactic level rather than a surface phrase level. We also design new algorithms for efficient alignment and decoding. Check out our word alignment and language modeling toolkits.
- Computational linguistics: using computer science to study language. Our computational linguistics projects include automated reconstruction of ancient languages and decipherment of historical documents. Check out some press!
- Grounded semantics: modeling meaning. We use compositional models to produce interpretations of text grounded in the real world, linking spatial references to geometry, and dialogs to agent plans and goals.
- Unsupervised learning: detecting and inducing hidden structure. Humans learn language without supervision, and we’ve demonstrated that a variety of linguistic phenomena, including grammar, coreference, word classes, and translation lexicons, can be effectively learned in an unsupervised fashion.
- Beyond language: many other topics that excite us, from computational music to AI agent design. Check out the Berkeley Overmind!