Michael Collins (computational linguist)

Michael J. Collins
Born: 4 March 1970 (age 52), London, United Kingdom
Citizenship: UK
Alma mater: Cambridge University; University of Pennsylvania
Known for: Statistical parsing, Structured perceptron
Scientific career
Fields: Computational linguistics, Machine learning
Institutions: Columbia University
Doctoral advisor: Mitch Marcus

Michael J. Collins (born 4 March 1970) is a researcher in the field of computational linguistics. He is the Vikram S. Pandit Professor of Computer Science at Columbia University. [1]

His research interests are in natural language processing and machine learning, and he has made important contributions to statistical parsing and statistical machine learning. His work covers a wide range of topics, including parse re-ranking, tree kernels, semi-supervised learning, machine translation, and exponentiated gradient algorithms, with a general focus on discriminative models and structured prediction. One notable contribution is a statistical parser that achieved state-of-the-art results on the Wall Street Journal portion of the Penn Treebank. As of 11 November 2015, his works had been cited 16,020 times, and he had an h-index of 47. [2]

Collins worked as a researcher at AT&T Labs between January 1999 and November 2002, and later held the positions of assistant and associate professor at M.I.T. Since January 2011, he has been a professor at Columbia University. [3] In 2011, he was named a fellow of the Association for Computational Linguistics. [4]

References

  1. "Michael Collins". www.cs.columbia.edu. Retrieved 6 June 2022.
  2. "Michael Collins - Google Scholar Citations". Google Scholar. Retrieved 11 November 2015.
  3. Collins, Michael. "Collins's Columbia website".
  4. "ACL Fellows". ACL Wiki. Retrieved 15 August 2017.