To request an academic license to download and use this software, log in (or create an account if you do not have one), return to this page, and click Express Licensing. For a commercial license, please contact techventures@columbia.edu.
This technology is a linguistic database of Arabic functional gender, functional number, and rationality, features that are important for modeling Arabic morphosyntactic agreement. It also includes a tool for annotating the Linguistic Data Consortium (LDC) Arabic treebanks with this morphosyntactic information. Arabic has complex agreement patterns and irregular morphology, yet current LDC Arabic treebanks represent nominal gender and number by shallow (non-functional) forms and do not include nominal rationality. The database and annotation tool can improve computational modeling of Arabic for natural language processing and linguistics research applications.
The annotation tool requires that researchers obtain Arabic corpora from the LDC.
Nizar Habash, Ph.D., Sarah Alkuhlani
Tech Ventures Reference: IR CU14137