Columbia Technology Ventures

Easily searchable database of state diplomacy documents for researchers

This technology is a publicly available, user-friendly online platform that makes it easier to collect, search, read, and download government documents.

Unmet Need: Easy method to curate data on diplomacy

International relations is the study of states’ interactions with other states, and it relies heavily on data on diplomacy in the form of government records and documents. However, this data has historically been difficult to catalogue and analyze in the US because government files are often declassified haphazardly and in large volumes. There is therefore a need for a database platform that supports international relations research by making possible both aggregate analysis and inspection of individual historical documents.

The Technology: Freedom of Information Archive: Machine-learning powered diplomacy file database

This technology is a large-scale collection of machine-readable, intra-state communication documents (both public and classified) focused on aspects of US foreign relations beginning in 1973. Documents are automatically tagged with metadata to facilitate search by topic, countries involved, classification, and other keywords. The database also uses named entity recognition to detect and extract the names of persons, places, and organizations in each document to assist with data retrieval and analysis.
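As a rough illustration of how extracted entity names can support retrieval, the Python sketch below tags each document with the names it contains and builds a lookup from entity to document IDs. It is not the archive's implementation: the `GAZETTEER` word list, the `Document` class, the function names, and the sample sentences are all made up for this example, and a simple dictionary match stands in for a trained named entity recognition model.

```python
import re
from collections import defaultdict
from dataclasses import dataclass, field

# Hypothetical name lists; a production system would use a trained NER model.
GAZETTEER = {
    "PERSON": ["Jane Doe", "John Roe"],
    "PLACE": ["Freedonia", "Sylvania"],
    "ORG": ["Ministry of Trade"],
}


@dataclass
class Document:
    doc_id: str
    text: str
    entities: dict = field(default_factory=dict)  # label -> list of names


def extract_entities(text):
    """Return the gazetteer names that occur in the text, grouped by label."""
    found = {}
    for label, names in GAZETTEER.items():
        hits = [n for n in names
                if re.search(r"\b" + re.escape(n) + r"\b", text)]
        if hits:
            found[label] = hits
    return found


def build_index(docs):
    """Tag each document and map every (label, name) pair to document IDs."""
    index = defaultdict(set)
    for doc in docs:
        doc.entities = extract_entities(doc.text)
        for label, names in doc.entities.items():
            for name in names:
                index[(label, name)].add(doc.doc_id)
    return index


# Invented sample documents, for illustration only.
docs = [
    Document("doc-1", "Jane Doe met John Roe in Freedonia."),
    Document("doc-2", "The Ministry of Trade wrote to Jane Doe about Sylvania."),
]
index = build_index(docs)

# Look up every document that mentions a given person.
print(sorted(index[("PERSON", "Jane Doe")]))
```

Because each document keeps its extracted entities as metadata, the same index can serve both individual lookups and simple aggregate counts, such as the number of documents mentioning each place.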

Applications:

  • Research tool for American foreign policy
  • Research aid for studies of economic activities
  • Support for analysis and forecasting of governmental policies

Advantages:

  • Enhanced document search capabilities that surpass those of current state file repositories
  • Added metadata makes aggregate analysis possible across larger collections of files
  • Streamlined modeling and predictive analysis

Lead Inventor:

Matthew Connelly, Ph.D.

Related Publications:

Tech Ventures Reference: