In this talk we describe a new algebraic approach to databases based on category theory, a branch of mathematics that has already revolutionized several areas of computer science (including functional programming) and that provides theoretical guidance missing from the relational, graph, XML, and RDF data models. In summary, we conceptualize a database schema as a category, and from this simple definition obtain a basis of operations sufficient to query data (providing an alternative to SQL), migrate data (providing an alternative to ETL tools such as Informatica), and integrate data (providing an alternative to integration tools such as Tamr).
This project originated in the MIT math department in 2010 and has culminated in a Java-based open-source data manipulation tool, AQL, available at http://categoricaldata.net/aql.html, as well as a start-up company, Categorical Informatics, which is commercializing AQL with the support of the National Institute of Standards and Technology (NIST). The project was briefly described in a NEJUG lightning talk in the spring of 2016. In this talk we expand on the technical specifics of that earlier talk and demonstrate the additional progress made toward making AQL industrial strength, including significant performance and expressivity improvements. No knowledge of category theory is required to understand the talk.
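For readers who want a concrete picture of "a schema as a category", the following is a minimal Python sketch, not AQL syntax and not taken from the AQL code base. It assumes a schema is presented as a graph (entities as nodes, foreign keys as edges), an instance assigns a set of rows to each node and a function to each edge, and a schema mapping sends each edge to a path in the target schema. Migrating data backward along such a mapping is then plain function composition. All names here (`Schema`, `Instance`, `follow`, `pull_back`, and the example tables) are hypothetical, and path equations are omitted for brevity.

```python
from dataclasses import dataclass


@dataclass
class Schema:
    nodes: set    # entity names (tables)
    edges: dict   # edge name -> (source node, target node), i.e. foreign keys


@dataclass
class Instance:
    schema: Schema
    rows: dict    # node -> set of row ids
    funcs: dict   # edge name -> {source row id: target row id}

    def check(self):
        # Every edge must be a total function from source rows to target rows.
        for name, (src, tgt) in self.schema.edges.items():
            f = self.funcs[name]
            assert set(f) == self.rows[src]
            assert set(f.values()) <= self.rows[tgt]


def follow(inst, path, row):
    # Compose the functions along a path of edges, starting from one row.
    for edge in path:
        row = inst.funcs[edge][row]
    return row


def pull_back(src_schema, node_map, edge_map, inst):
    # Migrate an instance on the target schema back to src_schema.
    # node_map: source node -> target node
    # edge_map: source edge -> path (list of edge names) in the target schema
    rows = {n: set(inst.rows[node_map[n]]) for n in src_schema.nodes}
    funcs = {
        e: {r: follow(inst, edge_map[e], r) for r in rows[s]}
        for e, (s, _t) in src_schema.edges.items()
    }
    return Instance(src_schema, rows, funcs)


# Target schema: Employee -> Department -> City
big = Schema(
    nodes={"Employee", "Department", "City"},
    edges={
        "worksIn": ("Employee", "Department"),
        "locatedIn": ("Department", "City"),
    },
)
data = Instance(
    big,
    rows={
        "Employee": {"e1", "e2"},
        "Department": {"d1", "d2"},
        "City": {"boston", "cambridge"},
    },
    funcs={
        "worksIn": {"e1": "d1", "e2": "d2"},
        "locatedIn": {"d1": "boston", "d2": "cambridge"},
    },
)
data.check()

# Source schema: Employee -> City, where "city" means worksIn then locatedIn.
small = Schema(nodes={"Employee", "City"}, edges={"city": ("Employee", "City")})
migrated = pull_back(
    small,
    node_map={"Employee": "Employee", "City": "City"},
    edge_map={"city": ["worksIn", "locatedIn"]},
    inst=data,
)
migrated.check()
```

In this sketch, `migrated.funcs["city"]` sends each employee directly to a city, which is the two-step join expressed as composition of foreign keys rather than as a SQL query.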