Only a year ago, ChatGPT woke the world up to the power of foundation models. But this power is not about shiny, jaw-dropping demos. Foundation models will permeate every sector, every aspect of our lives, in much the same way that computing and the Internet transformed society in previous generations. Given the extent of this projected impact, we must ask not only what AI can do, but also how it is built. How is it governed? Who decides?
We don’t really know. This is because transparency in AI is on the decline. For much of the 2010s, openness was the default orientation: Researchers published papers, code, and datasets. In the last three years, transparency has waned. Very little is known publicly about the most advanced models (such as GPT-4, Gemini, and Claude): What data was used to train them? Who created this data and what were the labor practices? What values are these models aligned to? How are these models being used in practice? Without transparency, there is no accountability, and we have witnessed the problems that arise from the lack of transparency in previous generations of technologies such as social media.
To make assessments of transparency rigorous, the Center for Research on Foundation Models introduced the Foundation Model Transparency Index, which characterizes the transparency of foundation model developers. The good news is that many aspects of transparency (e.g., having proper documentation) are achievable and aligned with the incentives of companies. In 2024, maybe we can start to reverse the trend.