The Data Language

April 19, 2023

Author(s):

Do you know how to translate “data stewards” into French? Even if you are a native French speaker, you probably don’t, and that is normal because nobody does. There are many different translations of this term (intendant des données, fiduciaire des données, administrateur des données), and none of them is widely used. The story is much the same in Spanish: native speakers face competing translations that do not even seem to make cultural sense in Spanish-speaking countries (administradores de datos, fideicomisos de datos, custodios de datos). In fact, even non-English-speaking practitioners now tend to use English terms when speaking about data. You have surely been in a room where borrowed English terms like “data management” or “data governance” were thrown around, even though the discussion was in an entirely different language. But is this the status quo we want to perpetuate?

Today, practitioners mostly use English terms in discussions around data use, data governance, and data literacy. This is logical, as English dominates international relations and English-speaking companies, organizations, and practitioners have exercised a very strong influence over the development of the data economy. Consequently, this field of knowledge has expanded based on a Global North, if not an Americanized, view of the world. One result is that responses to the challenges faced, such as the prevalence of large platforms hoarding data in support of their business models, have focused on the individual and on the potential creation of new individual rights and compensation.

As Daniel Immerwahr (2019) discusses in How to Hide an Empire: A Short History of the Greater United States, “[l]anguages shape thought, making some ideas more readily thinkable and others less so. At the same time, they shape societies. Which languages you speak affects which communities you join, which books you read, and which places you feel at home. That a single language has become the dominant tongue on the planet, spoken to a degree by nearly all educated and powerful people, is thus an occurrence of profound consequence.”

Thus, the prevalence of English in data-related debates brings a number of challenges, especially in the development context and for societies where the collective and the community are core structural units. For instance, concepts such as Data Stewardship, Data Brokers, Data Literacy, or Data Trusts can be difficult to translate into other (local, regional, and national) languages, not only from a semantic perspective but also because they carry cultural assumptions, legal traditions, and economic and development priorities, among other issues. Importing English terms into the way non-English-speaking communities speak about data can also put the spotlight on exogenous solutions to data challenges, to the detriment of a more localized debate and more tailored approaches.

What is the problem/status quo?

The use of foreign terms is not uncommon per se and happens in many different contexts. When speaking about data, it is quite understandable, given the dominance of English in international relations and the strong influence that English-speaking companies, organizations, and practitioners have exercised over the development of the data economy.

However, this is not without consequences. As Paulo Freire said, “language is never neutral”: it embodies and reflects particular power relationships and sociopolitical realities.

English is the language of tech nowadays, but the use of English terms can a) embed a number of cultural assumptions in discussions around data (for instance, when we use terms that are rooted in the common law tradition, such as data trusts) and b) alienate people and communities who do not understand English and in whose languages no good translation of key English terms (e.g., Data Stewardship) exists. Ultimately, confining the discussion around data to English can also lead to overemphasizing certain solutions (for instance, the establishment of Data Trusts) to common challenges (e.g., how to collectively manage data rights), to the detriment of a more localized debate and more tailored approaches.

What are the solutions?

First of all, we need to acknowledge that we have a problem with the language we use when speaking about data. We need to recognize that this language is already alienating enough in English and that many key terms that data practitioners use daily do not translate well (or at all) into other languages and contexts.

This does not mean that we do not share similar problems or that we cannot learn from each other across languages, nations, and sectors. Rather, it means that we need a better, shared understanding of the goals that whatever language we adopt must serve when we speak about our efforts to ensure the data economy benefits all.

For instance, the Global Partnership for Sustainable Development Data has launched, as part of its Data Values Project, work on data terminology in French and Spanish. It is advancing this work with Latin American and African communities and institutions, including Agencia de Gobierno Electrónico y Sociedad de la Información y del Conocimiento – AGESIC (Uruguay), Departamento Administrativo Nacional de Estadística – DANE (Colombia), and the Instituto Nacional de Estadística y Censos (Argentina), to develop terms in Spanish (the 5th most spoken language in the world) and French (the 6th) for data stewards, data trusts, and related concepts.

The ambitious goal is to create new and appropriate terminology that reflects Latin American and African realities. The Datasphere Initiative applauds this effort and is part of the Global Partnership network of organizations.

Carolina Rossini

Martina Barbero
