Ethical Data Science
As the adoption of data-related technologies increases, companies face new challenges. There is no hard regulation on what may be done with data, and the complexity of the matter limits the feasibility of such regulation.
A future-oriented, privacy-protecting data science practice therefore comes down to how data professionals behave.
Given the complexity and variety of situations, we cannot assume that even well-intentioned data professionals always know how to act in the best way.
On this site, we have compiled a code of ethics, formulated as a code of conduct for data science practitioners, addressing both practical and hypothetical ethical situations drawn from industry, academia, and the public sector.
Over the past few years we have witnessed massive developments in the broader field of applied artificial intelligence, including advances across the sub-disciplines of machine learning, human-machine interfaces, natural language understanding, computer vision, agent-based reasoning, and so on. Unlike other, more theoretical topics that have remained largely the subject of academic research, practical applications have run ahead within commercial and industrial data science groups and have demonstrated the power and indisputable potential of these technologies. At the same time they have raised serious risks: for example, algorithms trained to decide whether a given person should go to jail or qualify for products and services, or to determine a person's sexual orientation from nothing more than a facial picture.
The role of Data Scientist has repeatedly been declared "the sexiest job of the 21st century". Corporations around the globe employ data professionals and are almost certainly searching for more. This demand has triggered the so-called war for talent, and the resulting scarcity of highly skilled data scientists is being addressed by incorporating professionals from closely related disciplines who often lack the proper background.
With the adoption of the GDPR, there is rising awareness of the need to protect and deal responsibly with personal information, among both corporations and consumers. The scope of the GDPR, however, is limited to the gathering, management, and use of personally identifiable information (PII), leaving substantial risks uncovered. There are many situations in which Data Scientists working for corporations, while being 100% compliant with the GDPR, could still cause harm (be it deliberately or unconsciously) through their use of data and their modelling methods and protocols, from concept through to application.
The Oxford-Munich Code is an initiative to define a code of conduct for professionals working within corporate data science teams. We aim to develop a set of practices, covering a wide range of issues, designed to avoid "Frankenstein" consequences and
- to protect data scientists;
- to protect those companies employing data scientists;
- to protect the data owners.
The code of conduct must provide practical guidelines for managing professional data scientists, as well as clear standards against which the practices and activities of R&D and operational groups may be audited. It must be drawn up with input from experienced corporate practitioners rather than from purely theoretical formulations.
Companies will benefit from the suggested Code in several ways:
- by having their data professionals (data scientists, data engineers, AI practitioners) aware of, and adhering to, a standardized set of practices;
- by reducing the risk of adverse data-related events and mitigating their potential impact;
- by improving brand image and reputation, as proven compliance (beyond the mandatory GDPR) fosters greater consumer trust.