The Data Scientist is responsible for assessing whether the data is adequate to solve the particular problem and for sharing the results of that analysis, indicating any risks or potential implications due to lack of data quality or availability.
Data science models (machine learning, NLP, etc.) are usually inferred from the data itself. Not having adequate data does not prevent the data scientist from creating a model, but that model will not be appropriate for solving the problem.
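As a rough illustration of what assessing adequacy and reporting risks could look like in practice, here is a minimal Python sketch. The function name `adequacy_report`, the field names, and the thresholds `min_rows` and `max_missing` are all hypothetical, chosen only for this example; a real assessment would depend on the problem and would cover much more than row counts and missing values.

```python
from typing import Any


def adequacy_report(
    rows: list[dict[str, Any]],
    required_fields: list[str],
    min_rows: int = 100,        # hypothetical threshold, problem-specific
    max_missing: float = 0.2,   # hypothetical threshold, problem-specific
) -> dict[str, Any]:
    """Summarize basic data-adequacy signals and list the risks found."""
    n = len(rows)

    # Fraction of rows where each required field is absent or None.
    # With no rows at all, every field counts as fully missing.
    missing = {
        field: (sum(1 for r in rows if r.get(field) is None) / n if n else 1.0)
        for field in required_fields
    }

    # Collect human-readable risks to share alongside any model results.
    risks = []
    if n < min_rows:
        risks.append(f"only {n} rows (fewer than {min_rows})")
    for field, frac in missing.items():
        if frac > max_missing:
            risks.append(f"{field}: {frac:.0%} missing")

    return {"n_rows": n, "missing_fraction": missing, "risks": risks}


# Toy data, invented for the example.
rows = [
    {"age": 31, "income": 52000},
    {"age": None, "income": 48000},
    {"age": 45, "income": None},
    {"age": None, "income": 61000},
]
report = adequacy_report(rows, ["age", "income"], min_rows=3, max_missing=0.3)
print(report["risks"])
```

The point of returning a list of risks rather than a single pass/fail flag is that the findings can be shared with stakeholders as they are, whether or not a model ends up being built.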
Typo: “indicating the risks” I think
A few thoughts for what it’s worth:
– This feels to me more like a professional failing than an ethical one, although the context and application may draw it into an ethical domain.
– Data Science problems are rarely binary in nature (solved or not solved). Often, degradation in data quality will lead to a poorer-quality output, which may itself be measured and highlighted.
– It is the cross product of data and approach that leads to an adequacy assessment. Lower-quality data may rule out some methodologies but still leave some lower-fidelity techniques in play.