A review of medical ethics and the ethical problems that arise when machine learning models and AI are applied to medicine and healthcare data.
Key concepts
- We discuss ethical considerations related to AI and machine learning in medicine.
- Medical data is subject to strict privacy regulations, which cause problems for the reproducibility and interpretability of medical AI.
- Data is always biased, and algorithms will learn these biases.
- Before AI models take over medicine, it must be clear who is responsible for the output and its implementation.
Principles of medical ethics
New technology comes with new ethical dilemmas to resolve, and AI is no exception. The potential benefits of AI are real, but so are the ethical concerns. When we invest resources in research, software, hardware, and other logistics, those resources are taken from elsewhere. The ramifications of AI in medicine are considerable, and clinicians need to become informed. Let’s describe some common ethical dilemmas raised by AI models that clinicians and developers should reflect on. The central ethical principles in medicine, patient care, and treatment comprise four pillars: beneficence, non-maleficence, respect for patient autonomy, and justice. Additional principles that follow from them are informed consent, truth-telling, and confidentiality.1
| Principle | Description |
| --- | --- |
| Beneficence | Duty to “do good” |
| Non-maleficence | Duty to “not do bad” |
| Autonomy | Duty to respect the patient’s right to self-determination |
| Justice | Duty to treat all patients equally and equitably |
| Informed consent | Duty to inform the patient and obtain consent before interventions |
| Truth-telling | Duty to be honest with the patient |
| Confidentiality | Duty to protect patient information |
The four pillars of medical ethics and derived principles.
Data privacy
ML and AI are potent methods often described as “data hungry.” They need large amounts of data to learn desired patterns and capture rare or unusual cases. AI models use statistical relationships at their core and, therefore, thrive on large amounts of data during training, encouraging large-scale data collection. Data can constitute a risk to patient integrity even in the right hands.
Oversharing, overuse of personal data, and data theft all threaten patient privacy, with data falling into the wrong hands or being used for improper purposes. Medical data is sensitive and usually cannot be shared, which causes problems for reproducibility and for reporting on model outcomes.
There are ways to anonymize and share data legally and responsibly.2 This is highly encouraged but not always easy.
AI models are unfairly biased
Biases in the training data carry over to the output: an AI model will learn and retain the data’s prejudice.3 Clinicians, biased by the AI interpretation, risk perpetuating that bias.
Commonly acknowledged biases and confounders are gender, socioeconomic status, skin color, and race. For example, a skin cancer detector trained on a dataset dominated by fair skin can have problems detecting melanoma in dark-skinned patients.4,5 In a study from 2019, Badgeley et al. successfully predicted hip fractures from radiographs (also called X-ray images). However, model performance fell to the flip of a coin when they compensated for socioeconomic and logistical factors and healthcare process data (e.g., different scanners).6
A multicenter study example of unexpected AI bias
Let’s look at one example from Badgeley’s study: Some X-ray scanners were located in primary care clinics, and some were at hospitals with ER departments. If you have a severe injury, you are more likely to go to the ER immediately than to seek out your primary care clinic the next day. So, a scanner in a hospital with an ER department is more likely to image an actual, severe injury. The scanners are of different brands, have different settings, and so on, creating a unique scanner fingerprint. Instead of detecting fractures, the model learned to recognize each scanner and, indirectly, each scanner’s likelihood of scanning a fracture. Combining many such features can give you a good statistical model that doesn’t look for actual fractures in the X-rays.
Older women are more prone to osteoporosis or osteopenia – reduced bone mineralization and bone strength – than younger women. Osteopenia and osteoporosis are essentially the same condition at different degrees of severity, and osteoporosis increases the risk of fractures. It is often possible to see hints of osteopenia in a radiograph, signaling that the patient is older and at increased risk of a fracture. A model can learn that fractures are more likely with older, less dense bone and female gender, instead of learning what a fracture looks like in a radiograph.
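To make this confounding mechanism concrete, here is a minimal synthetic sketch in Python. It is not Badgeley et al.’s data or pipeline; the feature, the fracture rates, and the numbers are invented purely for illustration. A classifier that only sees the “scanner fingerprint” looks accurate as long as ER scanners image more fractures, and drops to a coin flip once that correlation is removed.

```python
# Minimal synthetic sketch of a confounded fracture model (illustrative only).
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)
n = 5000

# "site" stands in for the scanner fingerprint: 1 = hospital with an ER, 0 = primary care clinic.
site = rng.integers(0, 2, n)
# In the training data, ER scanners image true fractures far more often (70% vs. 10%).
fracture = rng.random(n) < np.where(site == 1, 0.7, 0.1)

# The feature encodes only the scanner fingerprint (plus noise), nothing about the bone itself.
X = (site + rng.normal(0, 0.1, n)).reshape(-1, 1)
model = LogisticRegression().fit(X, fracture)
print("Accuracy with the confounder intact:", accuracy_score(fracture, model.predict(X)))  # ~0.8

# Test data where the site-fracture link is broken: both sites now image 40% fractures.
site_test = rng.integers(0, 2, n)
fracture_test = rng.random(n) < 0.4
X_test = (site_test + rng.normal(0, 0.1, n)).reshape(-1, 1)
print("Accuracy with the confounder removed:", accuracy_score(fracture_test, model.predict(X_test)))  # ~0.5
```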
Bias comes from the source and handling of data and design choices during algorithm creation. Above all, it is vital to recognize, examine, and reflect on AI studies’ biases.7
Patient and doctor autonomy
AI poses a risk to patient autonomy and integrity. When AI models produce difficult-to-explain outcomes based on unknown data, it becomes hard to base decisions on their output. AI models also pose a risk to clinician autonomy. As AI systems become more prevalent, there is a risk that society will divert the responsibility for decision-making to incompletely understood algorithms. Clinicians and healthcare systems might be forced to implement and follow them against their better judgment, implicitly forcing patients to subject themselves to AI.8
Safety and interpretability
The power of AI systems comes from their ability to use large amounts of data to create complex models that consider thousands of parameters. However, as developed today, AI models are challenging to understand and interpret; they are mostly “black boxes,” and what happens inside the model is usually unknowable. That said, other medical technology and human analyses can also be considered black boxes when we appeal to human experience and intuition. In each case, it is often impossible to backtrack the process entirely.
Understanding ML models is an active field of research. One way to address the challenge is to learn to create interpretable models.9
One popular way to understand AI models is to visualize the activating regions; that is, the areas that led to the classification decision are mapped in vivid colors. These visualizations are called heat, saliency, or class activation maps. Another method is to produce bounding boxes that delimit the region of interest. However, showing which region the model reacted to, whether correct or incorrect, does not explain why the model reacted to it.9 Such auxiliary maps can capture some AI mispredictions, but far from all. Other methods to achieve interpretability include showing similar reference cases or deriving uncertainty measures.10
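As a hypothetical illustration of the idea (not a specific method from the cited papers), the sketch below computes a simple occlusion-based sensitivity map: each image patch is grayed out in turn, and the drop in the target-class score marks how much the model relied on that region. The classifier interface, a scikit-learn-style `predict_proba` on flattened images, is an assumption made for the example.

```python
import numpy as np

def occlusion_map(model, image, target_class, patch=8):
    """Gray out one patch at a time and record how much the target-class score drops.

    `model` is assumed to expose a scikit-learn-style predict_proba on flattened images.
    Large drops mark regions the classifier relied on for its decision.
    """
    h, w = image.shape
    baseline = model.predict_proba(image.reshape(1, -1))[0, target_class]
    heat = np.zeros((h // patch, w // patch))
    for i in range(0, (h // patch) * patch, patch):
        for j in range(0, (w // patch) * patch, patch):
            occluded = image.copy()
            occluded[i:i + patch, j:j + patch] = image.mean()
            score = model.predict_proba(occluded.reshape(1, -1))[0, target_class]
            heat[i // patch, j // patch] = baseline - score
    return heat  # plot with e.g. matplotlib's imshow, overlaid on the radiograph
```

A map built this way shows which pixels mattered to the model, but, as noted above, it still does not explain why they mattered.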
Transparency in AI is crucial for actual clinical implementations where errors could have critical implications. We could supply standardized “model facts labels” along with the AI tool so that AI results can be critically assessed in the clinical workflow.11 This is similar to the facts labels accompanying drugs to inform practitioners on suitable usage. Transparency also helps compensate for the sensitive nature of the data used to train and test models, which cannot be shared.
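As a hypothetical sketch of what such a label could contain (the field names and values below are assumptions for illustration, not the exact template from Sendak et al. or any real product), the label could be shipped as structured metadata alongside the model:

```python
# Hypothetical "model facts label" packaged as structured metadata with the model.
# All field names and values are illustrative assumptions, not a real product's label.
model_facts = {
    "name": "Fictional hip fracture triage model, v1.2",
    "intended_use": "Triage support for suspected hip fracture on plain radiographs",
    "target_population": "Adults with hip trauma at the validating hospitals",
    "training_data": "Retrospective radiographs from two centers, 2015-2019 (fictional)",
    "validation": "External test set, AUC 0.89 (illustrative number)",
    "output": "Probability of fracture; does not replace radiologist review",
    "warnings": [
        "Not validated for pediatric patients",
        "Performance may degrade on scanner models absent from the training data",
    ],
}
```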
Responsibility and liability for AI and ML models
Who is responsible and liable for AI interventions is not always clear. A model that is 95% accurate is wrong 5% of the time. It is common for an AI model that is excellent at a task to fail on examples that are obvious to a human observer; such failures are easy to detect. Other errors fall within the expected error rate, and if the patient accepted the AI intervention, we might consider them an unfortunate but acceptable risk.
However, if an AI model suggests a course of action but the underlying rationale is unclear, clinicians might not follow it. Suppose the recommendation was correct, and not following it caused harm to the patient. Are clinicians responsible? Now suppose they followed the AI recommendation, and it resulted in a clinical error constituting malpractice. Who is liable, legally but also morally? Currently, most AI interventions are tools that assist clinicians rather than replace them, and the physician remains responsible.
If AI is ever to replace healthcare professionals, AI developers will need to take over the medical responsibility for the outputs and recommendations of their algorithms.
Summary
We have reflected on some central ethical problems and concerns of algorithmic medicine and AI interventions. Medical ethics is essential for understanding and evaluating AI studies, including their limitations. Ethical considerations can limit individual AI systems, but those limitations are sometimes necessary to safeguard patients, and patients are the ultimate beneficiaries of medical AI.
Sources
1. Currie G, Hawk KE, Rohren EM. Ethical principles for the application of artificial intelligence (AI) in nuclear medicine. Eur J Nucl Med Mol Imaging. 2020 Apr 1;47(4):748–52.
2. Hedlund J, Eklund A, Lundström C. Key insights in the AIDA community policy on sharing of clinical imaging data for research in Sweden. Scientific Data. 2020 Oct 6;7(1):331.
3. Mittelstadt BD, Allo P, Taddeo M, Wachter S, Floridi L. The ethics of algorithms: Mapping the debate. Big Data & Society. 2016 Dec 1;3(2):2053951716679679.
4. Adamson AS, Smith A. Machine Learning and Health Care Disparities in Dermatology. JAMA Dermatol. 2018 Nov 1;154(11):1247.
5. Kamulegeya LH, Okello M, Bwanika JM, Musinguzi D, Lubega W, Rusoke D, et al. Using artificial intelligence on dermatology conditions in Uganda: A case for diversity in training data sets for machine learning. bioRxiv. 2019 Oct 31;826057.
6. Badgeley MA, Zech JR, Oakden-Rayner L, Glicksberg BS, Liu M, Gale W, et al. Deep learning predicts hip fracture using confounding patient and healthcare variables. npj Digital Medicine. 2019 Apr 30;2(1):1–10.
7. Beil M, Proft I, van Heerden D, Sviri S, van Heerden PV. Ethical considerations about artificial intelligence for prognostication in intensive care. Intensive Care Med Exp [Internet]. 2019 Dec 10 [cited 2020 Oct 17];7. Available from: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6904702/
8. Lupton M. Some ethical and legal consequences of the application of artificial intelligence in the field of medicine. Trends Med [Internet]. 2018 [cited 2020 Oct 17];18(4). Available from: https://www.oatext.com/some-ethical-and-legal-consequences-of-the-application-of-artificial-intelligence-in-the-field-of-medicine.php
9. Rudin C. Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead. Nature Machine Intelligence. 2019 May;1(5):206–15.
10. Pocevičiūtė M, Eilertsen G, Lundström C. Survey of XAI in digital pathology. In 2020 [cited 2023 Nov 6]. p. 56–88. Available from: http://arxiv.org/abs/2008.06353
11. Sendak MP, Gao M, Brajer N, Balu S. Presenting machine learning model information to clinical end users with model facts labels. npj Digit Med. 2020 Mar 23;3(1):1–4.