Publication date: October 20, 2024
The risks associated with artificial intelligence in healthcare are extremely diverse and complex. They include not only potential harm to patients, but also impacts on medical practice, physician effectiveness and medical ethics. For example, artificial intelligence algorithms can influence physicians' clinical decisions by suggesting diagnoses or therapies based on the analysis of patients' medical data. However, if these algorithms are not sufficiently accurate or have not been properly tested, they can lead to misdiagnoses or inappropriate therapeutic recommendations, which in turn can negatively affect patients and treatment outcomes.
It is therefore necessary to properly classify and identify the risks associated with each artificial intelligence tool in healthcare. Some algorithms may pose a low risk when used for simple tasks, such as analysing laboratory test results. However, when they are used in more complex situations, for example in the diagnosis of diseases, the risk can be much greater.
It is also important to consider the clinical context and specifics of each AI application. For example, algorithms that analyse medical images, such as computed tomography (CT) or magnetic resonance imaging (MRI), can vary in complexity and impact on clinical decisions. In some cases, these algorithms can only be used as an auxiliary tool to assist physicians in interpreting medical images, while in others they can be a decisive factor in making therapeutic decisions.
Currently, the EU has Regulation 2017/745 on medical devices (MDR) and Regulation 2017/746 on in vitro diagnostic medical devices (IVDR), both issued in 2017. They are the first step towards regulating the use of artificial intelligence in healthcare. However, it is necessary to take into account aspects specific to artificial intelligence, such as continuous model learning or the identification of algorithmic errors. In 2021, the European Commission (EC) presented a long-awaited proposal for the regulation of artificial intelligence (AI), which aims to harmonize the rules governing this technology in all European countries. This proposal responds to the growing importance of artificial intelligence in various areas of life and the economy, but also to concerns about security, transparency and ethics related to its applications.
The EC document introduces three risk categories for artificial intelligence tools: unacceptable, high, and low or minimal. AI tools classified as unacceptable are those that are contrary to EU values and should therefore be banned. High-risk artificial intelligence tools, on the other hand, may only be used if they meet specific requirements for security, transparency and supervision. The EC proposal sets out specific requirements and responsibilities for high-risk AI tools. For example, it is required to use high-quality training, validation and testing data and to ensure traceability and auditability of the system. In addition, these tools must be transparent to users, ensure human oversight of decision-making processes, and be robust, accurate and cybersecure.
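To make this tiered logic concrete, the following Python sketch encodes the three risk categories and the obligations named for high-risk tools as a simple deployment check. The class and field names are illustrative assumptions and do not come from the regulation itself.

```python
from dataclasses import dataclass
from enum import Enum

class RiskCategory(Enum):
    """Risk tiers named in the EC proposal (illustrative encoding)."""
    UNACCEPTABLE = "unacceptable"     # contrary to EU values -> banned
    HIGH = "high"                     # allowed only if specific requirements are met
    LOW_OR_MINIMAL = "low_or_minimal" # voluntary codes of conduct apply

@dataclass
class HighRiskChecks:
    """Obligations listed for high-risk AI tools (hypothetical field names)."""
    quality_training_validation_test_data: bool = False
    traceability_and_auditability: bool = False
    transparency_to_users: bool = False
    human_oversight: bool = False
    robust_accurate_cybersecure: bool = False

    def all_met(self) -> bool:
        return all(vars(self).values())

def may_be_deployed(category: RiskCategory, checks: HighRiskChecks | None = None) -> bool:
    """Very simplified deployment gate mirroring the tiered logic described in the text."""
    if category is RiskCategory.UNACCEPTABLE:
        return False
    if category is RiskCategory.HIGH:
        return checks is not None and checks.all_met()
    return True

# Example: a high-risk diagnostic tool with incomplete documentation is not deployable.
print(may_be_deployed(RiskCategory.HIGH, HighRiskChecks(human_oversight=True)))  # False
```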
In addition, the EC encourages the creation of codes of conduct for minimal-risk AI tools, which are not covered by mandatory requirements but can contribute to improving the quality and transparency of these technologies.
In the context of identifying risks associated with artificial intelligence, various stakeholders propose structured, self-assessment-based approaches that include specific checklists and sets of questions. For example, the independent High-Level Expert Group on Artificial Intelligence (AI HLEG), appointed by the European Commission, published ALTAI, the Assessment List for Trustworthy Artificial Intelligence. This list covers seven categories: human agency and oversight, technical robustness and safety, privacy and data governance, transparency, diversity, non-discrimination and fairness, societal and environmental well-being, and accountability. The questions concern, among other things, the effects of the AI system and the assessment of data diversity.
In 2021, a multidisciplinary team of researchers from Australia published the first self-assessment checklist for artificial intelligence in healthcare. The list was intended to help clinicians assess the readiness of algorithms for routine use in healthcare and to identify areas for further development and adaptation. Example questions relate to the purpose and context of the algorithm, the quality of the data, the effectiveness of the algorithm, whether the algorithm can improve patient care, and whether it can cause harm. However, this self-assessment list is not as detailed as the general-purpose AI HLEG checklist. Creating a comprehensive and standardized risk-assessment checklist for artificial intelligence in healthcare may therefore require combining both approaches.
In this context, consensus guidelines for trustworthy artificial intelligence in medicine have recently been developed to support the design and application of ethical artificial intelligence solutions in healthcare. These guidelines, organized according to six principles, include specific recommendations and a self-assessment checklist. The checklist questions concern, among other things (an illustrative sketch follows the list below):
the integrity of algorithm design,
the universality of data sets,
the traceability of the data processing process,
the usability of tools,
the robustness of model evaluation,
the explainability of the results.
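A checklist of this kind can also be represented as plain data, which makes it easier to version, share and aggregate the answers. The sketch below keys one illustrative question to each of the six principles listed above; the question wording is a placeholder assumption, not the guidelines' own text.

```python
# Minimal sketch of a self-assessment checklist keyed by the six principles above.
# The example questions are illustrative placeholders, not the guidelines' own wording.
CHECKLIST = {
    "integrity of algorithm design": "Is the intended clinical purpose of the algorithm explicitly stated?",
    "universality of data sets": "Do the training data cover the populations the tool will be used on?",
    "traceability of data processing": "Is every preprocessing step documented and reproducible?",
    "usability of tools": "Has the interface been tested with clinical end users?",
    "robustness of model evaluation": "Was the model evaluated on external, independent data?",
    "explainability of results": "Can individual predictions be explained to a clinician?",
}

def open_issues(answers: dict[str, str]) -> list[str]:
    """Return the principles whose checklist question was not answered 'yes'."""
    return [principle for principle, answer in answers.items() if answer.lower() != "yes"]

# Example: two principles still need work before routine clinical use.
answers = {principle: "yes" for principle in CHECKLIST}
answers["universality of data sets"] = "no"
answers["explainability of results"] = "unclear"
print(open_issues(answers))
```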
The need to further adapt the risk assessment of artificial intelligence to specific fields of medicine has also been noted. For example, in radiology, leading radiology associations have come together to address ethical issues related to the use of artificial intelligence, reflecting the growing need to develop codes of ethics and practice.
The checklists used present a variety of risk categories and assessment questions. Standardisation of these approaches, their adaptation and validation within professional associations and independent groups in specific fields can lead to more robust risk identification and management processes. It is also important to regularly update these checklists to take into account technological, process and regulatory developments in the field of artificial intelligence in medicine.
Assessing and managing the risks associated with medical artificial intelligence are crucial to ensuring the safety and effectiveness of the use of these technologies in healthcare. Previous approaches to assessment have focused mainly on the study of the accuracy and robustness of artificial intelligence models under controlled laboratory conditions. However, other important aspects, such as ensuring patient safety, fairness of algorithms and protection of privacy, are equally important but more difficult to assess in the traditional way.
In response to these challenges, the US Food and Drug Administration (FDA) proposed in 2021 an action plan to better regulate and improve the oversight of medical AI software. Its efforts focus on the development of assessment methodologies and the improvement of machine learning algorithms. At the same time, numerous research teams around the world, including in North America, Europe and Asia, and international associations such as the International Medical Informatics Association, are engaged in research on new approaches to the assessment of medical artificial intelligence algorithms.
One of the key areas of this research is the standardization of the definition of clinical tasks that are implemented by AI algorithms. Research teams, for example from Stanford University, propose to standardize these definitions, which would enable an objective and comparative assessment of AI medical solutions. In practice, there are many different ways to define these tasks, which makes it difficult to both assess the performance of algorithms and compare their effectiveness.
The implementation of standardized definitions of clinical tasks may require cooperation between various medical institutions and scientific societies. For example, the European Society of Cardiology, the European Society of Radiology and the European Society of Medical Oncology can play a key role in the development of common standards. Thanks to this, AI algorithm developers will have a clear framework for work, which in turn can contribute to increasing trust and acceptance of these technologies in the medical environment.
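One way to make such standardised task definitions usable by developers is a small shared schema that every algorithm description fills in the same way. The sketch below is an assumption for illustration only; the field names and the example task are hypothetical, not an existing standard.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ClinicalTaskDefinition:
    """Illustrative schema for a standardised clinical task description."""
    task_id: str              # stable identifier agreed by the professional society
    clinical_question: str    # what the algorithm is asked to do
    input_modality: str       # e.g. "CT", "MRI", "laboratory results"
    target_population: str    # who the task applies to
    output_type: str          # e.g. "binary finding", "segmentation", "risk score"
    reference_standard: str   # how ground truth is established

# Two tools addressing the same task_id can then be compared on equal terms.
nodule_task = ClinicalTaskDefinition(
    task_id="RAD-CHEST-001",  # hypothetical identifier
    clinical_question="Detect pulmonary nodules on chest CT",
    input_modality="CT",
    target_population="Adults referred for chest CT",
    output_type="Per-nodule bounding boxes with confidence",
    reference_standard="Consensus reading by two radiologists",
)
```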
Previous approaches have focused mainly on the accuracy of models under laboratory conditions, which does not always reflect their true behaviour in clinical practice. It is now recognised that a wider range of factors must be taken into account in order to reliably assess their usefulness and safety. These factors include, among others, classification accuracy, reliability, suitability, determination and transparency. Each of these elements is important for the effectiveness and safety of algorithms in clinical practice.
Another important aspect of the evaluation is clinical utility, which requires validation with end users. AI algorithms and their interfaces should be intuitive and easy to use to increase their acceptance and effectiveness in clinical practice. Usability tests can provide detailed information about user satisfaction, how comfortable they are with the algorithm, and the impact of technology on their productivity and diagnostic performance.
In addition, assessing the impact on patient outcomes and analysing cost-effectiveness are key to assessing the effectiveness and value of medical artificial intelligence. It is also worth considering the dynamic nature of AI algorithms, which can change as new data become available and therefore require continuous monitoring of their performance. It is therefore essential that the evaluation framework is flexible and adaptable to changing conditions and clinical needs.
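Because evaluation now spans far more than laboratory accuracy, it can be useful to record all of these dimensions side by side in a single structured report. The sketch below follows the factors discussed above; the field names, example values and the tool name are illustrative assumptions.

```python
from dataclasses import dataclass, asdict

@dataclass
class EvaluationReport:
    """Illustrative multi-dimensional evaluation summary for a medical AI tool."""
    model_name: str
    classification_accuracy: float   # performance on the defined clinical task
    reliability: float               # e.g. stability across repeated runs or sites
    usability_score: float           # from end-user usability testing (0-1)
    patient_outcome_effect: str      # summary of impact on patient outcomes
    cost_effectiveness: str          # summary of health-economic analysis
    monitoring_in_place: bool        # continuous performance monitoring after deployment

report = EvaluationReport(
    model_name="example-sepsis-alert",   # hypothetical tool name
    classification_accuracy=0.91,
    reliability=0.88,
    usability_score=0.75,
    patient_outcome_effect="not yet assessed in a prospective study",
    cost_effectiveness="pending",
    monitoring_in_place=False,
)
print(asdict(report))
```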
Stages of the evaluation process
The process of evaluating medical solutions based on artificial intelligence (AI) has been proposed in several publications that recommend a multi-step approach. In this approach, algorithms go through several stages of evaluation, each with its own goals and degree of complexity.
Four main steps in the validation of artificial intelligence in the field of diagnostic imaging have been proposed.
The first step is to assess the feasibility of the algorithm under controlled laboratory conditions, usually on a small set of test data. The algorithm is compared with existing solutions or with the results obtained by experienced physicians. At this stage, algorithms do not yet have to be fully reliable; the goal is only to assess whether they can be used at all.
The second step is to simulate real-life conditions in the laboratory and refine the algorithm to increase its effectiveness. Tests simulate different clinical conditions to see how the algorithm behaves in different situations. The safety of the algorithm and the reactions of end users are also assessed at this stage.
After successful laboratory validation, the algorithms are evaluated in a clinical environment. The aim is to confirm that the real-world operation of an algorithm corresponds to its behaviour under controlled conditions. Any feedback from this stage is used to further improve the algorithm.
The final step is continuous evaluation and monitoring of the algorithm for ongoing improvement. AI manufacturers integrate monitoring and auditing systems into their solutions to detect, correct and report errors. Algorithms are regularly updated and tested under controlled conditions before being redeployed in clinical practice.
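The four stages described above can also be made explicit in an evaluation workflow, for example as an ordered sequence that an algorithm must pass through. The sketch below is a simplified illustration; the stage names follow the text, while the gating logic is an assumption.

```python
from enum import IntEnum

class ValidationStage(IntEnum):
    """The four validation stages described above, in the order they are passed."""
    FEASIBILITY_LAB = 1          # small test set, compared with existing solutions or experts
    SIMULATED_CLINICAL_LAB = 2   # simulated clinical conditions, safety and user reactions
    CLINICAL_ENVIRONMENT = 3     # confirm real-world behaviour matches controlled results
    CONTINUOUS_MONITORING = 4    # ongoing monitoring, auditing and retesting after updates

def next_stage(passed: ValidationStage) -> ValidationStage | None:
    """Return the stage that follows, or None once continuous monitoring is reached."""
    if passed is ValidationStage.CONTINUOUS_MONITORING:
        return None  # monitoring is open-ended
    return ValidationStage(passed + 1)

print(next_stage(ValidationStage.FEASIBILITY_LAB))  # ValidationStage.SIMULATED_CLINICAL_LAB
```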
Transparency of documentation and reporting is crucial to enable critical evaluation by developers, researchers and other stakeholders. In addition, properly documented processes can help if one needs to reproduce an algorithm or AI results.
There are reporting standards for predictive models in health care, such as TRIPOD, which was developed to clearly report the process of building predictive models and assess their potential bias and usefulness. TRIPOD is widely accepted in the biomedical community and covers key points to be considered in the reporting of a predictive model study, such as data source, participant characteristics, model building methods, and outcomes. However, in the context of the increasing use of machine learning techniques in artificial intelligence, there is a need to adapt existing standards to the specific requirements of these new technologies. The TRIPOD extension to TRIPOD-AI and the CONSORT-AI guidelines are examples of initiatives that seek to align reporting standards with models based on machine learning. CONSORT-AI, for example, proposes to include detailed descriptions of AI interventions and analysis of execution errors.
In addition, MINIMAR standards focus on the minimum information necessary to understand AI solutions in healthcare. They contain key reporting elements divided into categories such as the research population, patient demographics and model properties. Such standards aim to promote transparency, accuracy and trust by incorporating key information from artificial intelligence research into a single detailed document.
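Reporting standards of this kind lend themselves to simple templates that authors complete before publication. The sketch below groups fields under the category headings mentioned for MINIMAR; the individual items are illustrative assumptions and do not reproduce the standard's official checklist.

```python
# Illustrative reporting template loosely following the MINIMAR category headings;
# the individual fields are assumptions, not the standard's official item list.
REPORT_TEMPLATE = {
    "study population and setting": {
        "data source": None,
        "cohort selection criteria": None,
    },
    "patient demographics": {
        "age distribution": None,
        "sex distribution": None,
    },
    "model properties": {
        "model type": None,
        "features and inputs": None,
        "output and intended use": None,
    },
}

def missing_items(template: dict) -> list[str]:
    """List every reporting item that has not been filled in yet."""
    return [
        f"{category} / {item}"
        for category, items in template.items()
        for item, value in items.items()
        if value is None
    ]

print(missing_items(REPORT_TEMPLATE))
```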
Finally, the development of these standards not only facilitates the evaluation and comparison of different AI models, but also supports reproducibility of research, community trust and clinical translation. This is important in the context of the use of AI in healthcare and other areas.
As previously mentioned, medical devices based on artificial intelligence (AI) are currently subject to the MDR (Medical Device Regulation) and the IVDR (In Vitro Diagnostic Medical Devices Regulation), both introduced in 2017. However, in order to adapt to the new challenges and opportunities associated with the growing use of AI technology in healthcare, the European Commission (EC) proposed a new regulation on artificial intelligence in 2021, which was adopted in March 2024.
This regulation aims to introduce more specific requirements and obligations, especially for medical AI technologies that are considered high-risk. These include, among others, the need to create and implement quality management systems, to subject systems to conformity assessment, to re-evaluate AI systems after significant changes, and to conduct post-market monitoring.
In the new regulatory framework, medical AI tools are explicitly classified as high-risk, meaning they are subject to increased scrutiny and oversight. However, while the current AI regulations apply horizontally across all fields, artificial intelligence in healthcare faces specific technical, clinical and socio-ethical challenges. It is therefore essential that the regulatory framework is extended and adapted to the specificities of medical artificial intelligence, as has been noted and presented in different countries such as the United States, Japan and China.
In 2021, the US Food and Drug Administration (FDA) published an action plan on AI and machine learning (AI/ML) in the context of medical devices, drawing attention to the need to adapt regulations and practices to the specifics of medical AI and to place greater focus on the patient. To achieve this, a multi-faceted risk assessment must be an integral part of the medical artificial intelligence certification process. This is especially important because risks and limitations vary depending on specific areas of medicine, such as radiology, surgery or mental health.
In addition, there is a need to harmonise and strengthen the validation of medical artificial intelligence technology, taking into account various aspects of risks and limitations, including accuracy, robustness, clinical safety and clinical acceptance. It has also been proposed to introduce external validation by independent entities, which would allow a more objective assessment of medical AI tools that takes into account the variability of clinical practices and socio-economic and ethical contexts.
Moreover, in the future acceptance and implementation of medical artificial intelligence (AI) tools, various stakeholders will play a key role in addition to AI developers themselves. These include doctors, patients, social scientists, healthcare managers and artificial intelligence regulators. It is therefore necessary to develop new approaches that promote the broad involvement of all stakeholders in the development, verification and implementation of AI in healthcare, taking into account diverse needs and contexts.
In this context, future AI algorithms should be developed through co-creation, that is, strong and continuous collaboration between AI developers, clinical end users and other experts, such as biomedical ethicists, at all stages from design to implementation. This cooperation should include regular meetings, consultations and evaluations to ensure that the solutions developed are in line with the real needs and requirements of end users. This integrated approach will enable AI algorithms to better match the needs and culture of healthcare professionals, and will allow earlier detection of potential threats. This, in turn, will allow for the optimisation of clinical outcomes, while taking into account social, ethical and legal requirements.
Strong user engagement will also allow better consideration of the expected interactions between users and AI algorithms, which is particularly important for visual interfaces. These should be designed with clinical end users in mind, so that the predictions of machine learning models in healthcare can be clearly explained. This approach minimizes human error and increases AI acceptance.
Finally, multi-stakeholder engagement and co-creation will focus on social issues such as equality and equity that are specific to particular AI applications in healthcare. This requires collaboration between experts in various fields, health professionals, social scientists and community members, especially those who are under-represented. This is crucial to ensure that the AI technologies being developed take into account population diversity and societal needs, contributing to more equal and equitable access to innovative health solutions.
One of the proposed solutions is the introduction, by medical AI regulators, of what is called an "artificial intelligence passport", which would allow a standard description and identification of medical tools based on artificial intelligence.
Such a passport should contain key information from five categories (an illustrative sketch follows the list below):
model,
data,
evaluation,
use,
maintenance.
It is important that these passports are harmonised, allowing for consistent traceability at different levels of health care management.
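If such a passport were implemented as a structured, machine-readable document, the five categories could map directly onto a small schema, as in the sketch below; the fields inside each category and the example values are illustrative assumptions.

```python
from dataclasses import dataclass, field

@dataclass
class AIPassport:
    """Illustrative 'AI passport' grouping the five categories named above."""
    model: dict = field(default_factory=dict)        # e.g. architecture, version, intended purpose
    data: dict = field(default_factory=dict)         # e.g. training and validation data sources
    evaluation: dict = field(default_factory=dict)   # e.g. metrics, external validation results
    use: dict = field(default_factory=dict)          # e.g. clinical setting, user instructions
    maintenance: dict = field(default_factory=dict)  # e.g. update policy, monitoring plan

# Hypothetical example of a filled-in passport for a fictitious tool.
passport = AIPassport(
    model={"name": "example-ct-nodule-detector", "version": "1.2"},
    data={"training_source": "multi-centre chest CT archive"},
    evaluation={"external_validation": "pending"},
    use={"setting": "radiology reading room, assistive use only"},
    maintenance={"update_policy": "re-validated before each release"},
)
```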
Monitoring of responsibilities in the field of medical artificial intelligence
Responsibility in the field of artificial intelligence, especially in the medical context, is extremely important because it concerns people's health and lives. The introduction of artificial intelligence tools into medical practice holds great promise for improving the effectiveness and efficiency of healthcare. Nevertheless, along with these benefits, there are also potential risks and threats that need to be addressed.
It is important that all parties involved in the development, implementation and use of artificial intelligence in medicine bear responsibility for its operation and possible effects. Manufacturers, algorithm developers, medical practitioners and other participants in the process should have their roles and responsibilities in ensuring the safety and effectiveness of these technologies clearly defined. To achieve this, a coherent legal framework is needed that not only defines accountability, but also ensures enforcement at national and international level. Existing regulations, such as the GDPR, provide some basis, but in the context of medical artificial intelligence they may turn out to be insufficient. There is therefore a need to develop specialised regulations that take into account the unique aspects of medical data and decision-making processes.
Regular audits and risk assessments are an important step towards increasing accountability for AI tools in healthcare. These activities should include all stages of the development and use of artificial intelligence, from data collection, through the process of creating and testing algorithms, to their implementation and daily use. These audits should not only be internal but also external, carried out by independent audit organisations, which will ensure an objective assessment and contribute to increasing confidence in these technologies.
Moreover, it is necessary to maintain high standards of transparency, integrity, accuracy and security in AI decision-making processes. Mechanisms for monitoring and archiving artificial intelligence-based decisions should be an integral part of medical systems, enabling continuous tracking and identification of possible errors or irregularities.
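A minimal way to make AI-supported decisions traceable is an append-only log that records each prediction together with its context and any human override. The sketch below illustrates the idea only; the field names are assumptions, and a real medical system would additionally need access control, pseudonymisation and tamper protection.

```python
import json
from datetime import datetime, timezone

def log_ai_decision(path: str, *, case_id: str, model_version: str,
                    prediction: str, confidence: float,
                    clinician_override: str | None = None) -> None:
    """Append one AI-supported decision to a JSON-lines audit log."""
    record = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "case_id": case_id,
        "model_version": model_version,
        "prediction": prediction,
        "confidence": confidence,
        "clinician_override": clinician_override,
    }
    with open(path, "a", encoding="utf-8") as f:
        f.write(json.dumps(record) + "\n")

# Example: the clinician disagreed with the model, and the disagreement is preserved for audit.
log_ai_decision("audit_log.jsonl", case_id="case-0042", model_version="1.2",
                prediction="nodule present", confidence=0.83,
                clinician_override="no nodule on review")
```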
Although artificial intelligence and machine learning (ML) have made significant advances in recent years, their use in medicine and healthcare brings both hopes and challenges. On the one hand, AI and ML can revolutionize disease diagnosis, therapy and healthcare management by analysing vast collections of medical data. On the other hand, there are numerous risks that need to be identified and addressed to ensure the safe and effective use of these technologies in clinical practice.
The report highlights the need for further research and development to maximize the benefits of medical AI while addressing its existing limitations. One of the key research areas is the explainability of AI, that is, the ability of humans to interpret and understand how algorithms work, especially in a clinical context. Despite the growing interest in this issue, its complexity, which results from the diversity of medical data, is a serious challenge requiring an interdisciplinary approach and continuous adaptation of methods to the needs of clinical practice.
Another important aspect is the elimination of bias in the data, which is crucial for the reliability and effectiveness of AI algorithms. To this end, research is being carried out on methods for detecting and eliminating bias, and tools are being developed to support fair decision-making based on medical data. In addition, the development of adaptive AI methods will enable the tools to be adapted to different population groups and local conditions, which is important for the global application of these technologies. Collaboration between the different fields of medical science and practice is crucial for the successful implementation of new AI-based solutions in a variety of clinical environments.
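One common way to detect bias of this kind is to compare a model's performance across demographic subgroups and flag large gaps. The sketch below does this for accuracy only; the subgroup labels, the toy data and the disparity threshold are illustrative assumptions.

```python
from collections import defaultdict

def subgroup_accuracy(y_true: list[int], y_pred: list[int], groups: list[str]) -> dict[str, float]:
    """Accuracy computed separately for each demographic subgroup."""
    correct: defaultdict[str, int] = defaultdict(int)
    total: defaultdict[str, int] = defaultdict(int)
    for truth, pred, group in zip(y_true, y_pred, groups):
        total[group] += 1
        correct[group] += int(truth == pred)
    return {g: correct[g] / total[g] for g in total}

def flag_disparity(per_group: dict[str, float], max_gap: float = 0.05) -> bool:
    """Flag the model if the best and worst subgroup accuracies differ by more than max_gap."""
    return max(per_group.values()) - min(per_group.values()) > max_gap

# Toy example with two subgroups "A" and "B" (illustrative data only).
scores = subgroup_accuracy([1, 0, 1, 1, 0, 1], [1, 0, 0, 1, 1, 1], ["A", "A", "B", "B", "B", "A"])
print(scores, flag_disparity(scores))
```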
However, as AI becomes an increasingly integral part of healthcare, it is also important to ensure the security and privacy of medical data. In this context, the development of cyber-attack-resistant technologies is becoming essential for maintaining trust in AI in medicine. Finally, the development of uncertainty estimation tools is crucial to provide clinicians with reliable information regarding AI predictions, especially in situations where clinical decisions are made based on them. Implementing these solutions requires concerted efforts by researchers, medical practitioners, and government and regulatory institutions to ensure that medical AI does indeed benefit patients and society as a whole.
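Uncertainty estimates of the kind mentioned here are often obtained by aggregating the predictions of several models (an ensemble) and reporting their spread alongside the mean. The sketch below uses toy stand-in models and is not a clinical-grade method; it only illustrates how a spread can accompany each prediction shown to a clinician.

```python
from statistics import mean, pstdev
from typing import Callable, Sequence

def ensemble_prediction(models: Sequence[Callable[[list[float]], float]],
                        features: list[float]) -> tuple[float, float]:
    """Return the mean predicted probability and its spread across an ensemble.

    A wide spread signals high uncertainty, which can be shown to the clinician
    alongside the prediction instead of a single bare number.
    """
    probs = [model(features) for model in models]
    return mean(probs), pstdev(probs)

# Toy ensemble of three stand-in models (assumptions for illustration only).
models = [lambda x: 0.72, lambda x: 0.65, lambda x: 0.90]
p, spread = ensemble_prediction(models, [1.0, 2.0])
print(f"probability={p:.2f}, uncertainty={spread:.2f}")
```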
Although the European Union has made significant investments in the development of artificial intelligence in recent years, there are still clear differences in the degree of advancement of this technology between individual European countries. These inequalities can be attributed not only to structural differences in research programmes or technology availability, but also to different levels of investment, both from the public and private sectors.
These discrepancies are particularly visible in the context of artificial intelligence in medicine. Here, the development and introduction of innovations is largely based on access to extensive, high-quality biomedical databases and the use of the latest technological developments. Existing differences may therefore further exacerbate health inequalities across the European Union, such as differences in life expectancy or population health indicators.
Faced with these challenges, member states, especially those from Eastern Europe, should take concrete and decisive steps to support the development of artificial intelligence in the healthcare sector. These programmes should include a number of initiatives to enhance the technological, research and industrial capacity of these EU countries in the field of artificial intelligence in healthcare. It is also worth considering creating research infrastructure and providing access to data where it is missing. The European Commission can play a key role in coordinating these actions and promoting the harmonisation of approaches.
The differences in the level of advancement of medical artificial intelligence between European countries, especially between Eastern and Western Europe, are not only a reflection of specific local conditions, but also of broader social, economic and health inequalities in different regions of the continent. Therefore, in order to effectively reduce these divisions, an approach is necessary that involves not only medical and technological aspects, but also policy actions that address systemic inequalities in European society.
In conclusion, legal regulation of artificial intelligence in medicine is extremely important for ensuring the safety, effectiveness and ethical use of this technology in healthcare. Legal approaches to date have focused mainly on aspects of patient data security and protection, such as the General Data Protection Regulation (GDPR) in Europe. However, as artificial intelligence develops and is applied in medicine, increasing attention should also be paid to issues of accountability, fairness and transparency of AI algorithms, as well as equal access to new technologies.
A legal framework should be developed that strikes a balance between promoting innovation and protecting patients and ensuring the fair and ethical use of artificial intelligence in medicine. Support for research into the impact of AI on health care and the development of guidelines for the collection, processing and analysis of medical data are key to ensuring effective regulation. In addition, it is important that these regulations are flexible and adapted to the rapid pace of technological development, while ensuring a high level of patient protection and medical ethics.
Finally, international cooperation in the field of AI regulation in medicine is essential to ensure consistency and compliance of legal standards on a global scale, especially in the context of data flows between countries. Activities in this field can contribute to building trust in artificial intelligence in health care and maximizing its benefits to society.