Introduction to Explainable AI in Healthcare

1. What is Artificial Intelligence?

Artificial intelligence (AI) is a branch of computer science concerned with understanding and building intelligent systems: systems capable of performing tasks that are classically associated with human intelligence, for example playing chess, driving a car, or making a medical diagnosis.

1.1 A Historical Overview of Artificial Intelligence in Healthcare

Using artificial intelligence in healthcare is not a new idea, although what exactly is meant by the general term AI has changed over the years.

Early applications of AI in healthcare date back to the 1970s. At that time, AI mostly meant rule-based systems, which could already solve tasks like analyzing electrocardiograms or recommending treatments to patients. In these early systems, experts explicitly encoded their knowledge in rules, which could then be applied automatically to new cases. Using such systems can lead to less variance in the treatment of patients, as patients with the same symptoms receive the same treatment. Furthermore, if the rules are created by domain experts, they can also be understood by other domain experts and used to teach less experienced personnel. On the other hand, these systems could only encode already available knowledge and had to be updated manually whenever new discoveries were made.
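
To give a flavor of how such rule-based systems work, here is a minimal, purely illustrative sketch in Python; the rules, thresholds, and recommendations are invented for demonstration and are not medical guidance.

```python
# Minimal sketch of a rule-based "expert system" (illustrative only;
# the rules and thresholds below are invented, not medical guidance).

def recommend(patient):
    """Apply hand-written expert rules to a patient record (a dict)."""
    rules = [
        (lambda p: p["temperature_c"] >= 38.0 and p["has_cough"],
         "Suspect respiratory infection: consider a chest X-ray"),
        (lambda p: p["systolic_bp"] >= 180,
         "Possible hypertensive crisis: refer immediately"),
    ]
    fired = [advice for condition, advice in rules if condition(patient)]
    return fired or ["No rule fired: refer to a clinician"]

print(recommend({"temperature_c": 38.5, "has_cough": True, "systolic_bp": 120}))
```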

In the 2000s, with more and more medical datasets readily available, AI mostly meant machine learning. Machine learning is a sub-field of AI that is concerned with using algorithms and statistics to identify patterns in data. Machine learning systems use pairs of examples that consist of input features and the desired output. They then apply algorithms such as decision trees or logistic regression to find patterns in the input features that allow them to arrive at the desired output.
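
As a minimal sketch of this idea of learning from example pairs, the following Python snippet fits a logistic regression model with scikit-learn; the data is synthetic, standing in for real patient records.

```python
# Sketch of classical machine learning: learn a mapping from input
# features to a desired output (synthetic stand-in data only).
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Synthetic stand-in for (input features, desired output) example pairs.
X, y = make_classification(n_samples=500, n_features=6, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = LogisticRegression().fit(X_train, y_train)
print("accuracy on unseen cases:", model.score(X_test, y_test))
```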

In contrast to the earlier expert systems, machine learning systems learn their decision rules from the data. This also means that they are not limited by current medical knowledge, but can be used to discover previously unknown patterns in the data, which can in turn lead to new medical insights. However, an important part of building these machine learning models is developing good input features from which the model learns its patterns. This is where domain experts can contribute their knowledge and tell the model which features are relevant to the problem at hand, where to look for patterns, and what to ignore.

And while the machine learning models of the time were not as easily understandable as the rule-based systems, they were still simple enough that human experts could follow their reasoning.

Since the 2010s, AI has mostly meant deep learning. Deep learning is a sub-field of machine learning that uses a special kind of model, the artificial neural network, to identify patterns in the data. Neural networks rely on very large datasets (often many tens of thousands of examples) to identify patterns, rather than on expressive input features defined by domain experts. This means that they are well suited to domains where describing expressive features is difficult, such as the detection of lung diseases from radiology images.

Neural networks differ from earlier machine learning models in that they generally require much more data to learn from, as well as more computational power to train. In fact, much of the recent progress with neural networks comes from the availability of computers that can train larger and larger networks. However, due to their complexity it can be extremely difficult to understand how neural networks make their predictions. In addition, it can be difficult to obtain enough examples for them to learn from, as medical data is often heavily regulated or not readily available in electronic form. Furthermore, even with datasets available, obtaining the desired output for each of the input examples can be difficult and time consuming and often requires specialized knowledge.

1.2 A High-level Introduction to Neural Networks

[Figure: Schematic overview of a deep neural network with an input layer, two hidden layers, and an output layer.]

Neural networks are among the best performing and most popular AI approaches for many current medical tasks. A schematic overview of a neural network can be seen in the figure above. A neural network consists of an input layer, which is connected to one or more hidden layers, the last of which is connected to an output layer. Networks with many hidden layers are called deep networks, which is also the origin of the “deep” in deep learning. Each of the neurons in a layer of the network has weighted connections to the neurons in the previous layer. The intuition is that each layer of the network learns to represent useful features for the following layers, and these features become more complex with increasing depth of the network. In the example of image classification, the first layer could learn to identify the positions of edges in the image, the next layers could then combine the relative positions of edges into shapes, and later layers could use the shapes to identify higher-level concepts.
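
As a minimal sketch of this layered structure in PyTorch (the layer sizes and the two output classes are arbitrary placeholders), a small fully connected network could be written as:

```python
# Minimal sketch of the layered structure described above (PyTorch).
# Layer sizes and the two output classes are arbitrary placeholders.
import torch.nn as nn

model = nn.Sequential(
    nn.Linear(10, 32),  # input layer (10 features) -> first hidden layer
    nn.ReLU(),
    nn.Linear(32, 32),  # first hidden layer -> second hidden layer
    nn.ReLU(),
    nn.Linear(32, 2),   # second hidden layer -> output layer (two classes)
)
print(model)
```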

At the beginning, all the weights of the connections are randomized. This randomness leads to the network making random predictions for all input examples. By chance, some of these predictions will be more correct than others. From the errors the network makes on the training examples, it is then possible to compute how the weights of the connections need to be adjusted so that the network makes fewer mistakes in the future. This correction is applied many times, until the network achieves a sufficiently low error on the training examples.
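
The following is a minimal sketch of this correction loop in PyTorch, using randomly generated stand-in data instead of a real medical dataset:

```python
# Sketch of the training loop: start from random weights, measure the error
# on the training examples, and repeatedly adjust the weights to reduce it.
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(10, 32), nn.ReLU(), nn.Linear(32, 2))
X = torch.randn(200, 10)           # random stand-in for training inputs
y = torch.randint(0, 2, (200,))    # random stand-in for desired outputs

loss_fn = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

for step in range(100):
    optimizer.zero_grad()
    loss = loss_fn(model(X), y)    # how wrong is the network right now?
    loss.backward()                # how should each weight change to reduce the error?
    optimizer.step()               # apply the correction to the weights
```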

When new input data is fed into the trained network, it uses these weighted connections to predict the most likely output, based on the data it has seen during training. This means that the more similar the new input is to the training examples, the better the network works, and the more diverse the data the network has seen, the better it can generalize what it has learnt.

The biggest strength of neural networks is that they can identify very complicated connections between inputs and outputs. However, this is also a drawback, as understanding these complicated connections is very difficult for humans. Because the connections between input and output are difficult or almost impossible to understand, neural network models are often considered black-box models: something goes into the box, and something comes out of the box, but the detailed mechanisms inside the box remain obscured.
Fortunately, research is also being done on making their results more understandable for humans.

2. Explainable Artificial Intelligence

One active field in the study of AI systems is how to make their decisions explainable. In the following, explainability refers to a characteristic of an AI system that allows human users to understand and reconstruct how and why the AI arrived at its decision.


2.1 Why Explanations Are Useful

During the development of an AI system, explainability can be used to sanity check AIs. If the AI can make its reasoning understandable to human experts, they can use their domain knowledge to make sure that the AI is focusing on the correct predictors for the task at hand. 

Once the AI system is in use, explainability can make cooperation between humans and AI more efficient. Clinical tasks are often extremely complex, and it is almost impossible to incorporate all desired properties of an AI system into the design and training process. If the AI and the clinician disagree on a decision, the explanation can help the clinician put the AI’s decision into context: with explanations they can better judge whether the AI’s decision is applicable to the case at hand, or whether it likely made a mistake.

In addition, should the AI achieve performance superior to that of humans, clinicians can use explanations to learn from it. This could, for example, be the case for junior clinicians, who could then learn from well-performing AI systems in addition to learning from their senior colleagues, making much more intensive training possible. It might even be possible to gain new insights in cases where the AI performs better than very senior domain experts.

In short, explainability helps clinicians put the decisions made by the AI system into context and use it as a qualified second opinion. It has even been shown that with such a qualified second opinion, the combined diagnostic accuracy of a clinician and an AI system can rise above that of either the clinician or the AI system alone.

But explanations are not only important for clinicians. Clinicians can also use them to better explain to patients why a specific treatment is recommended. Explanations can also support patients in their decision-making process by helping them understand their individual risks and possible outcomes, sensitizing them to the available options, and helping them make an informed decision about further actions according to their goals and priorities. This in turn can boost patients’ motivation and willingness to act on risk-relevant information and recommended treatments.

2.2 How to Explain AI Systems?

There are different types of explanations available. In general they are separated into explainable models and post-hoc explanations. Explainable models are AIs whose internal functioning is directly accessible to the user, for example decision trees or rule-based systems. Explainable models have in common that the user can directly follow and understand their decision process; the explanations are a natural part of the AI.
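
As a minimal sketch of such a directly readable model (scikit-learn on synthetic data; the feature names are invented placeholders), the decision rules of a small tree can simply be printed and followed:

```python
# Sketch: the decision process of a small decision tree can be printed
# and read directly (synthetic data; feature names are placeholders).
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier, export_text

X, y = make_classification(n_samples=300, n_features=4, random_state=0)
tree = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, y)

# Print the learned decision rules as nested if/else conditions.
print(export_text(tree, feature_names=["age", "bmi", "blood_pressure", "lab_value"]))
```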

Opposed to explainable models are so-called ‘black-box’ models. These are models which are too large or too complex for humans to understand their exact workings. Deep neural networks are commonly called black-box models, as it is not feasible for humans to follow the complex calculations a deep neural network makes for its prediction.

Unfortunately, there are cases where deep neural networks perform better than interpretable models. To still get some insight into how they arrive at their decisions, one can use post-hoc explanations.

One way of creating such post-hoc explanations is to use the AI system's outputs to train an interpretable surrogate model. This surrogate model can then be inspected to approximate the reasoning of the black-box AI.
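
A minimal sketch of this idea, often called a global surrogate, might look as follows (scikit-learn, synthetic data; the small neural network merely stands in for an arbitrary black-box model):

```python
# Sketch of a global surrogate: fit an interpretable tree to the
# *predictions* of a black-box model (synthetic stand-in data only).
from sklearn.datasets import make_classification
from sklearn.neural_network import MLPClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=1000, n_features=8, random_state=0)

# The neural network stands in for an arbitrary black-box model.
black_box = MLPClassifier(hidden_layer_sizes=(64, 64), max_iter=500,
                          random_state=0).fit(X, y)

# Train an interpretable tree to mimic the black box's predictions.
surrogate = DecisionTreeClassifier(max_depth=3, random_state=0)
surrogate.fit(X, black_box.predict(X))

# Fidelity: how often does the surrogate agree with the black box?
print("fidelity:", (surrogate.predict(X) == black_box.predict(X)).mean())
```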


Another way is to use attribution-based explanations. These explanations assign weights to the inputs according to their importance for the final classification. A common application of attribution-based explanations are so-called attention maps in image processing. Attention maps highlight areas in the image that were important for the AI’s classification. Below is an example image where such an attention map shows which parts of the image were influential for the AI's decision to rate the image as skin cancer; in this example, the areas the AI pays much attention to are marked in blue, and the areas it pays less attention to are marked in red.

[Figure: Skin lesion classified as cancerous, together with the attribution map of the neural network's attention.]

One advantage of AI systems for image analysis is that many methods have been developed to visualize the areas of an image that the system considers important for its decision. This in turn helps clinicians check whether the AI pays attention to the correct parts of the image, estimate whether a specific diagnosis is applicable, or even learn new predictors from the AI.
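
One of the simplest such methods looks at how strongly each input pixel influences the predicted class score via the gradient. The sketch below (PyTorch, with an untrained toy network and a random image standing in for a real model and X-ray) only shows the basic mechanics:

```python
# Sketch of a simple gradient-based saliency map: how much does each
# input pixel influence the predicted class score?
import torch
import torch.nn as nn

# Untrained toy network; a trained model would be used in practice.
model = nn.Sequential(
    nn.Conv2d(1, 8, kernel_size=3, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(8, 2),
)

image = torch.randn(1, 1, 64, 64, requires_grad=True)  # stand-in for an image
score = model(image)[0].max()   # score of the predicted class
score.backward()                # gradient of that score w.r.t. every input pixel

saliency = image.grad.abs().squeeze()  # 64x64 map of per-pixel importance
print(saliency.shape)
```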

Another type of explanation are example-based explanations. The most popular instance of this type are counterfactual explanations. These are hypothetical statements of the form ‘if feature X had been different by Y, the decision would have been Z instead’. This makes them easy to communicate to patients and shows ways in which the decision could be changed.
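
A very naive sketch of how such a counterfactual could be found is to nudge a single feature until the model's decision flips. The snippet below (scikit-learn, synthetic data, arbitrary feature index) is illustrative only; practical counterfactual methods search over many features and add constraints such as plausibility:

```python
# Naive counterfactual search: vary one feature until the decision flips
# ("if feature 2 changed by X, the prediction would change").
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=500, n_features=4, random_state=0)
model = LogisticRegression().fit(X, y)

patient = X[0].copy()
original = model.predict([patient])[0]

feature = 2  # which feature to vary (arbitrary placeholder index)
for delta in sorted(np.linspace(-3, 3, 121), key=abs):
    candidate = patient.copy()
    candidate[feature] += delta
    if model.predict([candidate])[0] != original:
        print(f"If feature {feature} changed by {delta:.2f}, "
              f"the decision would flip away from class {original}.")
        break
else:
    print("No counterfactual found within the searched range.")
```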

2.3 Criticisms of Explainable AI

While explanations can provide additional advantages to AI systems, there are also criticisms of the current state of possible explanations.

The most frequent criticism is that state-of-the-art deep learning models are not inherently interpretable. Therefore, explanations often rely on approximations, which in turn means that the explanations may not adequately capture the model’s decision process. Strictly speaking, there is also no guarantee that a post-hoc explanation uses the same features for its explanation that the model uses for its decision. This possibility of false explanations can act contrary to the original intention of explanations and make the model less trustworthy.

Explanations are also another part of the model that must be validated. Some studies suggest that explanations might not actually lead to more compliance with an AI’s decisions, but instead make users less likely to recognize erroneous predictions. Explanations are also another input the clinician must consider and can therefore contribute to information overload, as clinicians can only focus on a limited number of inputs for their decision.

Explanations can also be misunderstood. While an AI system can provide explanations for its decisions, these explanations are always based on correlations. Clinicians in turn might incorrectly assume causality and arrive at invalid conclusions.

Another argument against explanations is that they might simply not be required: clinicians might not care how a piece of information was generated, as long as they know that it is sufficiently accurate and validated. In this regard AIs could be compared to complex laboratory analyses, where clinicians might also not know exactly how the result was produced, as long as it is accurate and useful. Furthermore, there can be cases where no “truth” is available or where clinicians do not agree on a single way to identify a condition. In such cases, it is unclear what an explanation should look like.

Unfortunately, simply using explainable models provides no solution to this. In practice, even explainable models are only explainable as long as they do not grow beyond a certain complexity; it is easy to imagine a decision tree so large that its explainability is no longer useful to a human. And while counterfactual explanations mirror the human explanation process, their creation is computationally very expensive, which can make them impractical in many systems.

Finally, what a good explanation looks like depends heavily on who is interacting with the system. A useful explanation for an AI developer likely looks very different from one for a clinician, which in turn differs from one for a patient, due to their varying backgrounds and requirements regarding which dimensions of the decision process should be explained.

3. Applications of AI in Healthcare

Recent advances in AI, especially deep neural networks, have made it possible for computer systems to achieve performance comparable to that of human experts. This has many potential benefits. For clinicians, it means that AI systems can be used to deliver better and faster care. AI can serve as a second expert opinion that helps healthcare workers avoid errors, it can support inexperienced personnel by explaining its suggestions, and it can help clinicians focus their time and energy on the patients that need it the most. This way, AI systems can help reduce the often very high workload on clinicians, help them make better decisions in less time, and enable them to spend more time interacting with patients.

AI systems can also easily be shared across hospitals, cities, and countries. This can make knowledge transfer between regions easier and can also serve as a consistent baseline for conditions where no standard way for diagnosis exists or where there are frequent disagreements between experts. 

Patients can also benefit from AI systems. AI can factor in a very large number of different parameters to propose optimized treatments for each individual patient. In addition, AI systems can make high-quality medical advice readily available, without the often long wait times for access to medical specialists or the high costs. This is especially relevant in areas with limited availability of trained clinicians or poor access to healthcare in general, such as low-income countries or remote areas, where AI systems could improve the healthcare situation for large parts of the population.

3.1 Supporting Clinicians with AI

Currently, the area where AI systems are most successful is the analysis of medical images, for example in radiology, dermatology or pathology. In these areas, clinical imaging is a main part of the diagnostic process. This means that most, if not all, of the information required to make the correct diagnosis is contained in the image. Before the advent of deep learning, such tasks were difficult to solve with a computer, as mathematically describing robust and useful features of images is very challenging. Fortunately, many medical departments have large databases of historical images, which can in turn provide many thousands of examples to train a deep learning system and make the application of AI to a variety of tasks possible.

In addition, it is also possible to pre-train AI systems on non-medical image datasets. This way, the AI learns how to extract useful features from images before it is fine-tuned on medical images. The idea is that detecting interesting parts of images, for example identifying shapes, can be learned from a wide variety of images. During the fine-tuning process the AI system then learns how to apply these features to the domain-specific images and which of them are useful for the diagnostic process. The main advantage of this approach is that it can drastically reduce the number of examples required to successfully train an AI system, which in turn also enables the application of AI to domains where only small datasets are available or the labeling process is very demanding.
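
A minimal sketch of this pre-training and fine-tuning idea in PyTorch/torchvision (assuming torchvision >= 0.13 for the weights argument; the two output classes and the medical dataset are placeholders) could look like this:

```python
# Sketch of pre-training + fine-tuning: reuse a network pre-trained on
# non-medical images and retrain only a new output layer for the medical task.
import torch.nn as nn
from torchvision import models

# Network pre-trained on ImageNet (non-medical images).
model = models.resnet18(weights="IMAGENET1K_V1")

# Freeze the pre-trained feature extractor ...
for param in model.parameters():
    param.requires_grad = False

# ... and replace the output layer for the medical task (two placeholder classes).
model.fc = nn.Linear(model.fc.in_features, 2)

# Only the new output layer (model.fc) would then be trained on the
# (possibly small) medical image dataset.
```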

In controlled laboratory settings, AI has successfully been used to perform imaging tasks such as the detection of lung diseases from chest X-rays, the diagnosis of skin lesions, or cell segmentation in pathology images. In many cases the AI is able to perform at a level similar to that of clinicians with multiple years of experience in the respective domain.

Deep neural networks can also be used to recognize patterns in time-series data. Modern monitoring devices can measure many different parameters at high frequency, which can make it difficult for humans to adequately take in all dimensions of this data. Fortunately, this wide availability of data is very beneficial to the development of AI systems. They can, for example, be employed to monitor the ECG data of patients for early warning signs of cardiac arrest and enable quicker responses. They can also be used in connection with electronic health records to predict the onset of sepsis or to select the best available treatment, while incorporating more context-relevant information and data from comparable cases than would be possible for clinicians.

3.2 Supporting Administration with AI

But not only clinicians can benefit from AI systems. These systems can also help with administrative tasks. AI can be used to automate routine tasks such as patient data entry or review of laboratory results. They can also be used to optimize scheduling or patient prioritization, which in turn enables more efficient care. It is even possible to use AI to estimate the length of patient stays and then use this information for more efficient planning. 

Another very beneficial use of AI systems is automatic transcription. This way, clinicians’ notes can be automatically added to digital systems, which frees up the time required for manual entry. Another possibility is to transcribe speech directly, which makes note-taking and documenting patient progress in electronic health records even more efficient.

4. Challenges for AI in Healthcare

Artificial intelligence systems, especially those based on deep neural networks, are able to solve a multitude of tasks in healthcare with near expert-level proficiency. However, there are also some restrictions to their application.

4.1 Technical Challenges

Deep neural networks are very complex models and as such require a large amount of data for training. If not enough data is used, or too complex a model is selected, the model may not learn general rules but simply memorize the training examples; in technical terms this is called ‘overfitting’. This is dangerous because during training it looks like the model is learning normally and eventually reaching extraordinarily high performance. However, since the model has only memorized the examples it was trained on, it has not learned any general rules and therefore does not work well when applied to new cases. The high performance during training can thus lead to false confidence. This is also why an AI model should not be used directly: instead, its performance should be carefully verified on new examples that were not used for training before any claims about its performance are made.
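
The sketch below (scikit-learn, synthetic noisy data) shows how overfitting typically reveals itself: near-perfect accuracy on the training examples, but noticeably worse accuracy on held-out examples the model has never seen.

```python
# Sketch: overfitting shows up as a gap between performance on the training
# examples and on held-out examples never used for training.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic, deliberately noisy data (flip_y adds label noise).
X, y = make_classification(n_samples=200, n_features=20, n_informative=5,
                           flip_y=0.2, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# An unrestricted tree can memorize the training examples almost perfectly ...
model = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)
print("training accuracy:", model.score(X_train, y_train))  # close to 1.0

# ... but performs noticeably worse on examples it has never seen.
print("held-out accuracy:", model.score(X_test, y_test))
```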

AI systems are also very good at identifying patterns that help them solve their task. However, there is no guarantee that the patterns they identify are actually useful for the case at hand and not just so-called spurious correlations: patterns that look useful during training but have no meaning afterwards. A famous case of this is an application of AI to radiology where patients in critical condition were monitored with different equipment than patients in less critical condition. Instead of learning good predictors for the criticality of the condition, the AI simply learned to differentiate images taken by the two different machines.

4.2 Biases and Unfair Decisions

It is also important to note that the mistakes AI systems and humans make can be very different. In medical imaging, the majority of human errors are perceptual, where the clinician is, for example, not looking carefully enough at the whole image or missing some subtle details. It has also been shown that if clinicians find one abnormality, they are less likely to spot a second one. AI systems, however, always look at the complete image and should find all abnormalities, independent of how many are present in the image. This in turn means that even if an AI has clinical performance similar to a human expert, it will make different errors.

These errors can come from different decisions during model design, training, and interaction with the model.

The first influential decision is the selection of which features to collect. It is important that the collected variables are related to the problem and sufficient to solve it. At the same time it is important that the labels are correct, as otherwise the AI will learn the wrong associations.

The AI will learn very well how to make predictions for new cases similar to the ones it was trained on. The less similar a new case is, the less applicable the AI’s prediction is. This means that it is very important to capture a diverse dataset that contains cases from as many different populations as possible. The dataset should also be representative of the population for which the AI later makes predictions. Without special care, the AI is likely to work worse for minority groups.

Another thing to keep in mind is that the population can change over time, which makes it important to regularly check whether the data the AI was trained on is still representative of the current population the AI is used on. Relatedly, if optimal treatments change over time, the data the AI is trained on should reflect this. Otherwise it can happen that the AI recommends treatments that were popular in the past, simply because so much data for these treatments is available and they often proved to be the best solution at the time.

It is also important that the collected dataset contains a variety of different patients. The AI can only learn to make predictions for cases similar to those it has previously seen. If a new case is too different from the ones it was trained on, its prediction will not be accurate, even if it is made with high confidence. This is also related to the evaluation of the AI system. Ideally the evaluation should be conducted under conditions as realistic as possible, for example in a clinical trial, so the performance of the AI system is measured not only against benchmarks in a lab, but also in real-world scenarios on the actual population it will later be used on. This is important because systematic errors in the AI system can come not only from biases in the training data, but also from algorithmic choices such as the training objective or optimizations.

Another thing to consider is that simply using an AI system (or any other automated system) can introduce new classes of errors. If the AI is correct most of the time, users may simply follow the AI’s predictions instead of carefully evaluating them. This results in errors of the AI being much less likely to be caught by the humans that use it, a phenomenon called automation bias. Similarly, if the AI makes too many obvious errors, users might always discard the AI’s predictions, even when it could catch mistakes in their own assessments.

4.3 Effective and Efficient Working with AI

Even if the AI system works perfectly and its performance has been verified in clinical studies, there is still the challenge of convincing clinicians to make use of its abilities. Clinicians might be skeptical of the system’s performance, or they might not understand its decision process. The system could also be perceived as a criticism of their abilities, or lead to fears of being replaced by a machine.

Another frequent problem is that the AI is not well integrated in the existing systems or workflows and therefore results in additional work for the users, instead of reducing their workload. 

To increase acceptance and use of AI systems, it is therefore important to include clinicians (or other users) early in the design process, so they can provide input on what tasks the system should solve and how the results should be presented to them. It is also important to openly communicate the intention, capabilities and limitations of an AI system to those who might use it.


Further Reading

  1. K.-H. Yu, A. L. Beam, and I. S. Kohane, “Artificial intelligence in healthcare,” Nat. Biomed. Eng., vol. 2, no. 10, Art. no. 10, Oct. 2018, doi: 10.1038/s41551-018-0305-z.
  2. A. H. Beck et al., “Systematic analysis of breast cancer morphology uncovers stromal features associated with survival,” Sci. Transl. Med., vol. 3, no. 108, p. 108ra113, Nov. 2011, doi: 10.1126/scitranslmed.3002564.
  3. R. C. Deo, “Machine Learning in Medicine,” Circulation, vol. 132, no. 20, pp. 1920–1930, Nov. 2015, doi: 10.1161/CIRCULATIONAHA.115.001593.
  4. Z. C. Lipton, “The mythos of model interpretability,” Commun. ACM, vol. 61, no. 10, pp. 36–43, Sep. 2018, doi: 10.1145/3233231.
  5. M. Sendak et al., “‘The human body is a black box’: supporting clinical decision-making with deep learning,” in Proceedings of the 2020 Conference on Fairness, Accountability, and Transparency, Barcelona Spain, Jan. 2020, pp. 99–109. doi: 10.1145/3351095.3372827.
  6. E. J. Topol, “High-performance medicine: the convergence of human and artificial intelligence,” Nat. Med., vol. 25, no. 1, Art. no. 1, Jan. 2019, doi: 10.1038/s41591-018-0300-7.
  7. X. Liu, B. Glocker, M. M. McCradden, M. Ghassemi, A. K. Denniston, and L. Oakden-Rayner, “The medical algorithmic audit,” Lancet Digit. Health, vol. 4, no. 5, pp. e384–e397, May 2022, doi: 10.1016/S2589-7500(22)00003-6.
  8. S. Reddy, J. Fox, and M. P. Purohit, “Artificial intelligence-enabled healthcare delivery,” J. R. Soc. Med., vol. 112, no. 1, pp. 22–28, Jan. 2019, doi: 10.1177/0141076818815510.
  9. N. Mehrabi, F. Morstatter, N. Saxena, K. Lerman, and A. Galstyan, “A Survey on Bias and Fairness in Machine Learning,” ArXiv190809635 Cs, Sep. 2019, Accessed: Nov. 06, 2020. [Online]. Available: http://arxiv.org/abs/1908.09635
  10. R. Caruana, Y. Lou, J. Gehrke, P. Koch, M. Sturm, and N. Elhadad, “Intelligible Models for HealthCare: Predicting Pneumonia Risk and Hospital 30-day Readmission,” in Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, New York, NY, USA, Aug. 2015, pp. 1721–1730. doi: 10.1145/2783258.2788613.
  11. F. Doshi-Velez and B. Kim, “Towards A Rigorous Science of Interpretable Machine Learning,” ArXiv170208608 Cs Stat, Mar. 2017, Accessed: Nov. 30, 2020. [Online]. Available: http://arxiv.org/abs/1702.08608
  12. J. Amann et al., “To explain or not to explain?—Artificial intelligence explainability in clinical decision support systems,” PLOS Digit. Health, vol. 1, no. 2, p. e0000016, Feb. 2022, doi: 10.1371/journal.pdig.0000016.
  13. A. F. Markus, J. A. Kors, and P. R. Rijnbeek, “The role of explainability in creating trustworthy artificial intelligence for health care: A comprehensive survey of the terminology, design choices, and evaluation strategies,” J. Biomed. Inform., vol. 113, p. 103655, Jan. 2021, doi: 10.1016/j.jbi.2020.103655.
  14. J. Amann, A. Blasimme, E. Vayena, D. Frey, V. I. Madai, and the Precise4Q consortium, “Explainability for artificial intelligence in healthcare: a multidisciplinary perspective,” BMC Med. Inform. Decis. Mak., vol. 20, no. 1, p. 310, Nov. 2020, doi: 10.1186/s12911-020-01332-6.
