FAQs about neural networks to predict prostate cancer

How does this service work?

This service consists of a series of "calculators" that make forecasts about medical conditions by comparing information about a patient (e.g., you) with a database of information about hundreds or thousands of other patients. The calculators use so-called "artificial intelligence" to make the predictions.

Where do you get your data?

Each calculator is based on data collected from real patients. For example, our lymph node spread model was developed using data from thousands of patients who had surgery to remove their prostates at the Johns Hopkins Medical Institutions. These data are compiled and studied to determine their characteristics. Then, an artificial neural network is created to model that data.

What is artificial intelligence?

Artificial intelligence (AI) is a branch of computer science that aspires to develop software that simulates the functioning of the human brain to solve various problems. AI software can mimic the human brain in terms of architecture, design and functioning. It can be used to recognize patterns and images; construct a decision tree to solve a problem; classify data; predict outcomes; study the thematic evolution of a process; and construct cost-effective models.

What is an artificial neural network?

Artificial neural networks (ANNs) are software constructs designed to mimic the way the human brain learns. The brain is made up of billions of interconnected neurons. Similarly, ANNs are made up of virtual interconnected nodes. Computer scientists have developed many different classes of ANNs with a variety of architectures and training algorithms. We have used a class called "Multi-Layer Perceptron" (MLP) with a back-propagation training algorithm. This is a schematic representation of a simple, generic ANN.

The input layer nodes accept input variables (analogous to independent variables). One or more hidden layers of nodes do the majority of processing. Values from the hidden layer are processed and presented as an output value at one or more output nodes (analogous to dependent variables).

The figure at right represents a generic hidden node. Each interconnection has a weight or coefficient associated with it (W₁, W₂ and W₃ in the example above). These weights serve as multipliers for the values passing to the nodes through each connection from the previous layer (the values coming in from the input layer are represented by x₁, x₂ and x₃ in the example figure above). When numbers are entered into the input layer, they are multiplied by the weights at each connection and then summed at the hidden nodes (this summation is symbolized by the sigma in the example above). The resulting sum is passed through a "squashing" function, such as a logistic function, before being passed on to the next layer of nodes. Finally, a number emerges at the output node with a value that depends on the input values and the weights assigned to each interconnection.
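To make the arithmetic at a single hidden node concrete, here is a minimal sketch in Python. The function names and the input and weight values are illustrative assumptions, not values taken from any of our models.

```python
import math

def logistic(z):
    """Squashing function: maps any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def hidden_node_output(inputs, weights):
    """Multiply each input by its connection weight, sum, then squash."""
    weighted_sum = sum(x * w for x, w in zip(inputs, weights))
    return logistic(weighted_sum)

# Illustrative values only (assumed for this example).
x = [1.0, 0.5, -1.0]   # x1, x2, x3 arriving from the input layer
w = [0.4, -0.2, 0.1]   # W1, W2, W3 on the three connections
node_value = hidden_node_output(x, w)
print(node_value)
```

Here the weighted sum is 0.4 - 0.1 - 0.1 = 0.2, and the logistic function squashes it to a value a little above 0.5, which is what the node passes on to the next layer.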

How do you develop your artificial neural networks?

ANNs are not programmed like conventional computer programs, but learn from experience. The ANN learns during a training phase in which cases with known inputs and outputs are shown to the ANN sequentially and repeatedly. A training algorithm adjusts the weights at each connection with the goal of reducing the error between the known output values and the actual values the ANN generates with the weights it has at the moment. At first, the outputs produced by the ANN are somewhat arbitrary. Over time, as cases are reintroduced hundreds or thousands of times, the ANN begins to get some of the answers right. The training algorithm continues to change the weights until most of the answers are correct, at which point training is stopped.
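The training phase described above can be sketched as follows. This is a simplified illustration of back-propagation on a tiny network, not our production code: the network size, learning rate, number of passes, fixed stopping rule and the toy data (the logical OR function) are all assumptions chosen to keep the example short.

```python
import math
import random

def logistic(z):
    return 1.0 / (1.0 + math.exp(-z))

def forward(x, w_hidden, w_out):
    """Return the hidden-node values and the network output for one case."""
    # Each node holds one weight per incoming value plus a bias weight (last item).
    h = [logistic(sum(wi * xi for wi, xi in zip(w[:-1], x)) + w[-1]) for w in w_hidden]
    y = logistic(sum(wi * hi for wi, hi in zip(w_out[:-1], h)) + w_out[-1])
    return h, y

def mean_squared_error(cases, w_hidden, w_out):
    return sum((t - forward(x, w_hidden, w_out)[1]) ** 2 for x, t in cases) / len(cases)

def train(cases, n_hidden=2, epochs=2000, rate=0.5, seed=0):
    rng = random.Random(seed)
    n_in = len(cases[0][0])
    # Start from small random weights, so early outputs are arbitrary.
    w_hidden = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)] for _ in range(n_hidden)]
    w_out = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]
    for _ in range(epochs):              # show every case again and again
        for x, target in cases:
            h, y = forward(x, w_hidden, w_out)
            # Error signal at the output node (squared error through the logistic).
            delta_out = (y - target) * y * (1.0 - y)
            # Pass the error back to each hidden node before changing any weights.
            delta_h = [delta_out * w_out[j] * h[j] * (1.0 - h[j]) for j in range(n_hidden)]
            # Nudge every weight in the direction that reduces the error.
            for j in range(n_hidden):
                w_out[j] -= rate * delta_out * h[j]
            w_out[-1] -= rate * delta_out
            for j in range(n_hidden):
                for i in range(n_in):
                    w_hidden[j][i] -= rate * delta_h[j] * x[i]
                w_hidden[j][-1] -= rate * delta_h[j]
    return w_hidden, w_out

# Toy cases (logical OR), purely illustrative; not patient data.
cases = [([0, 0], 0), ([0, 1], 1), ([1, 0], 1), ([1, 1], 1)]
untrained = train(cases, epochs=0)   # same seed, so the same starting weights
trained = train(cases, epochs=2000)
error_before = mean_squared_error(cases, *untrained)
error_after = mean_squared_error(cases, *trained)
print(error_before, error_after)
```

Comparing the two printed errors shows the effect of training: the same network, starting from the same arbitrary weights, makes much smaller errors after the cases have been shown repeatedly.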

The next phase is to test or validate the ANN. This is done with a set of cases that the ANN has never seen. Based on the ANN's performance on this set (called the validation set), we determine whether the ANN has learned appropriately.
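As a rough illustration of validation, the sketch below sets aside a portion of the cases before training and then measures how often a model's predictions match the known outcomes on those held-out cases. The helper names, the 25% split, the 0.5 decision threshold and the stand-in predictor are assumptions made for this example; accuracy is only one of several ways to judge performance.

```python
import random

def split_cases(cases, validation_fraction=0.25, seed=0):
    """Shuffle the cases and hold back a fraction the model never trains on."""
    shuffled = list(cases)
    random.Random(seed).shuffle(shuffled)
    n_val = int(len(shuffled) * validation_fraction)
    return shuffled[n_val:], shuffled[:n_val]   # (training set, validation set)

def accuracy(predict, cases, threshold=0.5):
    """Fraction of cases where the thresholded prediction matches the known outcome."""
    correct = sum((predict(x) >= threshold) == bool(t) for x, t in cases)
    return correct / len(cases)

# Synthetic cases, purely illustrative: one input in [0, 1), outcome 1 when it is >= 0.5.
cases = [([i / 20], int(i >= 10)) for i in range(20)]
training, validation = split_cases(cases)

def stand_in_predict(x):
    """Hypothetical placeholder for a trained ANN; it simply returns the input."""
    return x[0]

validation_accuracy = accuracy(stand_in_predict, validation)
print(len(training), len(validation), validation_accuracy)
```

In practice the predictor would be the trained ANN, and only the training set would be used to adjust its weights.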

What else are artificial neural networks good for?

ANNs have been used in a variety of fields including economics, finance, meteorology and engineering. In the past 10 years, the application of ANNs has been investigated in medicine. ANNs are especially good at picking up subtle patterns in large data sets with multiple variables and, under certain circumstances, can perform better than traditional statistical techniques. However, we believe that ANNs will never replace statistics, but will serve to complement them.

What are the limitations of artificial neural networks?

An ANN can create a model based only on the data with which it is developed. Therefore, patients with clinical variables that are outside the range of the variables used to develop the model will not be able to use this application. For example, the upper limit of PSA level for the lymph node spread model is 84.2 ng/mL. This model would not be valid for a patient with a pre-treatment PSA that is higher than 84.2 ng/mL. Further, this model is designed to predict the risk of lymph node spread in men with clinically localized prostate cancer only (not for advanced disease or for men who have not been diagnosed with prostate cancer).
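A range check of this kind might look like the sketch below. Only the 84.2 ng/mL upper limit comes from the text above; the function name and the lower bound of zero are assumptions for illustration, and a real calculator would check every input variable, not just PSA.

```python
PSA_UPPER_LIMIT_NG_ML = 84.2   # upper limit for the lymph node spread model

def psa_in_model_range(psa_ng_ml, upper_limit=PSA_UPPER_LIMIT_NG_ML):
    """Return True if the PSA value lies within the range the model can handle.

    The lower bound of 0 is an assumption; the text states only the upper limit.
    """
    return 0.0 <= psa_ng_ml <= upper_limit

for psa in (6.5, 84.2, 120.0):
    status = "within range" if psa_in_model_range(psa) else "outside range, model not valid"
    print(psa, status)
```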