Heart Disease Prediction

  • Date: Apr 2018
  • Category: Data Science
  • Key Tags: Neural Network, Machine Learning

Uses ConvNetJS to train a model on the Heart Disease Data Set. Heart Disease Prediction is a web-based application that predicts heart disease by analysing people's health information.

This project belongs to the Big Data Management and Analytics Lab, UT Dallas. Director: Prof. Latifur Khan. Team members: Ahmad Mustafa, Runze Zhang.

Introduction

In this project, we use ConvNetJS to train a model on the Heart Disease Data Set. Heart Disease Prediction is a web-based application that predicts heart disease by analysing people's health information. The program is written mainly in JavaScript (model training and diagnosis of heart disease).

Data Set

Heart Disease Data Set: This database contains 76 attributes, but all published experiments refer to using a subset of 14 of them. In particular, the Cleveland database is the only one that has been used by ML researchers to this date. The "goal" field refers to the presence of heart disease in the patient. It is integer valued from 0 (no presence) to 4. Experiments with the Cleveland database have concentrated on simply attempting to distinguish presence (values 1, 2, 3, 4) from absence (value 0). The names and social security numbers of the patients were recently removed from the database and replaced with dummy values. One file has been "processed", that one containing the Cleveland database. All four unprocessed files also exist in this directory.
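The presence/absence distinction described above amounts to collapsing the 0 to 4 "goal" value into two classes. A minimal sketch, assuming class 1 stands for "presence" and class 0 for "absence" (the function name `toBinaryLabel` is hypothetical and not taken from the project code):

```javascript
// Collapse the 0..4 "goal" value into a binary label.
// Assumption: 0 -> absence (class 0), 1..4 -> presence (class 1).
function toBinaryLabel(num) {
  if (!Number.isInteger(num) || num < 0 || num > 4) {
    throw new RangeError('goal value must be an integer from 0 to 4, got ' + num);
  }
  return num === 0 ? 0 : 1;
}

console.log([0, 1, 2, 3, 4].map(toBinaryLabel)); // [ 0, 1, 1, 1, 1 ]
```

Rejecting out-of-range values early keeps malformed rows from silently becoming "presence" labels.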

We use the following 14 attributes: 13 inputs and 1 output.

Input:

  1. #3 (age)
  2. #4 (sex)
  3. #9 (cp)
  4. #10 (trestbps)
  5. #12 (chol)
  6. #16 (fbs)
  7. #19 (restecg)
  8. #32 (thalach)
  9. #38 (exang)
  10. #40 (oldpeak)
  11. #41 (slope)
  12. #44 (ca)
  13. #51 (thal)

Output:

  1. #58 (num) (the predicted attribute)
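As a rough illustration of how one record could be split into the 13 inputs and the output listed above, here is a sketch that assumes each record is a comma-separated line in that attribute order, with "?" marking a missing value (the convention used by the processed Cleveland file). The names `ATTRIBUTES` and `parseRow`, and the sample line, are illustrative only and not taken from the project code.

```javascript
// Attribute order assumed to match the list above (13 inputs, then num).
const ATTRIBUTES = ['age', 'sex', 'cp', 'trestbps', 'chol', 'fbs', 'restecg',
  'thalach', 'exang', 'oldpeak', 'slope', 'ca', 'thal', 'num'];

// Parse one comma-separated record into { features, label }.
// Returns null for records with a missing value ("?") so callers can skip them.
function parseRow(line) {
  const fields = line.trim().split(',').map((f) => f.trim());
  if (fields.length !== ATTRIBUTES.length) {
    throw new Error('expected ' + ATTRIBUTES.length + ' fields, got ' + fields.length);
  }
  if (fields.includes('?')) return null;
  const values = fields.map(Number);
  if (values.some((v) => Number.isNaN(v))) {
    throw new Error('non-numeric field in: ' + line);
  }
  return {
    features: values.slice(0, 13),  // the 13 inputs
    label: values[13] > 0 ? 1 : 0,  // num: 0 = absence, 1..4 = presence
  };
}

// Illustrative sample line, not necessarily a real patient record.
const sample = parseRow('63,1,1,145,233,1,2,150,0,2.3,3,0,6,0');
console.log(sample.features.length, sample.label); // 13 0
```

Whether incomplete records are dropped or imputed is a design choice; this sketch simply drops them.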

Training

ConvNetJS Input Layer:

layer_defs.push({type:'input', out_sx:1, out_sy:1, out_depth:13});

Output Layer:

layer_defs.push({type:'svm', num_classes:2});
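A minimal sketch of how the two layer definitions above might be assembled into a network and trained. The function names `buildHeartNet` and `trainHeartNet`, the `{features, label}` row shape, the epoch count, and the trainer options are all assumptions for illustration, not the project's actual settings. `convnetjs` is passed in as a parameter so the same code can be used with the browser global or with a Node module. ConvNetJS itself inserts a fully connected layer with `num_classes` neurons in front of an 'svm' layer, which is why the trained model shown later contains an 'fc' layer.

```javascript
// Build the 13-input, 2-class network from the two layer definitions above.
function buildHeartNet(convnetjs) {
  const layerDefs = [];
  layerDefs.push({ type: 'input', out_sx: 1, out_sy: 1, out_depth: 13 });
  layerDefs.push({ type: 'svm', num_classes: 2 });
  const net = new convnetjs.Net();
  net.makeLayers(layerDefs);
  return net;
}

// Train for a number of passes over `rows`, where each row is
// { features: number[13], label: 0 | 1 } (assumed shape).
function trainHeartNet(convnetjs, net, rows, epochs) {
  // Hyperparameters below are placeholders, not the project's values.
  const trainer = new convnetjs.Trainer(net, {
    method: 'sgd',
    learning_rate: 0.01,
    momentum: 0.9,
    l2_decay: 0.001,
    batch_size: 10,
  });
  for (let epoch = 0; epoch < epochs; epoch++) {
    for (const row of rows) {
      // Vol built from a plain array is a 1x1xN volume.
      trainer.train(new convnetjs.Vol(row.features), row.label);
    }
  }
  return trainer;
}
```

In a browser page that already loads ConvNetJS, this could be called as `trainHeartNet(convnetjs, buildHeartNet(convnetjs), rows, 10)`.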

Neural Net Classification Example from the official ConvNetJS documentation:

In a classification setting the network is asked to provide a prediction among a fixed set of distinct classes. Let's create a simple 2-layer neural network binary classifier (i.e. two distinct classes) that takes 2-dimensional data points. The first layer of every network must be an 'input' layer in which we declare the size of the input data. ConvNetJS layers are based on the Vol class, which represents a 3-dimensional volume of numbers. The 3 dimensions are (sx, sy, depth), but if you're not working with images we will always keep sx = 1, sy = 1, and only worry about depth. Therefore, we will declare the size of the input volume to be 1x1x2 (out_sx = 1, out_sy = 1, out_depth = 2). The next two layers will be fully connected layers ('fc' for short) of neurons, and the last layer will be a classifier layer (called 'softmax') which outputs probabilities.

  var layer_defs = [];
  // input layer of size 1x1x2 (all volumes are 3D)
  layer_defs.push({type:'input', out_sx:1, out_sy:1, out_depth:2});
  // some fully connected layers
  layer_defs.push({type:'fc', num_neurons:20, activation:'relu'});
  layer_defs.push({type:'fc', num_neurons:20, activation:'relu'});
  // a softmax classifier predicting probabilities for two classes: 0,1
  layer_defs.push({type:'softmax', num_classes:2});
  // create a net out of it
  var net = new convnetjs.Net();
  net.makeLayers(layer_defs);
  // the network always works on Vol() elements. These are essentially
  // simple wrappers around lists, but also contain gradients and dimensions
  // line below will create a 1x1x2 volume and fill it with 0.5 and -1.3
  var x = new convnetjs.Vol([0.5, -1.3]);
  var probability_volume = net.forward(x);
  console.log('probability that x is class 0: ' + probability_volume.w[0]);
  // prints 0.50101

Trained Model Parameters:

{  
     "layers":[  
        {  
           "out_depth":13,
           "out_sx":1,
           "out_sy":1,
           "layer_type":"input"
        },
        {  
           "out_depth":2,
           "out_sx":1,
           "out_sy":1,
           "layer_type":"fc",
           "num_inputs":13,
           "l1_decay_mul":0,
           "l2_decay_mul":1,
           "filters":[  
              {  
                 "sx":1,
                 "sy":1,
                 "depth":13,
                 "w":{  
                    "0":-17.89480016346953,
                    "1":-65.20715434097517,
                    "2":-143.49570806471428,
                    "3":-31.1386520983572,
                    "4":-21.70700828658741,
                    "5":13.59546748520739,
                    "6":-40.36852901427123,
                    "7":96.0189094078226,
                    "8":-53.0165269333858,
                    "9":-104.15109372402621,
                    "10":-29.65522120885926,
                    "11":-133.97185827311105,
                    "12":-285.53574878526405
                 }
              },
              {  
                 "sx":1,
                 "sy":1,
                 "depth":13,
                 "w":{  
                    "0":17.887956492054883,
                    "1":65.20678160647515,
                    "2":143.51466962346882,
                    "3":31.13940595462754,
                    "4":21.68578653275807,
                    "5":-13.612917691852742,
                    "6":40.34568187773178,
                    "7":-96.01223218499257,
                    "8":53.0419593853359,
                    "9":104.16976753309304,
                    "10":29.652454924281546,
                    "11":133.98130289247052,
                    "12":285.5339302344702
                 }
              }
           ],
           "biases":{  
              "sx":1,
              "sy":1,
              "depth":2,
              "w":{  
                 "0":-6.3200619040626025,
                 "1":6.3200619040626025
              }
           }
        },
        {  
           "out_depth":2,
           "out_sx":1,
           "out_sy":1,
           "layer_type":"svm",
           "num_inputs":2
        }
     ]
  }
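To make the parameters above concrete, the following sketch shows how the 'fc' layer's weights and biases turn a feature vector into two class scores (a dot product plus a bias per filter). At prediction time the 'svm' layer in ConvNetJS passes these scores through unchanged, so the predicted class is the index of the larger score. The function names and the two-input toy layer are hypothetical; the toy layer only mirrors the JSON shape above and does not reuse the trained values. Which class index stands for "disease present" is not stated in the source.

```javascript
// Raw class scores of one fully connected layer: score_k = b_k + sum_i w_ki * x_i.
// `fcLayer` follows the JSON shape above (filters[k].w and biases.w keyed by index).
function fcScores(fcLayer, features) {
  if (features.length !== fcLayer.num_inputs) {
    throw new Error('expected ' + fcLayer.num_inputs + ' features, got ' + features.length);
  }
  return fcLayer.filters.map((filter, k) => {
    let score = fcLayer.biases.w[k];
    for (let i = 0; i < features.length; i++) {
      score += filter.w[i] * features[i];
    }
    return score;
  });
}

// Predicted class = index of the highest score.
function predictClass(fcLayer, features) {
  const scores = fcScores(fcLayer, features);
  return scores.indexOf(Math.max(...scores));
}

// Toy layer with made-up weights, same structure as the trained 'fc' layer.
const toyLayer = {
  num_inputs: 2,
  filters: [{ w: { 0: 1, 1: -2 } }, { w: { 0: -1, 1: 2 } }],
  biases: { w: { 0: 0.5, 1: -0.5 } },
};
console.log(fcScores(toyLayer, [3, 1]));     // [ 1.5, -1.5 ]
console.log(predictClass(toyLayer, [3, 1])); // 0
```

In the trained parameters above, the two filters are almost exact negatives of each other, so the two scores are close to mirror images and the sign of either one decides the class.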

Cross-validation

With folds=2, an example result is:

  • Number of folds = 2
  • Number of correct predictions = 210
  • Number of incorrect predictions = 87
  • Accuracy = 0.7070704689998422
  • FPR = 0.3624997734376416
  • FNR = 0.21167867760680467
  • TPR = 0.7883205924667208
  • TNR = 0.637499601562749
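The rates listed above follow from the four confusion-matrix counts. The sketch below shows one simple way to split records into folds and to compute the rates, assuming "disease present" (label 1) is the positive class. The function names are hypothetical, and the per-class counts in the example are not given in the source: they are illustrative values chosen so that the totals match the 210 correct and 87 incorrect predictions.

```javascript
// Assign record indices 0..n-1 to k folds in round-robin order.
// (A real run would usually shuffle the records first.)
function foldIndices(n, k) {
  const folds = Array.from({ length: k }, () => []);
  for (let i = 0; i < n; i++) folds[i % k].push(i);
  return folds;
}

// Count true/false positives/negatives for binary labels (1 = positive).
function confusionCounts(actual, predicted) {
  const c = { tp: 0, fp: 0, tn: 0, fn: 0 };
  actual.forEach((a, i) => {
    const p = predicted[i];
    if (a === 1 && p === 1) c.tp++;
    else if (a === 0 && p === 1) c.fp++;
    else if (a === 0 && p === 0) c.tn++;
    else c.fn++; // actual positive predicted as negative
  });
  return c;
}

// Derive the summary rates from the four counts.
function rates({ tp, fp, tn, fn }) {
  return {
    accuracy: (tp + tn) / (tp + fp + tn + fn),
    fpr: fp / (fp + tn),
    fnr: fn / (fn + tp),
    tpr: tp / (tp + fn),
    tnr: tn / (tn + fp),
  };
}

// Hypothetical counts: 108 + 102 = 210 correct, 58 + 29 = 87 incorrect.
const exampleRates = rates({ tp: 108, fp: 58, tn: 102, fn: 29 });
console.log(exampleRates.accuracy.toFixed(4)); // 0.7071
```

For each fold, the model would be trained on the remaining folds and evaluated on the held-out one, with the counts summed across folds before computing the rates.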

Testing

Users can enter data to get a prediction result. When new data is imported, the platform updates the model automatically.