DS 6040 - Project Proposal Update

Michael Davies (mld9s)
Akeem Wells (ajw3rg)

1. Couple of sentences reorienting me to your project

Predicting Heart Disease

Overview: “Heart disease has become a major health problem in both developed and developing countries, and it is cited as the number one cause of death throughout the world each year.” Given this risk, detecting cardiovascular disease in adults and assessing its risk level is a critical task.

Objectives: We will implement a model to classify whether a patient is normal or has heart disease. More specifically, we will develop a binary classification model that predicts the posterior probability that an individual has heart disease (given our data and model).

2. Have you obtained the data you need, and if so, what does it look like?

In short, we have obtained the data and completed preliminary cleaning, as shown below.

Data

The data we selected comes from:

Variables:

3. Broadly, what Bayesian model/approach you are planning on using, and if you have already begun analyzing the data.

We plan to implement a hierarchical Bayesian approach. This is appropriate because our data contain the same features at every site but are drawn from distinct locations: Budapest-Hungary, Zurich-Switzerland, Basel-Switzerland, and the VA Medical Center (Long Beach and Cleveland).
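
As a rough illustration of what such a model could look like, the sketch below writes out the unnormalized log posterior of a varying-intercept logistic regression, where each site gets its own intercept drawn from a shared population distribution. The function name, the priors, and the toy inputs are all assumptions made for illustration. They are not a final model specification.

```python
import numpy as np

def log_posterior(mu, log_tau, alpha, beta, X, y, site):
    """Unnormalized log posterior of a varying-intercept logistic model.

    Hypothetical priors, chosen only for illustration:
      mu ~ Normal(0, 5), tau ~ HalfNormal(1), beta_k ~ Normal(0, 2.5),
      alpha_j ~ Normal(mu, tau) for each site j.
    """
    tau = np.exp(log_tau)
    # Linear predictor: site-specific intercept plus shared feature effects.
    eta = alpha[site] + X @ beta
    # Bernoulli log-likelihood with a logit link, in a numerically stable form.
    log_lik = np.sum(y * eta - np.logaddexp(0.0, eta))
    log_prior = (
        -0.5 * (mu / 5.0) ** 2
        - 0.5 * tau ** 2 + log_tau  # half-normal on tau plus log-transform Jacobian
        - 0.5 * np.sum((beta / 2.5) ** 2)
        - 0.5 * np.sum(((alpha - mu) / tau) ** 2) - alpha.size * log_tau
    )
    return log_lik + log_prior

# Toy inputs (made up): 4 patients, 2 features, 3 sites.
X = np.zeros((4, 2))
y = np.array([0, 1, 1, 0])
site = np.array([0, 0, 1, 2])
value = log_posterior(0.0, 0.0, np.zeros(3), np.zeros(2), X, y, site)
```

Sharing `mu` and `tau` across sites is what lets sites with few patients borrow strength from the others. In practice a sampler from a probabilistic programming library would replace this hand-written density.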

Questions we have at this point are:

Imports

Import Data

Data merging and cleaning

Merge all countries into one dataset
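
A minimal sketch of this step, assuming each site has been loaded into its own pandas DataFrame with identical columns. The column names and values are placeholders, not the real data.

```python
import pandas as pd

# Hypothetical stand-ins for the per-site tables; real columns and values will differ.
frames = {
    "budapest": pd.DataFrame({"age": [41, 58], "target": [0, 1]}),
    "zurich": pd.DataFrame({"age": [63], "target": [1]}),
    "va": pd.DataFrame({"age": [55, 60], "target": [0, 1]}),
}

# Stack the tables and record the source in a "site" column, so that the site
# can later index group-level parameters in a hierarchical model.
merged = pd.concat(
    [df.assign(site=name) for name, df in frames.items()],
    ignore_index=True,
)
```

Keeping the site label as a column, rather than discarding it during the merge, preserves the grouping structure that the hierarchical approach relies on.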

Clean dtypes
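
One way this step might look, assuming the raw columns arrive as strings and that missing values are coded with a placeholder such as "?". Both the column names and the placeholder are assumptions made for illustration.

```python
import pandas as pd

# Hypothetical raw table; every column is read in as strings.
raw = pd.DataFrame({
    "age": ["63", "41", "58"],
    "chol": ["233", "?", "180"],  # "?" as a missing-value marker is an assumption
    "sex": ["1", "0", "1"],
})

clean = raw.copy()
numeric_cols = ["age", "chol"]  # hypothetical continuous columns
for col in numeric_cols:
    # Unparseable entries such as "?" become NaN instead of raising an error.
    clean[col] = pd.to_numeric(clean[col], errors="coerce")

# Treat coded variables as categorical rather than numeric.
clean["sex"] = clean["sex"].astype("category")
```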

Check class balances on response var
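
A short sketch of how the balance could be checked both overall and per site, assuming a binary response column. The names `target` and `site` and the toy values are hypothetical.

```python
import pandas as pd

# Made-up rows for illustration only.
df = pd.DataFrame({
    "site": ["budapest", "budapest", "zurich", "va", "va", "va"],
    "target": [0, 1, 1, 0, 1, 1],
})

# Share of each class across the full dataset.
overall = df["target"].value_counts(normalize=True).sort_index()

# Share of positive cases within each site (mean of a 0/1 column).
by_site = df.groupby("site")["target"].mean()
```

Checking the balance per site matters here because a large gap in prevalence between locations is one of the things site-level parameters would need to absorb.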

Initial EDA