AI-in-Healthcare-of-Heart-Disease

Research in Heart Disease Prediction Using AI (Computer Vision)

Abstract

Heart disease is the leading cause of death worldwide. Many forms of heart disease can be diagnosed with medical tests; however, predicting heart disease without such tests is very difficult. Machine learning can help process large volumes of medical data and surface patterns that would otherwise go unnoticed. The aim of this project is to explore how machine learning algorithms can be used to predict heart disease by building an optimized model. The research questions are: 1) Which machine learning algorithms are used in the diagnosis of heart disease? 2) How can machine learning techniques be used to minimize misdiagnosis (which leads to additional tests and wrong treatment, all resulting in greater monetary impact to the patient)? 3) How can machine learning be used to detect early abnormalities, thus benefiting both patients and the healthcare system? We collected our dataset from the UCI repository and used the Random Forest classification algorithm to predict heart disease. We then modified one of its hyperparameters, the number of estimators (`n_estimators` in scikit-learn), to improve the model further. The findings for each question are: 1) The machine learning algorithms used in predicting heart disease are Naïve Bayes, Decision Trees, Support Vector Machine, Bagging and Boosting, and Random Forest, and these algorithms can achieve high accuracy in predicting heart disease. 2) Machine learning algorithms can analyze a large amount of data to assist medical professionals in making more informed decisions cost-effectively. 3) Machine learning algorithms allowed us to analyze clinical data, draw relationships between diagnostic variables, design the predictive model, and test it against new cases. The predictive model achieved an accuracy of 89.4 percent in predicting heart disease using the Random Forest classifier's default settings.
Future research directions that emerged from this study include training and testing our model on a larger dataset and tuning additional hyperparameters for further improvement.

Introduction

Rising healthcare costs have been a major issue for developed nations (Dadgostar, 2019). According to the CDC, an estimated 859,000 people in the US die from cardiovascular disease each year, accounting for 1 in every 3 deaths. Cardiovascular diseases cost the healthcare system $216 billion and cause $147 billion in lost productivity (Mayo, 2022). This cost has been a major concern in the US, and early detection is therefore important. Given the rapid advancement of biotechnology and the large volumes of healthcare data generated, mainly by EHRs (electronic health records) in various structures, it is increasingly important to use this information intelligently to uncover hidden patterns, detect abnormalities, and predict heart disease. Artificial intelligence, and machine learning as a subset of it, plays an important role in mining large datasets and extracting valuable knowledge from them. When trained on an appropriate dataset, a machine learning algorithm can learn patterns and detect abnormalities in the early stage of a disease, which can save patients both cost and time. This project examines the opportunities for machine learning and data mining in the healthcare industry, especially for heart disease; how early diagnosis can minimize healthcare costs; and how data generated by EHRs can give medical professionals insight into abnormalities that may indicate chronic disease. We begin by providing a brief research background, followed by the problem statement, research questions, objectives, and the organization of this culminating experience project.

Aim

To predict heart disease from input parameter values provided by the user, using a model trained on the dataset stored in the database.
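
As a minimal sketch of how user-provided values might be turned into model input, the helper below orders a dictionary of values to match the 13 feature columns used in the training code in this README. The name `build_feature_row` and the example values are hypothetical and are not taken from the project or the dataset.

```python
# Column order taken from the training code in this README.
FEATURE_ORDER = ['age', 'sex', 'cp', 'trestbps', 'chol', 'fbs', 'restecg',
                 'thalach', 'exang', 'oldpeak', 'slope', 'ca', 'thal']


def build_feature_row(user_input):
    """Convert a dict of user-supplied values into an ordered list of floats.

    Hypothetical helper: raises ValueError if a feature is missing
    or if a value cannot be converted to a number.
    """
    missing = [name for name in FEATURE_ORDER if name not in user_input]
    if missing:
        raise ValueError("missing features: " + ", ".join(missing))
    return [float(user_input[name]) for name in FEATURE_ORDER]


# Illustrative values only (not a real patient record).
example = {'age': 54, 'sex': 1, 'cp': 0, 'trestbps': 130, 'chol': 246,
           'fbs': 0, 'restecg': 1, 'thalach': 150, 'exang': 0,
           'oldpeak': 1.2, 'slope': 1, 'ca': 0, 'thal': 2}
row = build_feature_row(example)
print(row)
```

A list built this way can be passed as the `list_data` argument of the prediction function shown under Model Training.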

Objective

The primary objective of this project is to analyze the occurrence of heart disease based on various features that describe it and to explore how Machine Learning algorithms can be leveraged to build an optimized model for accurate heart disease prediction and diagnosis.

Problem Statement

Modern information technology tools and techniques, such as AI, machine learning, and data mining, can significantly support healthcare professionals in minimizing deaths caused by heart disease while ensuring cost-effectiveness. For instance, machine learning algorithms can analyze large databases to identify frequent patterns that might eventually lead to heart disease and related fatalities. The pandemic has underscored the vital importance of health, reminding us of the indiscriminate impact of COVID-19 on individuals across all walks of life. To prepare for future health challenges, analyzing health and medical data becomes paramount. For this purpose, a dataset comprising information from 303 individuals has been curated to facilitate insights and advance healthcare practices.

Research Methodology

We introduce the following machine learning algorithms used in predicting heart disease: SVM, Naïve Bayes, Decision Trees, Bagging and Boosting, and Random Forest. We also list some of the advantages and disadvantages of each. Finally, we use the Random Forest classifier to build our optimized model.
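
The comparison of these algorithms could be sketched as follows with scikit-learn. This is an illustration under stated assumptions, not the project's actual experiment: the data is synthetic (it only matches the 303 rows and 13 features of the dataset described here), AdaBoost stands in for "Boosting", and all models use default settings.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (AdaBoostClassifier, BaggingClassifier,
                              RandomForestClassifier)
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in data (assumption): same shape as the heart disease
# dataset (303 rows, 13 features), but the values are randomly generated.
X, y = make_classification(n_samples=303, n_features=13, n_informative=6,
                           random_state=0)

models = {
    # SVMs are sensitive to feature scale, so standardize first.
    "SVM": make_pipeline(StandardScaler(), SVC()),
    "Naive Bayes": GaussianNB(),
    "Decision Tree": DecisionTreeClassifier(random_state=0),
    "Bagging": BaggingClassifier(random_state=0),
    "Boosting (AdaBoost)": AdaBoostClassifier(random_state=0),
    "Random Forest": RandomForestClassifier(random_state=0),
}

# Mean 5-fold cross-validated accuracy for each model.
scores = {name: cross_val_score(model, X, y, cv=5).mean()
          for name, model in models.items()}

for name, score in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:20s} {score:.3f}")
```

Cross-validation is used here instead of a single train/test split because, with only about 300 rows, a single split can give noisy accuracy estimates.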

Dataset Information

Conclusion:

This study demonstrates that machine learning, especially the Random Forest algorithm with 93% accuracy, is highly effective in predicting heart disease. Other algorithms like Logistic Regression and SVM also showed promising results. Integrating these models into web-based platforms can aid early detection, leading to better patient outcomes and reduced healthcare costs. Future research should explore larger datasets, advanced preprocessing, and deep learning techniques to further enhance prediction accuracy and reliability in healthcare applications.

Technology Used:

Run Locally

Clone the project

  git clone https://parthadee.github.io/Healthcare-Research/

Model Training (Machine Learning)

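import pandas as pd
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split
from .models import Admin_Helath_CSV  # assumption: the Django model lives in this app's models.py

FEATURES = ['age','sex','cp','trestbps','chol','fbs','restecg','thalach','exang','oldpeak','slope','ca','thal']

def prdict_heart_disease(list_data):
    """Train on the stored CSV and predict for one patient; returns (accuracy in %, prediction)."""
    csv_file = Admin_Helath_CSV.objects.get(id=1)  # dataset record with id=1
    df = pd.read_csv(csv_file.csv_file)

    X = df[FEATURES]
    y = df['target']
    X_train, X_test, y_train, y_test = train_test_split(X, y, train_size=0.8, random_state=0)
    # Gradient boosting over 100 decision stumps (max_depth=1).
    gb_model = GradientBoostingClassifier(n_estimators=100, learning_rate=1.0, max_depth=1, random_state=0)
    gb_model.fit(X_train, y_train)
    # Wrap the input in a DataFrame so its column names match those seen during fit.
    pred = gb_model.predict(pd.DataFrame([list_data], columns=FEATURES))
    accuracy = gb_model.score(X_test, y_test) * 100
    print("Gradient Boosting Accuracy: {:.2f}%".format(accuracy))
    print("Predicted value is:", pred)
    return accuracy, pred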
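
The function above trains a gradient boosting model, while the abstract describes a Random Forest classifier whose number of estimators was tuned. A minimal sketch of that tuning step, assuming scikit-learn's `GridSearchCV`, is shown below. The data is synthetic and the candidate values for `n_estimators` are illustrative choices, not the ones used in the study.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic stand-in data (assumption): 303 rows, 13 features.
X, y = make_classification(n_samples=303, n_features=13, n_informative=6,
                           random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, train_size=0.8, random_state=0, stratify=y)

# Try a few forest sizes; the grid values are illustrative assumptions.
search = GridSearchCV(
    RandomForestClassifier(random_state=0),
    param_grid={"n_estimators": [50, 100, 200]},
    cv=5,
    scoring="accuracy",
)
search.fit(X_train, y_train)

best_n = search.best_params_["n_estimators"]
# score() evaluates the refitted best model on the held-out test set.
test_accuracy = search.score(X_test, y_test)
print(f"best n_estimators: {best_n}, test accuracy: {test_accuracy:.3f}")
```

Selecting the hyperparameter by cross-validation on the training split keeps the test split untouched, so the reported test accuracy is not biased by the tuning.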

Output Screenshots

Dashboard in Power BI