ML-powered Platform for Early Disease Detection

Highlights

Early AFib detection

80% precision

Explainable ML models

Adaptability for new use cases

Customer location
  • Israel
Project duration
  • 4 months

Healthtech’s new frontier: Reshaping diagnostics

Missed or delayed diagnoses lead directly to unnecessary healthcare expenses, excess use of resources and, worst of all, premature mortality.

Our client, an Israel-based healthtech startup, is leading the charge against the problem. With a philosophy rooted in the age-old wisdom that prevention is better than cure, the startup is harnessing the power of tech to help physicians detect diseases at an early, asymptomatic stage.

To power their innovation efforts, the client needed top-tier expertise in machine learning for healthcare to sift through healthcare data accumulated over many years and uncover patterns and valuable insights. That’s exactly what Symfa helped with. We brought cutting-edge machine learning development capabilities to the table, and our experts also had experience in the healthcare domain, which made us a perfect match for the challenge.

Solving the healthcare data puzzle

The project’s challenges were directly related to the very nature of healthcare data.

Turning health data into diagnosis

From removing imbalances in the dataset to identifying and training the best ML models for the task at hand, the team delivered a robust solution set to become an invaluable tool in the physicians’ toolkit.

Work done

01

Balancing the data

To rebalance the dataset, our ML engineers combined upsampling of the minority class with downsampling of the majority class. A second approach was to focus on the most recent two years of the 20-year data span, as this period contained the bulk of the data.
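
The resampling idea described above can be illustrated with a minimal sketch. It uses only the Python standard library and toy records; the `rebalance` function, the `label` key, and the "meet in the middle" target size are assumptions made for illustration, not the project's actual pipeline.

```python
import random

def rebalance(rows, label_key="label", target_per_class=None, seed=0):
    """Upsample smaller classes and downsample larger ones
    so that every class ends up with the same number of rows."""
    rng = random.Random(seed)
    by_class = {}
    for row in rows:
        by_class.setdefault(row[label_key], []).append(row)
    if target_per_class is None:
        # Hypothetical choice: meet in the middle between the class sizes.
        sizes = [len(group) for group in by_class.values()]
        target_per_class = (min(sizes) + max(sizes)) // 2
    balanced = []
    for group in by_class.values():
        if len(group) >= target_per_class:
            # Downsample: draw without replacement.
            balanced.extend(rng.sample(group, target_per_class))
        else:
            # Upsample: keep all originals, then draw extras with replacement.
            extra = rng.choices(group, k=target_per_class - len(group))
            balanced.extend(group + extra)
    rng.shuffle(balanced)
    return balanced

# Toy data: 95 negative records and 5 positive (AFib) records.
records = [{"patient_id": i, "label": 0} for i in range(95)]
records += [{"patient_id": 100 + i, "label": 1} for i in range(5)]

balanced = rebalance(records)
counts = {0: 0, 1: 0}
for row in balanced:
    counts[row["label"]] += 1
print(counts)
```

Resampling of this kind is normally applied to the training split only, so that evaluation metrics still reflect the real class distribution.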

02

Feature construction

Once the original descriptors in the dataset were defined, the team started feature engineering – building a new set of around 2,000 features and variables to be further used in an ML model.
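
As a rough illustration of feature construction, the sketch below turns a patient's raw readings into flat aggregate features. Libraries such as tsfresh and Featuretools automate this kind of aggregation at scale. The field names (`patient_id`, `test`, `value`) and the chosen statistics are hypothetical, not the features used in the project.

```python
from collections import defaultdict
from statistics import mean, pstdev

def build_features(measurements):
    """Aggregate each patient's readings into one flat feature row."""
    grouped = defaultdict(lambda: defaultdict(list))
    for m in measurements:
        grouped[m["patient_id"]][m["test"]].append(m["value"])

    features = {}
    for patient_id, tests in grouped.items():
        row = {}
        for test, values in tests.items():
            # One descriptor (e.g. heart rate) yields several features.
            row[f"{test}_mean"] = mean(values)
            row[f"{test}_min"] = min(values)
            row[f"{test}_max"] = max(values)
            row[f"{test}_std"] = pstdev(values)
            row[f"{test}_count"] = len(values)
        features[patient_id] = row
    return features

# Toy readings for two hypothetical patients.
measurements = [
    {"patient_id": 1, "test": "heart_rate", "value": 70},
    {"patient_id": 1, "test": "heart_rate", "value": 90},
    {"patient_id": 1, "test": "systolic_bp", "value": 120},
    {"patient_id": 2, "test": "heart_rate", "value": 60},
]
features = build_features(measurements)
print(features[1])
```

Because each raw descriptor expands into several aggregates, a modest number of descriptors can quickly grow into thousands of candidate features.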

03

Model training

Among a great variety of ML approaches, our experts focused on more explainable tree-based models, namely random forest (an ensemble of decision trees) and XGBoost (gradient-boosted decision trees), and applied the SHAP method to make the predictions of the resulting models more transparent and easier to interpret.
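
SHAP explains a prediction by assigning each feature a Shapley value, its average marginal contribution across all feature subsets. The sketch below computes exact Shapley values for a toy scoring function to show the idea. The `risk_score` function, its coefficients, and the feature names are invented for illustration and are not the client's model; the SHAP library itself relies on faster model-specific or sampling-based algorithms rather than this brute-force enumeration.

```python
from itertools import combinations
from math import factorial

def shapley_values(predict, instance, baseline):
    """Exact Shapley values of `predict` at `instance` relative to `baseline`.

    Features outside a coalition are set to their baseline value.
    The cost is exponential in the number of features, so this
    only suits toy examples.
    """
    names = list(instance)
    n = len(names)

    def value(coalition):
        x = {k: (instance[k] if k in coalition else baseline[k]) for k in names}
        return predict(x)

    result = {}
    for name in names:
        others = [k for k in names if k != name]
        total = 0.0
        for size in range(n):
            # Shapley weight for coalitions of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                coalition = set(subset)
                total += weight * (value(coalition | {name}) - value(coalition))
        result[name] = total
    return result

# Hypothetical risk score with one interaction term.
def risk_score(x):
    return (0.02 * x["age"]
            + 0.5 * x["irregular_pulse"]
            + 0.01 * x["age"] * x["irregular_pulse"])

patient = {"age": 70, "irregular_pulse": 1}
reference = {"age": 50, "irregular_pulse": 0}
contributions = shapley_values(risk_score, patient, reference)
print(contributions)
```

The contributions always sum to the difference between the patient's score and the reference score, which is what lets a physician read them as a breakdown of a single prediction.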

04

Reports preparation

Alongside model training, our engineers compiled reports with confusion matrices and precision-recall curves to help healthcare professionals assess how reliably the models identify true positive and true negative cases.
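
To make the report metrics concrete, here is a minimal sketch of how a confusion matrix, precision, and recall relate. The labels below are toy data chosen so that the numbers come out to 80% precision and 20% recall; they are not the project's results set, and the helper names are hypothetical.

```python
def confusion_matrix(y_true, y_pred):
    """Count the four outcomes of a binary classifier (1 = AFib, 0 = no AFib)."""
    cm = {"tp": 0, "fp": 0, "fn": 0, "tn": 0}
    for actual, predicted in zip(y_true, y_pred):
        if actual == 1 and predicted == 1:
            cm["tp"] += 1
        elif actual == 0 and predicted == 1:
            cm["fp"] += 1
        elif actual == 1 and predicted == 0:
            cm["fn"] += 1
        else:
            cm["tn"] += 1
    return cm

def precision(cm):
    """Share of flagged patients who really have the condition."""
    return cm["tp"] / (cm["tp"] + cm["fp"])

def recall(cm):
    """Share of patients with the condition that the model flagged."""
    return cm["tp"] / (cm["tp"] + cm["fn"])

# Toy cohort: 20 patients with AFib, 80 without.
y_true = [1] * 20 + [0] * 80
# The model flags 5 patients: 4 correctly and 1 false alarm.
y_pred = [1] * 4 + [0] * 16 + [1] * 1 + [0] * 79

cm = confusion_matrix(y_true, y_pred)
print(cm)
print(f"precision={precision(cm):.2f} recall={recall(cm):.2f}")
```

A precision-recall curve repeats this calculation at many decision thresholds, showing how much recall can be gained for a given loss of precision.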

Technologies

  • Pandas
  • Dask
  • Featuretools
  • tsfresh
  • random forest
  • XGBoost
  • TabNet
  • SHAP
  • StackingClassifier

A robust ML model that can be easily adapted to new use cases

The developed ML solution detects cases of atrial fibrillation (AFib) with 80% precision, helping physicians prevent strokes and heart failure while accelerating patients' access to crucial treatment. In addition, the model can be easily retrained on new data to expand its area of application and help the client gain a stronger foothold in the healthtech market.

80% precision
20% recall
20 years of data for sampling
2,000 descriptors for feature construction
