1. Data Collection
I exported data from the Health app on my iPhone to identify days of abnormal exercise. The data spans 2018 to 2025.
Below is the list of available features. I chose Steps (HKQuantityTypeIdentifierStepCount), Distance (HKQuantityTypeIdentifierDistanceWalkingRunning), Speed (HKQuantityTypeIdentifierWalkingSpeed), and Flights Climbed (HKQuantityTypeIdentifierFlightsClimbed).
Available Features
['HKCategoryTypeIdentifierAppleStandHour',
'HKQuantityTypeIdentifierHeartRateVariabilitySDNN',
'HKQuantityTypeIdentifierActiveEnergyBurned',
'HKQuantityTypeIdentifierWalkingStepLength',
'HKQuantityTypeIdentifierWalkingSpeed',
'HKQuantityTypeIdentifierRestingHeartRate',
'HKQuantityTypeIdentifierFlightsClimbed',
'HKQuantityTypeIdentifierHeight',
'HKQuantityTypeIdentifierWalkingAsymmetryPercentage',
'HKQuantityTypeIdentifierBodyMass',
'HKQuantityTypeIdentifierAppleExerciseTime',
'HKQuantityTypeIdentifierWalkingHeartRateAverage',
'HKQuantityTypeIdentifierWalkingDoubleSupportPercentage',
'HKQuantityTypeIdentifierBasalEnergyBurned',
'HKQuantityTypeIdentifierHeadphoneAudioExposure',
'HKQuantityTypeIdentifierStepCount',
'HKQuantityTypeIdentifierHeartRate',
'HKQuantityTypeIdentifierAppleWalkingSteadiness',
'HKQuantityTypeIdentifierDistanceWalkingRunning',
'HKCategoryTypeIdentifierHighHeartRateEvent']
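A list like the one above can be produced by scanning the export for record types. The sketch below is a minimal illustration, not the code used in this project: it assumes the export is an XML file made of `<Record>` elements with a `type` attribute, and `SAMPLE_XML` and `count_record_types` are hypothetical names standing in for the real file and helper.

```python
import xml.etree.ElementTree as ET
from collections import Counter
from io import BytesIO

# Tiny stand-in for a Health export (assumed layout: <Record type="..."/> elements).
SAMPLE_XML = b"""<HealthData>
  <Record type="HKQuantityTypeIdentifierStepCount" startDate="2021-03-01 08:00:00 -0700" value="120"/>
  <Record type="HKQuantityTypeIdentifierStepCount" startDate="2021-03-01 09:00:00 -0700" value="80"/>
  <Record type="HKQuantityTypeIdentifierFlightsClimbed" startDate="2021-03-01 09:05:00 -0700" value="2"/>
</HealthData>"""


def count_record_types(source):
    """Stream through the XML and count how many records each type has."""
    counts = Counter()
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == "Record":
            counts[elem.get("type")] += 1
        elem.clear()  # release the element so large exports do not pile up in memory
    return counts


type_counts = count_record_types(BytesIO(SAMPLE_XML))
print(sorted(type_counts))
```

Streaming with `iterparse` instead of loading the whole tree is a deliberate choice here, since several years of records can make the export large.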
2. Preprocessing
Data was aggregated to daily values using preprocess.py.
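The contents of preprocess.py are not shown here, so the following is only a sketch of what daily aggregation could look like. It assumes that count-like features (steps, distance, flights) are summed per day and that speed is averaged; the record layout, `SUM_FEATURES`, and `aggregate_daily` are hypothetical.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical raw records: (feature, timestamp, value).
RECORDS = [
    ("steps", "2021-03-01 08:00:00", 120.0),
    ("steps", "2021-03-01 18:30:00", 380.0),
    ("speed", "2021-03-01 08:00:00", 2.0),
    ("speed", "2021-03-01 18:30:00", 3.0),
    ("steps", "2021-03-02 07:15:00", 900.0),
]

# Assumption: these features accumulate over a day; anything else is averaged.
SUM_FEATURES = {"steps", "distance", "flights"}


def aggregate_daily(records):
    """Collapse raw records into one value per (day, feature)."""
    buckets = defaultdict(list)
    for feature, timestamp, value in records:
        day = timestamp[:10]  # "YYYY-MM-DD" prefix of the timestamp
        buckets[(day, feature)].append(value)

    daily = defaultdict(dict)
    for (day, feature), values in buckets.items():
        daily[day][feature] = sum(values) if feature in SUM_FEATURES else mean(values)
    return dict(daily)


daily = aggregate_daily(RECORDS)
print(daily)
```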
My usual first approach to finding anomalies is to compute the mean and standard deviation, convert each value to a z-score, and flag any value more than 3 standard deviations from the mean. However, this works well only if the data is approximately normally distributed, so let's check the distributions first.

Histogram of Steps

Histogram of Distance

Histogram of Speed (mph)

Histogram of Flights Climbed
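For reference, the z-score rule described at the start of this section can be sketched as follows. This is an illustration on made-up numbers, not part of the project code; `zscore_outliers` and `daily_steps` are hypothetical names, and the population standard deviation is an arbitrary choice for the example.

```python
from statistics import mean, pstdev


def zscore_outliers(values, threshold=3.0):
    """Return the indices of values more than `threshold` standard deviations from the mean."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return []  # all values identical, nothing can be an outlier
    return [i for i, v in enumerate(values) if abs(v - mu) / sigma > threshold]


# Made-up data: 29 ordinary days and one very active day.
daily_steps = [4000.0] * 29 + [30000.0]
flagged = zscore_outliers(daily_steps)
print(flagged)
```

Because the mean and standard deviation are themselves pulled around by extreme values, this rule becomes less reliable on skewed data, which is why the histograms matter.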
3. Data Adjustments
Looking at the data over time, it appears that my Apple devices did not start recording speed until late 2020, so records from before that point were excluded to keep all features consistent.

Steps Over Time

Distance Over Time

Speed Over Time

Flights Over Time
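One way to apply the cutoff described in this section is to find the first day on which every selected feature is present and drop everything before it. This is a sketch under that assumption; the actual cutoff used in the project is not stated beyond "late 2020", and `trim_to_complete_features` and the sample rows are hypothetical.

```python
FEATURES = ("steps", "distance", "speed", "flights")

# Hypothetical daily rows keyed by ISO date; early days have no speed reading.
daily = {
    "2020-08-30": {"steps": 3500.0, "distance": 1.5, "flights": 3.0},
    "2020-09-14": {"steps": 4200.0, "distance": 1.8, "flights": 5.0},
    "2020-10-02": {"steps": 4100.0, "distance": 1.7, "speed": 2.4, "flights": 4.0},
    "2020-10-03": {"steps": 5200.0, "distance": 2.2, "speed": 2.5, "flights": 6.0},
}


def trim_to_complete_features(daily, features):
    """Keep only days on or after the first day that has every feature."""
    complete_days = [day for day, row in daily.items() if all(f in row for f in features)]
    if not complete_days:
        return {}
    first_complete = min(complete_days)  # ISO dates sort correctly as strings
    return {day: row for day, row in daily.items() if day >= first_complete}


trimmed = trim_to_complete_features(daily, FEATURES)
print(sorted(trimmed))
```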
4. Model Training
I trained an Isolation Forest with a 5% contamination rate using model.py.
All functionality (data I/O, model training, inference) is modularized in separate scripts
(utils.py, preprocess.py, model.py, inference.py).
The plot below shows every data point with its anomaly score:
Anomaly Scores
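The training step can be sketched with scikit-learn's `IsolationForest`. This is a minimal example on synthetic data, not the contents of model.py: the feature values are randomly generated stand-ins, and `random_state` is an assumption added only to make the run repeatable.

```python
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(0)

# Synthetic stand-in for the daily table; columns are steps, distance, speed, flights.
typical_days = rng.normal(loc=[4000, 1.7, 2.4, 5], scale=[800, 0.3, 0.2, 2], size=(190, 4))
active_days = rng.normal(loc=[11000, 4.2, 1.4, 12], scale=[800, 0.3, 0.2, 2], size=(10, 4))
X = np.vstack([typical_days, active_days])

# contamination=0.05 sets the score threshold so that roughly 5% of training days are flagged.
model = IsolationForest(contamination=0.05, random_state=42)
labels = model.fit_predict(X)          # -1 = anomaly, 1 = normal
scores = model.decision_function(X)    # lower means more anomalous; flagged days are below 0

n_flagged = int((labels == -1).sum())
print(f"{n_flagged} of {len(X)} days flagged")
```

Since the model isolates points with per-feature threshold splits, the features are left unscaled in this sketch.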
5. Results
The model flagged several anomalous days. Here are the anomalies plotted over time for each feature used.

Step Anomalies

Distance Anomalies

Speed Anomalies

Flights Anomalies
The scatterplots below show which features drive the anomaly detection. Anomalies tend to sit at the high and low ends of steps, at low values of speed, and at high values of flights climbed.

Steps vs Distance

Steps vs Speed

Steps vs Flights

Speed vs Flights
Average Comparison: Normal vs. Anomalous Days
The table below summarizes average feature values for normal and anomalous days.
| Feature | Normal Avg | Anomalous Avg | Difference (%) |
|---|---|---|---|
| Steps | 4,009.2 | 10,760.5 | +168% |
| Distance (miles) | 1.68 | 4.24 | +152% |
| Speed (mph) | 2.42 | 1.42 | -41% |
| Flights Climbed | 4.62 | 11.51 | +149% |
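The Difference (%) column is the relative change from the normal average to the anomalous average. The short check below recomputes it from the averages in the table; `percent_difference` is a hypothetical helper name.

```python
# (normal average, anomalous average) taken from the table above.
AVERAGES = {
    "Steps": (4009.2, 10760.5),
    "Distance (miles)": (1.68, 4.24),
    "Speed (mph)": (2.42, 1.42),
    "Flights Climbed": (4.62, 11.51),
}


def percent_difference(normal_avg, anomalous_avg):
    """Relative change of the anomalous average with respect to the normal average."""
    return (anomalous_avg - normal_avg) / normal_avg * 100


diffs = {name: round(percent_difference(n, a)) for name, (n, a) in AVERAGES.items()}
for name, pct in diffs.items():
    print(f"{name}: {pct:+d}%")
```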
As a final sanity check, I wanted to see whether my recent trip to Korea would be flagged, since I knew I did a lot of walking on the trip. As seen below, 7 of the 8 days were identified as anomalous exercise days.
Korea Trip Exercise
6. Future Work: Toward MLOps
This project is modularized with scalability in mind. If I had access to cloud infrastructure, I would extend it into an automated MLOps pipeline with scheduled retraining, model drift monitoring, and automated anomaly inference on new health data. When an anomaly is detected, a notification would remind the user to stay hydrated and get enough rest to recover.