Airport Taxi Demand Forecasting
Hourly time-series forecast of airport taxi demand built on lag features, seasonality analysis, and temporal validation. ARIMA, random forest, and XGBoost were benchmarked against an RMSE target of 48 or lower.
Key metrics.
Random Forest: strongest final result in the notebook
RMSE 42.9: below the project threshold
RMSE <= 48: required forecast quality bar
1 hour: forecast horizon, predicting next-hour demand
What the project tries to solve.
Forecast the number of taxi orders in the next hour so airport driver supply can be positioned more effectively during demand spikes.
Status: Study project, being refactored for portfolio use.
Notebook: airport_taxi_demand_forecasting.ipynb
Repo path: data-science-projects/airport-taxi-demand-forecasting
This is the natural first notebook to turn into a polished case study: the business question is easy to understand, the forecast horizon is concrete, and the validation story naturally teaches good time-series practice.
This project shows time-series judgment rather than generic tabular modeling. It highlights temporal feature engineering, seasonality awareness, and the difference between proper forecasting validation and ordinary shuffled cross-validation.
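To make the validation point concrete, here is a minimal sketch of why shuffled cross-validation is a problem for forecasting. It is not code from the notebook: the 200-row array is a stand-in for consecutive hourly observations, and `trains_on_future` is a hypothetical helper. It compares scikit-learn's `KFold` with shuffling against `TimeSeriesSplit` and checks whether any fold trains on rows that come after the rows it is tested on.

```python
import numpy as np
from sklearn.model_selection import KFold, TimeSeriesSplit

# Stand-in for 200 consecutive hourly observations (row order = time order).
X = np.arange(200).reshape(-1, 1)


def trains_on_future(splitter, X):
    """Return True if any fold trains on rows later than its earliest test row."""
    return any(train.max() > test.min() for train, test in splitter.split(X))


shuffled_leaks = trains_on_future(KFold(n_splits=5, shuffle=True, random_state=0), X)
temporal_leaks = trains_on_future(TimeSeriesSplit(n_splits=5), X)

print("shuffled K-fold trains on future rows:", shuffled_leaks)
print("TimeSeriesSplit trains on future rows:", temporal_leaks)
```

With shuffling, the model is scored on hours that sit between hours it has already seen, which tends to flatter the error estimate. `TimeSeriesSplit` keeps every training fold strictly before its test fold.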
How I approached it.
Built lagged and calendar-based features from hourly order history.
Explored seasonality, trend, and autocorrelation to shape the forecasting setup.
Benchmarked ARIMA, random forest, and XGBoost against the same RMSE target.
Used temporal validation rather than ordinary shuffled cross-validation to preserve forecasting integrity.
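The feature-building step above could look roughly like the sketch below. The column name `num_orders`, the lag set, the 24-hour rolling window, and the synthetic Poisson data are all assumptions for illustration, not the notebook's actual choices. The key detail is that every feature for hour t is computed only from earlier hours.

```python
import numpy as np
import pandas as pd


def make_features(orders: pd.Series, lags=(1, 2, 24), rolling_window=24) -> pd.DataFrame:
    """Build lag, rolling-mean, and calendar features for next-hour forecasting.

    `orders` is an hourly series with a DatetimeIndex. Lag and rolling
    features for a row use only values from earlier hours.
    """
    df = pd.DataFrame({"num_orders": orders})
    # Calendar features taken from the timestamp itself.
    df["hour"] = df.index.hour
    df["dayofweek"] = df.index.dayofweek
    # Lag features: the value observed `lag` hours earlier.
    for lag in lags:
        df[f"lag_{lag}"] = df["num_orders"].shift(lag)
    # Shift before rolling so the window ends at t-1 and excludes the target.
    df["rolling_mean"] = df["num_orders"].shift(1).rolling(rolling_window).mean()
    # Drop the warm-up rows that lack a full history.
    return df.dropna()


# Synthetic hourly counts, for illustration only (not the project's data).
idx = pd.date_range("2018-03-01", periods=24 * 14, freq="h")
rng = np.random.default_rng(0)
orders = pd.Series(rng.poisson(80, size=len(idx)), index=idx)

features = make_features(orders)
X = features.drop(columns="num_orders")
y = features["num_orders"]
print(X.head())
```

Shifting by one hour before taking the rolling mean is the easy step to miss: without it, the window would include the value being predicted.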
What I would improve next.
Rewrite the notebook into a tighter narrative with clearer conclusions attached to each chart.
Add a cleaner walk-forward validation section and compare against a stronger naive baseline.
Package the best model into a simple forecast endpoint or dashboard mockup.
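As a starting point for the walk-forward item above, here is a small sketch of one-step-ahead evaluation with two naive baselines. The function names, the 24-hour seasonal period, the one-week holdout, and the synthetic series are my assumptions, and a seasonal-naive forecast is only one candidate for a "stronger" baseline. Any model would need to beat both scores to justify its complexity.

```python
import numpy as np


def rmse(y_true, y_pred):
    """Root mean squared error between two equal-length sequences."""
    diff = np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean(diff ** 2)))


def walk_forward(y, forecast_fn, start):
    """One-step-ahead walk-forward evaluation.

    For each t >= start, forecast_fn sees only the history y[:t],
    and its forecast is scored against the observed y[t].
    """
    preds = [forecast_fn(y[:t]) for t in range(start, len(y))]
    return rmse(y[start:], preds)


def last_value(history):
    """Naive baseline: repeat the previous hour."""
    return history[-1]


def seasonal_naive(history, period=24):
    """Seasonal-naive baseline: repeat the same hour from the previous day."""
    return history[-period]


# Synthetic hourly series with a daily cycle, for illustration only.
rng = np.random.default_rng(0)
hours = np.arange(24 * 30)
y = 80 + 30 * np.sin(2 * np.pi * hours / 24) + rng.normal(0, 5, size=hours.size)

start = 24 * 23  # evaluate on the final week
scores = {
    "last value": walk_forward(y, last_value, start),
    "seasonal naive (24h)": walk_forward(y, seasonal_naive, start),
}
for name, score in scores.items():
    print(f"{name}: RMSE = {score:.2f}")
```

A fitted model can be plugged in by passing a `forecast_fn` that refits or updates on `history` and returns its next-hour prediction, at the cost of one fit per evaluated hour.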