This project uses simple linear regression to predict housing prices based on square footage. It is a beginner-friendly machine learning model built with Python, following best practices in data preprocessing, model training, and evaluation.
predict-home-prices/
├── Project1.py # Python script with full code
├── home_dataset.csv # Dataset containing house prices and square footage
└── README.md # Project documentation
The dataset used (home_dataset.csv) contains two columns:
- area: Size of the house in square feet
- price: Price of the house in INR (lakhs)
This is a small synthetic dataset, ideal for learning the basics of linear regression.
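As a rough sketch of what loading this two-column layout might look like, the snippet below reads a few made-up rows from an in-memory string. The values are invented for illustration and are not taken from home_dataset.csv; only the column names `area` and `price` come from the description above.

```python
import io

import pandas as pd

# Made-up rows for illustration only; the real values live in home_dataset.csv.
sample_csv = io.StringIO(
    "area,price\n"
    "1000,50\n"
    "1500,75\n"
    "2000,100\n"
)

# With the real file, this would be pd.read_csv("home_dataset.csv").
df = pd.read_csv(sample_csv)

# Quick sanity checks: column types and summary statistics.
print(df.dtypes)
print(df.describe())
```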
- Python (3.x)
- Pandas
- NumPy
- Matplotlib
- Scikit-learn (for Linear Regression)
- Load the dataset using Pandas
- Visualize the relationship between house area and price
- Train a linear regression model using scikit-learn
- Predict price for a house with a given area
- Plot the regression line over the scatterplot
- Export the model using joblib (optional)
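The train, predict, and export steps above could be sketched roughly as follows. This is not the code in Project1.py: the data frame is invented stand-in data, and the 1800 sq ft query and the file name `home_price_model.joblib` are hypothetical choices made for the example.

```python
import joblib
import pandas as pd
from sklearn.linear_model import LinearRegression

# Stand-in data (invented); replace with pd.read_csv("home_dataset.csv").
df = pd.DataFrame(
    {
        "area": [1000, 1500, 2000, 2500],
        "price": [50.0, 75.0, 100.0, 125.0],
    }
)

X = df[["area"]]  # scikit-learn expects a 2-D feature matrix
y = df["price"]   # 1-D target

# Fit a simple linear regression: price = slope * area + intercept.
model = LinearRegression()
model.fit(X, y)

# Predict the price for a house of a given area (hypothetical 1800 sq ft).
new_house = pd.DataFrame({"area": [1800]})
predicted_price = model.predict(new_house)[0]
print(f"Predicted price for 1800 sq ft: {predicted_price:.2f}")

# Optional: persist the fitted model and load it back.
joblib.dump(model, "home_price_model.joblib")
restored = joblib.load("home_price_model.joblib")
```

Passing a data frame with the same `area` column at prediction time keeps the feature names consistent with those seen during fitting.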
Raw Data Plot (House Prices vs. Size): Shows the actual distribution of the dataset before model fitting.
Linear Regression Fit: Shows the relationship between house size (sq. ft) and price (millions $). The red line shows the house prices predicted by the linear model.
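A minimal sketch of how the two plots described above might be drawn with Matplotlib is shown here. The data is invented, the output file name `regression_fit.png` is a hypothetical choice, and the non-interactive `Agg` backend is used only so the example runs without a display.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend; remove to use plt.show() instead

import matplotlib.pyplot as plt
import pandas as pd
from sklearn.linear_model import LinearRegression

# Stand-in data (invented); replace with pd.read_csv("home_dataset.csv").
df = pd.DataFrame(
    {
        "area": [1000, 1500, 2000, 2500],
        "price": [52.0, 73.0, 101.0, 124.0],
    }
)

model = LinearRegression().fit(df[["area"]], df["price"])

fig, ax = plt.subplots()

# Raw data: one point per house.
ax.scatter(df["area"], df["price"], color="blue", label="Actual prices")

# Fitted line: model predictions over the same areas.
ax.plot(df["area"], model.predict(df[["area"]]), color="red", label="Regression line")

ax.set_xlabel("Area (sq ft)")
ax.set_ylabel("Price")
ax.set_title("House price vs. area")
ax.legend()

fig.savefig("regression_fit.png")
```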
- Applied linear regression using scikit-learn
- Understood the relationship between variables (area vs. price)
- Practiced data visualization with Matplotlib
- Explored basic model evaluation and prediction
- Use multiple linear regression with more features (e.g., number of bedrooms, location)
- Implement model evaluation metrics like R² score and RMSE
- Deploy the model using Flask or Streamlit
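As one possible starting point for the evaluation item above, the sketch below computes the R² score and RMSE on a held-out split. The data is invented, and the 25% test split and `random_state=42` are arbitrary assumptions rather than settings from this project.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split

# Stand-in data (invented); replace with pd.read_csv("home_dataset.csv").
df = pd.DataFrame(
    {
        "area": [800, 1000, 1200, 1500, 1800, 2000, 2400, 3000],
        "price": [42.0, 51.0, 58.0, 77.0, 88.0, 102.0, 118.0, 152.0],
    }
)

X = df[["area"]]
y = df["price"]

# Hold out part of the data so the metrics reflect unseen examples.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42
)

model = LinearRegression().fit(X_train, y_train)
y_pred = model.predict(X_test)

# R^2: share of variance in price explained by the model (1.0 is a perfect fit).
r2 = r2_score(y_test, y_pred)

# RMSE: square root of the mean squared error, in the same units as price.
rmse = np.sqrt(mean_squared_error(y_test, y_pred))

print(f"R^2: {r2:.3f}")
print(f"RMSE: {rmse:.3f}")
```

With a dataset this small, the scores vary noticeably between splits, so they are best read as a rough check rather than a reliable estimate.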
Based on the tutorial by Codédex

