App Interface#
View app: Streamlit
About the project#
Datasource: Kaggle
In this project, we will use linear regression with the scikit-learn
and XGBoost
libraries in Python to predict housing prices in Melbourne. The goal is to predict house prices based on user-provided data.
We will use data cleaning and preprocessing techniques to prepare the data for analysis.
Next, we will use linear regression to model the relationship between house characteristics and its market price. To assess the model’s quality, we will use metrics such as R² and MSE.
Finally, we create a web application using Streamlit that can predict the value of a property based on user-provided data. We will analyze the results and understand the project’s business vision, which can be useful for real estate professionals, investors, and even individuals looking to buy a home.
Data Description#
The variables in the dataset are:
Rooms
: Number of rooms.Price
: Price in dollars.Method
: S - property sold; SP - property sold prior; PI - property passed in; PN - sold prior not disclosed; SN - sold not disclosed; NB - no bid; VB - vendor bid; W - withdrawn prior to auction; SA - sold after auction; SS - sold after auction price not disclosed; N/A - price or highest bid not available.Type
: br - bedroom(s); h - house/cottage/villa/semi/terrace; u - unit/duplex; t - townhouse; dev site - development site; o res - other residential.SellerG
: Real estate agent.Date
: Date sold.Distance
: Distance from CBDRegionname
: General Region (West, North West, North, North East etc.)Propertycount
: Number of properties that exist in the suburb.Bedroom2
: Scraped # of bedrooms (from different source)Bathroom
: Number of bathroomsCar
: Number of carspotsLandsize
: Land sizeBuildingArea
: Building sizeCouncilArea
: Governing council for the area
Project Steps#
- Importing libraries and loading dataset
- Data Preprocessing
- Exploratory Data Analysis
- Model training and evaluation with
XGBoost
- Visualizing predictions
- Making sample prediction
- Saving model using
joblib