
Level Up Coding


How to Get Home Property Data to Analyze Your Market using Python

Python tutorial to get property data for on and off-market deals

8 min read · May 25, 2022


Photo by Blake Wheeler on Unsplash

Property data is awesome to analyze.

But how do we get property data without being a real estate agent?

We need to use web scraping to retrieve home data from public real estate sites.

Can we do better? Web scraping is tedious!

Yes! We can use APIs that already scrape public-facing real estate sites for us, then query and consume the structured data they return.

What type of data can we get?

In this article, we will get more than 200 data points for each property in a list, including number of bedrooms, square footage, lot size, Zestimate, and more.

This post will get data for properties that are for sale (on-market) and properties that are not for sale (off-market) using Python.

Photo by Campaign Creators on Unsplash

Problem Statement

We have a list of property addresses. We need to get property detail for each address.

The property detail will include home characteristics, sold history, tax data, and property estimates.

This will allow us to analyze properties in our real estate market.

Can we append property information to our list?

Yes, we can.

Data Source

We will use the Zillow.com API by APIMaker to get property detail.

This API already does the web scraping for us. It provides property information through several endpoints.

Disclosure: I am not the creator of the API; I am solely a consumer.

Framework

We will follow a four-step framework to gather property information.

Image by author created in Google Slides
  1. Upload a file with a list of property addresses
  2. Find the associated Zillow unique id (ZPID)
  3. Search ZPID and get property details from the API
  4. Append property details to our original file

Prerequisite

  1. Sign up for a free RapidAPI account to get an API key
  2. Subscribe to Zillow.com API to request property data

The Zillow.com API provides an option to subscribe for 20 FREE API Credits / mo (one API Credit = one API Call).

Image by author (screenshot from Rapid API pricing)

Supporting Video

Follow along in my Python tutorial video.

Video created by author on YouTube

Python Tutorial

If you do not have an existing Python environment, I highly suggest first cloning the notebook (linked at the bottom of the article).

This will allow you to run the Python code in Google Colab (free!). It is a cloud-based environment that lets you run code without having to install Python locally.

I. Install Packages

The first step is installing the necessary packages.

Code snippet for Installing Packages (Image by author created using snappify.io)

II. Import Libraries

Next, import the required libraries.

Code snippet for Imports (Image by author created using snappify.io)

III. Locals & Constants

Sign up for a free RapidAPI account and subscribe to Zillow.com API.

Create a variable to hold our API key.

Code snippet for API key (Image by author created using snappify.io)
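
As a minimal sketch of one way to hold the key, the snippet below reads it from an environment variable instead of pasting it into the notebook. The variable name `RAPIDAPI_KEY` and the helper `load_api_key` are illustrative choices, not names from the original notebook.

```python
import os


def load_api_key(env_var: str = "RAPIDAPI_KEY") -> str:
    """Return the RapidAPI key stored in an environment variable."""
    key = os.environ.get(env_var, "").strip()
    if not key:
        # Fail early with a clear message instead of sending an empty key
        raise RuntimeError(f"Set the {env_var} environment variable to your RapidAPI key")
    return key
```

Keeping the key out of the notebook cells makes it harder to leak by accident when the notebook is shared.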

IV. Data

Single Property Search

Let’s start off with retrieving property data for a single address.

To request data from the API, we need to provide the ZPID of the address. We will get this ID by replicating a Google search.

Steps:

  1. Get ZPID (unique identifier for each property stored in the URL)
  2. Get Property Details data
Image by author (screenshot from Zillow.com)

Let’s generate our property search phrase to enter into the Google search function.

We add “ zillow home details” in our search string in order to get the Zillow link at the top of our list of URLs.

Code snippet (Image by author created using snappify.io)
Code output (Image by author created as a screenshot)
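
The search phrase can be sketched as follows. The function name `build_search_query` and the sample address are made up for illustration; only the " zillow home details" suffix comes from the text above.

```python
def build_search_query(street: str, city: str, state: str, zip_code: str) -> str:
    """Combine the address parts and add the ' zillow home details' suffix."""
    address = f"{street} {city} {state} {zip_code}"
    # Collapse repeated whitespace so a blank field does not leave a gap
    return " ".join(address.split()) + " zillow home details"


# Made-up address, for illustration only
print(build_search_query("123 Main St", "Tampa", "FL", "33602"))
# 123 Main St Tampa FL 33602 zillow home details
```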

Pass the query string to the Google search function and set the stop value to 3 to return the top three search results.

Code snippet (Image by author created using snappify.io)

Select the first URL, which is the most relevant in the search.

This returns the unique URL for our property. The ZPID is located at the end of the URL string.

Code output (Image by author created as a screenshot)

Let’s extract the ZPID in a few steps wrapped in one line of code:

  1. Split our URL by “/”
  2. Find the element that contains “zpid”
  3. Split that element by “_” and keep the first part to get the ZPID
Code snippet (Image by author created using snappify.io)

Here is our unique ID to pass into our API.

Code output (Image by author created as a screenshot)
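
The three extraction steps can be sketched like this. The function name `extract_zpid` is illustrative, and the sample URL is a made-up address and ID in the `.../<address>/<id>_zpid/` shape described above.

```python
from typing import Optional


def extract_zpid(url: str) -> Optional[str]:
    """Return the ZPID from a Zillow home-details URL, or None if it has none."""
    # 1. split on "/"  2. keep the piece containing "zpid"  3. split it on "_"
    pieces = [part for part in url.split("/") if "zpid" in part]
    return pieces[0].split("_")[0] if pieces else None


# Hypothetical URL; the address and ID are made up
example_url = "https://www.zillow.com/homedetails/123-Main-St-Tampa-FL-33602/12345678_zpid/"
print(extract_zpid(example_url))  # 12345678
```

Returning `None` when no piece matches makes it easy to skip addresses whose top search result is not a Zillow listing.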

We now make a request to the API to get data on our property address.

Code snippet (Image by author created using snappify.io)
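
Here is a hedged sketch of the request, written with the standard library's `urllib` so it has no extra dependencies (the original snippet may well have used a different HTTP library). The host `zillow-com1.p.rapidapi.com`, the `/property` path, and the `zpid` query parameter are assumptions about the RapidAPI listing; confirm them on the listing's endpoint page before relying on them.

```python
import json
import urllib.parse
import urllib.request

# Assumed values for the Zillow.com API listing on RapidAPI (not confirmed here)
API_HOST = "zillow-com1.p.rapidapi.com"
PROPERTY_URL = f"https://{API_HOST}/property"


def build_property_request(zpid: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a GET request for one property's details."""
    query = urllib.parse.urlencode({"zpid": zpid})
    headers = {"X-RapidAPI-Key": api_key, "X-RapidAPI-Host": API_HOST}
    return urllib.request.Request(f"{PROPERTY_URL}?{query}", headers=headers)


def fetch_property_detail(zpid: str, api_key: str, timeout: float = 30.0) -> dict:
    """Send the request and parse the JSON body into a Python dict."""
    request = build_property_request(zpid, api_key)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

Each call to `fetch_property_detail` spends one API credit, so it is worth testing with a single ZPID before looping over a list.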

We parse the JSON response into a Python object.

This outputs a lengthy set of data.

Code output (Image by author created as a screenshot)

Let’s transform this dataset into a pandas dataframe (rows and columns).

This will allow us to view our data in table format that we can download later on.

Code snippet (Image by author created using snappify.io)
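
One way to flatten a nested JSON response into a single-row table is `pandas.json_normalize`; whether the original snippet used it is an assumption. The `sample_response` below is a tiny made-up stand-in for the real response, which is far larger.

```python
import pandas as pd

# Tiny stand-in for the API response; field names and values are made up
sample_response = {
    "zpid": 12345678,
    "bedrooms": 3,
    "zestimate": 350000,
    "address": {"city": "Tampa", "state": "FL"},
}

# json_normalize flattens nested objects into dotted column names
df_property = pd.json_normalize(sample_response)
print(df_property.shape)          # (1, 5)
print(list(df_property.columns))
```

Nested fields such as `address` become columns like `address.city`, which is how one property can expand into hundreds of columns.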

For our single property address, we have 259 columns of data! Wow!

Code output (Image by author created as a screenshot)

Let’s select a few columns from our dataset to view.

Code snippet (Image by author created using snappify.io)

We have information on home characteristics as well as property estimates.

Code output (Image by author created as a screenshot)

Check out my post on how to calculate cash flow based on property estimates.

List of Properties

We tested our framework to get data for a single property.

Now, it is time to upload our own list of properties and get data for each row.

Steps:

  1. Upload CSV file — Check out PropStream for on and off-market deals
  2. Get ZPID (unique identifier for each property stored in the URL)
  3. Get Property Details data
Photo by Michael Tuszynski on Unsplash

For this example, I uploaded a file of property addresses for tax delinquent owners. I downloaded this data from PropStream.

Code snippet (Image by author created using snappify.io)

In the file, we have a list of properties. There are four columns related to property address — Address, City, State, and Zip.

Let’s pass the property address columns into our code!

Code output (Image by author created as a screenshot)
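
A minimal sketch of loading such a file, assuming the four address columns are named exactly Address, City, State, and Zip. The helper `load_addresses` and the sample row are illustrative; reading Zip as text is a suggested precaution, not something the original states.

```python
import io

import pandas as pd


def load_addresses(source) -> pd.DataFrame:
    """Read the property list, keeping Zip as text so leading zeros survive."""
    df = pd.read_csv(source, dtype={"Zip": str})
    missing = {"Address", "City", "State", "Zip"} - set(df.columns)
    if missing:
        raise ValueError(f"Missing address columns: {sorted(missing)}")
    return df


# In-memory stand-in for the uploaded file; the row is made up
sample_csv = io.StringIO("Address,City,State,Zip\n1 Elm St,Boston,MA,02108\n")
df_upload = load_addresses(sample_csv)
print(df_upload.loc[0, "Zip"])  # 02108
```

Without the `dtype` argument, pandas would read `02108` as the integer 2108 and the search phrase would carry the wrong ZIP code.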

Functions

We need to recreate the same steps we performed to get data for our single property.

Let’s set up functions to repeat the process of retrieving the unique ID and requesting data from the API.

Function #1 — Get ZPID using Google Search

Code snippet (Image by author created using snappify.io)
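
A hedged sketch of Function #1 follows. To keep it runnable without network access, the search step is passed in as `search_fn`, any callable that takes a query and yields result URLs; a real Google search client would be plugged in there. Unlike the single-property walkthrough, which takes the first URL, this version scans the top results for the first one that carries a ZPID. All names here are illustrative.

```python
from itertools import islice
from typing import Callable, Iterable, Optional


def get_zpid(street: str, city: str, state: str, zip_code: str,
             search_fn: Callable[[str], Iterable[str]],
             max_results: int = 3) -> Optional[str]:
    """Search for the address and return the ZPID from the first Zillow hit."""
    query = f"{street} {city} {state} {zip_code} zillow home details"
    for url in islice(search_fn(query), max_results):
        for part in url.split("/"):
            if part.endswith("_zpid"):
                return part.split("_")[0]
    return None  # no home-details link among the top results


# Fake search function standing in for a real search client; URLs are made up
def fake_search(query: str):
    return ["https://example.com/listing",
            "https://www.zillow.com/homedetails/1-Elm-St/987654_zpid/"]


print(get_zpid("1 Elm St", "Boston", "MA", "02108", fake_search))  # 987654
```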

Function #2 — Get property detail from the API

Code snippet (Image by author created using snappify.io)

Here we set up a for loop to perform actions for each row in our spreadsheet.

Steps:

  1. Map address related columns to variables (street, city, state, zip_code)
  2. Call Function #1 to get the ZPID
  3. Pause the script so we do not send Google Search requests too quickly
  4. Call Function #2 to get property detail from the API
  5. Transform the JSON object to a dataframe and append it to a list
Code snippet (Image by author created using snappify.io)
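
The five steps above can be sketched as one function. The two lookups are passed in as callables (`get_zpid` and `get_detail`) so the loop can be tried with stand-ins before spending API credits. The 5-second default pause and the use of `pandas.json_normalize` are assumptions, as are the column names Address, City, State, and Zip.

```python
import time
from typing import Callable, List, Optional

import pandas as pd


def collect_property_details(
    df: pd.DataFrame,
    get_zpid: Callable[[str, str, str, str], Optional[str]],
    get_detail: Callable[[str], dict],
    pause_seconds: float = 5.0,
) -> List[pd.DataFrame]:
    """Return one flattened dataframe per address that resolves to a ZPID."""
    df_list = []
    for row in df.itertuples(index=False):
        # 1. Map the address columns to variables
        street, city, state, zip_code = row.Address, row.City, row.State, row.Zip
        # 2. Look up the ZPID
        zpid = get_zpid(street, city, state, zip_code)
        # 3. Pause between searches
        time.sleep(pause_seconds)
        if zpid is None:
            continue  # nothing to request for this address
        # 4. Request the property detail, 5. flatten it and collect it
        df_list.append(pd.json_normalize(get_detail(zpid)))
    return df_list
```

Skipping rows without a ZPID avoids spending an API credit on a request that cannot succeed.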

We have a list of 5 dataframes in our df_list object. Each dataframe represents the response we received from the property details API.

Let’s concatenate these dataframes to create one single table.

Code snippet (Image by author created using snappify.io)
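
A small sketch of the concatenation with `pandas.concat`, using two made-up one-row dataframes in place of the real API responses:

```python
import pandas as pd

# Two one-row dataframes standing in for entries of df_list; values are made up
df_list = [
    pd.DataFrame([{"zpid": 111, "zestimate": 250000}]),
    pd.DataFrame([{"zpid": 222, "zestimate": 410000, "rentZestimate": 2100}]),
]

# ignore_index renumbers the rows; a column missing from one frame becomes NaN
df_details = pd.concat(df_list, ignore_index=True)
print(df_details.shape)  # (2, 3)
```

Responses do not always share the same fields, so the combined table takes the union of all columns.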

This looks great! We have all the property detail information like Zestimate in a single table.

But, how do we merge this new table with our original dataset?

Code output (Image by author created as a screenshot)

Let’s merge our original and new dataframe on the unique column — ZPID.

Code snippet (Image by author created using snappify.io)
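
A sketch of the merge, assuming the original dataframe already carries a `zpid` column saved during the lookup step. The column name, the left join, and the sample values are illustrative assumptions.

```python
import pandas as pd

# Made-up stand-ins for the uploaded list and the API results
df_upload = pd.DataFrame({"Address": ["1 Elm St", "2 Oak Ave"], "zpid": ["111", "222"]})
df_details = pd.DataFrame({"zpid": [111], "zestimate": [250000]})

# The key must have the same dtype on both sides, so cast the API's integers to text
df_details["zpid"] = df_details["zpid"].astype(str)

# how="left" keeps every uploaded row, even those with no API result
df_merged = df_upload.merge(df_details, on="zpid", how="left")
print(df_merged)
```

Rows without a match keep their original columns and get NaN in the new ones, so no address is silently dropped.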

This gives us a dataframe of 300+ columns.

Definitely not user friendly!

Code output (Image by author created as a screenshot)

Let’s trim down the number of columns in our merge by selecting a subset of the property details columns.

Imagine that for our use case we only need property estimates to calculate metrics like cash flow.

We select 3 columns: ZPID, Zestimate, and rentZestimate.

Code snippet (Image by author created using snappify.io)
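
Selecting the subset can be sketched with `DataFrame.reindex`, which tolerates a column that is absent from the results. The lowercase column names `zpid` and `zestimate` are assumptions about how the API spells its fields, and the data is made up.

```python
import pandas as pd

# Made-up property details; note that rentZestimate is missing entirely
df_details = pd.DataFrame({
    "zpid": ["111", "222"],
    "zestimate": [250000, 410000],
    "bedrooms": [3, 4],
})

keep_cols = ["zpid", "zestimate", "rentZestimate"]
# reindex keeps only the requested columns and adds any missing one as NaN,
# where plain df_details[keep_cols] would raise a KeyError
df_estimates = df_details.reindex(columns=keep_cols)
print(list(df_estimates.columns))  # ['zpid', 'zestimate', 'rentZestimate']
```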

Now we have trimmed our dataset down to 44 columns.

We can see our new columns, Zestimate and rentZestimate, appended at the end of our dataframe.

Code output (Image by author created as a screenshot)

V. Visualize

By creating a Plotly box plot we can view the distribution of the Zestimate values.

Code snippet (Image by author created using snappify.io)
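
A box plot summarizes a column by its minimum, quartiles, and maximum. As a rough companion to the chart, this sketch computes those numbers with pandas rather than Plotly, using made-up Zestimate values. Quartile conventions vary between libraries, so the box edges in a chart may differ slightly from these figures.

```python
import pandas as pd

# Made-up Zestimate values, for illustration only
zestimates = pd.Series([200000, 290000, 350000, 420000, 560000], name="zestimate")

# Minimum, first quartile, median, third quartile, maximum
summary = zestimates.quantile([0, 0.25, 0.5, 0.75, 1.0])
print(summary)
```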

The property estimates in our dataset range from 200K to 560K.

This can help us target certain properties over others.

Code output (Image by author created as a screenshot)

VI. Automation

Check out my no code solution to upload your custom file and get property details.

Video output (Video by author on AnalyticsAriel YouTube channel)

Conclusion

Leveraging APIs is a great way to retrieve property data.

Using property datasets alongside economic data from the Census can provide insight on how your real estate market is performing and what future trends exist.

Check out my YouTube channel — AnalyticsAriel to get more insight on real estate data sources and data analytics!

Clone notebook

Sources

Ariel Herrera