TDM 20200: Mapping: Project 8 — Spring 2025

Motivation: We continue to learn how to plot maps in Python.

Context: We will focus on maps of Tippecanoe County.

Scope: Mapping

Learning Objectives:
  • We continue to learn how to make maps in Python

Make sure to read about, and use the template found here, and the important information about project submissions here.

Dataset(s)

In this project we will use the following mapping data:

  • /anvil/projects/tdm/data/tippecanoe/

Questions

Load Pandas as Geopandas as follows:

import pandas as pd
import geopandas as gpd

Question 1 (2 pts)

Review the examples from Project 7, Questions 3, 4, 5, where we learned how to put colors into a map.

Read the map data for Tippecanoe County addresses into a variable called mydata as follows:

mydata = gpd.read_file('/anvil/projects/tdm/data/tippecanoe/AddressPoints.shp')

Using pd.set_option('display.max_columns', None) you will have the ability to see all of the columns.

Display the head of mydata.

What is the shape of mydata (i.e., how many rows and columns are there)?

Create a new column called mycolors, which you should make 'orange' by default.

For all of the rows in which the GEOCITY column is equal to 'WEST LAFAYETTE', change the value of mycolors to 'green'.

For all of the rows in which the GEOCITY column is equal to 'LAFAYETTE', change the value of mycolors to 'purple'.

Plot all of the boundaries in mydata using the orange, green, and purple colors that you just created, as follows: mydata.plot(color = mydata['mycolors'])

Deliverables
  • Display the head of mydata.

  • Give the shape of mydata (i.e., the number of rows and columns).

  • Plot all of the boundaries in mydata using orange, green, and purple colors, as specified above.

  • Be sure to document your work from Question 1, using some comments and insights about your work.

Question 2 (2 pts)

Read the map data for Tippecanoe County boundaries into a variable called mydata as follows:

mydata = gpd.read_file('/anvil/projects/tdm/data/tippecanoe/Boundaries.shp')

Display the head of mydata.

What is the shape of mydata (i.e., how many rows and columns are there)? Notice that this data set is much smaller than the AddressPoints data from Question 1.

Notice that there is only 1 row in mydata that has Shape_Area bigger than 1 billion. In other words, if we create mymysteryDF = mydata[mydata['Shape_Area'] > 1000000000] then mymysteryDF is a data frame with just 1 row and 14 columns. Plot the region from mymysteryDF.

Deliverables
  • Display the head of mydata.

  • Give the shape of mydata (i.e., the number of rows and columns).

  • Plot the boundary of the 1 shape which has 'Shape_Area' larger than 1 billion.

  • Be sure to document your work from Question 2, using some comments and insights about your work.

Question 3 (2 pts)

Read the map data for Tippecanoe County parcels into a variable called mydata as follows:

mydata = gpd.read_file('/anvil/projects/tdm/data/tippecanoe/Parcels.shp')

Display the head of mydata.

What is the shape of mydata (i.e., how many rows and columns are there)?

This data looks really beautiful in its natural state. Go ahead and plot the data from the file first, before making any modifications.

After you plot the data in its natural blue condition, then (just like we did in Question 1) make a new column called mycolors, which you should make 'orange' by default.

For all of the rows in which the PROP_CITY column is equal to 'WEST LAFAYETTE', change the value of mycolors to 'green'.

For all of the rows in which the PROP_CITY column is equal to 'LAFAYETTE', change the value of mycolors to 'purple'.

Plot all of the boundaries in mydata using the orange, green, and purple colors that you just created, as follows: mydata.plot(color = mydata['mycolors'])

Deliverables
  • Display the head of mydata.

  • Give the shape of mydata (i.e., the number of rows and columns).

  • First plot the data in its natural blue condition.

  • Afterwards, plot the data again, but this time, displaying mydata using orange, green, and purple colors, as specified above.

  • Be sure to document your work from Question 3, using some comments and insights about your work.

Question 4 (2 pts)

Read the map data for Tippecanoe County parcels into a variable called mydata as follows:

mydata = gpd.read_file('/anvil/projects/tdm/data/tippecanoe/StreetCenterlines.shp')

Display the head of mydata.

What is the shape of mydata (i.e., how many rows and columns are there)?

Just like we did in Question 1, make a new column called mycolors, which you should make 'orange' by default.

For all of the rows in which the GEOCITYRIG column is equal to 'WEST LAFAYETTE', change the value of mycolors to 'green'.

For all of the rows in which the GEOCITYRIG column is equal to 'LAFAYETTE', change the value of mycolors to 'purple'.

Plot all of the roads in mydata using the orange, green, and purple colors that you just created, as follows: mydata.plot(color = mydata['mycolors'])

Deliverables
  • Display the head of mydata.

  • Give the shape of mydata (i.e., the number of rows and columns).

  • Plot mydata using orange, green, and purple colors, as specified above.

  • Be sure to document your work from Question 4, using some comments and insights about your work.

Question 5 (2 pts)

Go back to any of the 4 data sets from Questions 1, 3, 4, and make a plot of your own choosing, but instead of highlighting the maps according to the cities, this time (please) highlight something about the zip codes in the maps.

In the data from Question 1, the zip codes are stored in Post_Code and ESRI_ZIP and DLGF_PRO_1 and GEOZIP.

(The data from Question 2 does not have zip codes.)

In the data from Question 3, the zip codes are stored in PROP_ZIP AND DLGF_PRO_1 and ESRI_ZIP.

In the data from Question 4, the zip codes are stored in PostCode_L AND PostCode_R and TIGER_ZIPL and TIGER_ZIPR and ESRI_ZIP and GEOZIPLEFT and GEOZIPRIGH.

Deliverables
  • Make a map of your own choosing, highlighting something about the zip codes from 1 of the maps listed above, and using 1 of the zip code columns.

  • Be sure to document your work from Question 5, using some comments and insights about your work.

Submitting your Work

Please make sure that you added comments for each question, which explain your thinking about your method of solving each question. Please also make sure that your work is your own work, and that any outside sources (people, internet pages, generating AI, etc.) are cited properly in the project template.

Congratulations! Assuming you’ve completed all the above questions, you are learning to apply your web scraping knowledge effectively!

Prior to submitting your work, you need to put your work into the project template, and re-run all of the code in your Jupyter notebook and make sure that the results of running that code is visible in your template. Please check the detailed instructions on how to ensure that your submission is formatted correctly. To download your completed project, you can right-click on the file in the file explorer and click 'download'.

Once you upload your submission to Gradescope, make sure that everything appears as you would expect to ensure that you don’t lose any points. We hope your first project with us went well, and we look forward to continuing to learn with you on future projects!!

Items to submit
  • firstname_lastname_project8.ipynb

It is necessary to document your work, with comments about each solution. All of your work needs to be your own work, with citations to any source that you used. Please make sure that your work is your own work, and that any outside sources (people, internet pages, generating AI, etc.) are cited properly in the project template.

You must double check your .ipynb after submitting it in gradescope. A very common mistake is to assume that your .ipynb file has been rendered properly and contains your code, markdown, and code output even though it may not.

Please take the time to double check your work. See here for instructions on how to double check this.

You will not receive full credit if your .ipynb file does not contain all of the information you expect it to, or if it does not render properly in Gradescope. Please ask a TA if you need help with this.