TDM 20200: Mapping: Project 8 — Spring 2025
Dataset(s)
In this project we will use the following mapping data:
- /anvil/projects/tdm/data/tippecanoe/
Questions
Load Pandas and GeoPandas as follows:
import pandas as pd
import geopandas as gpd
Question 1 (2 pts)
Review the examples from Project 7, Questions 3, 4, 5, where we learned how to put colors into a map.
Read the map data for Tippecanoe County addresses into a variable called mydata as follows:
mydata = gpd.read_file('/anvil/projects/tdm/data/tippecanoe/AddressPoints.shp')
Using pd.set_option('display.max_columns', None) you will have the ability to see all of the columns.
Display the head of mydata.
What is the shape of mydata (i.e., how many rows and columns are there)?
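The display option, head, and shape steps can be tried on a tiny stand-in table before loading the full shapefile. This is only a sketch: the data frame `toy` and its values are made up for illustration, and the same calls should behave the same way on the GeoDataFrame that gpd.read_file returns.

```python
import pandas as pd

# Hypothetical stand-in for mydata; the real file has many more rows and columns.
toy = pd.DataFrame({
    "GEOCITY": ["LAFAYETTE", "WEST LAFAYETTE", "DAYTON", "LAFAYETTE"],
    "GEOZIP": ["47905", "47906", "47941", "47904"],
})

# Show every column when a data frame is displayed, instead of hiding the middle ones.
pd.set_option("display.max_columns", None)

# head() returns the first rows (at most 5 by default).
print(toy.head())

# shape is a (number of rows, number of columns) tuple.
n_rows, n_cols = toy.shape
print(n_rows, n_cols)
```

Unpacking the shape into two names makes it easy to report the row and column counts separately in your comments.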
Create a new column called mycolors, which you should make 'orange' by default.
For all of the rows in which the GEOCITY column is equal to 'WEST LAFAYETTE', change the value of mycolors to 'green'.
For all of the rows in which the GEOCITY column is equal to 'LAFAYETTE', change the value of mycolors to 'purple'.
Plot all of the address points in mydata using the orange, green, and purple colors that you just created, as follows: mydata.plot(color = mydata['mycolors'])
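The coloring steps above can be sketched on a small made-up table. The data frame `toy` and its city values are hypothetical; only the column name GEOCITY and the three colors come from the question. The sketch uses .loc with a boolean condition, which is one common way to change values in selected rows, not necessarily the approach used in Project 7.

```python
import pandas as pd

# Hypothetical stand-in for the address data.
toy = pd.DataFrame({
    "GEOCITY": ["LAFAYETTE", "WEST LAFAYETTE", "DAYTON", "LAFAYETTE"],
})

# Step 1: every row starts out orange.
toy["mycolors"] = "orange"

# Step 2: rows whose city is exactly 'WEST LAFAYETTE' become green.
toy.loc[toy["GEOCITY"] == "WEST LAFAYETTE", "mycolors"] = "green"

# Step 3: rows whose city is exactly 'LAFAYETTE' become purple.
# The comparison is an exact match, so 'WEST LAFAYETTE' rows are not affected.
toy.loc[toy["GEOCITY"] == "LAFAYETTE", "mycolors"] = "purple"

print(toy)
```

Because == tests for an exact match, the order of the green and purple steps does not matter here. On the real data, the resulting column is what gets passed to mydata.plot(color = mydata['mycolors']).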
- Display the head of mydata.
- Give the shape of mydata (i.e., the number of rows and columns).
- Plot all of the address points in mydata using orange, green, and purple colors, as specified above.
- Be sure to document your work from Question 1, using some comments and insights about your work.
Question 2 (2 pts)
Read the map data for Tippecanoe County boundaries into a variable called mydata as follows:
mydata = gpd.read_file('/anvil/projects/tdm/data/tippecanoe/Boundaries.shp')
Display the head of mydata.
What is the shape of mydata (i.e., how many rows and columns are there)? Notice that this data set is much smaller than the AddressPoints data from Question 1.
Notice that there is only 1 row in mydata that has Shape_Area bigger than 1 billion. In other words, if we create mymysteryDF = mydata[mydata['Shape_Area'] > 1000000000] then mymysteryDF is a data frame with just 1 row and 14 columns. Plot the region from mymysteryDF.
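The filtering step works by building a True/False mask and using it to keep only the matching rows. Here is a minimal sketch on invented numbers; the NAME column and all of the area values are hypothetical, and only the Shape_Area column name and the 1 billion threshold come from the question.

```python
import pandas as pd

# Hypothetical stand-in for the boundaries data.
toy = pd.DataFrame({
    "NAME": ["A", "B", "C"],
    "Shape_Area": [2.5e8, 1.3e9, 9.9e8],
})

# One True/False value per row: is the area bigger than 1 billion?
big = toy["Shape_Area"] > 1_000_000_000

# Summing a boolean column counts the True values.
print(big.sum())

# Indexing with the mask keeps only the rows where it is True.
mymysteryDF = toy[big]
print(mymysteryDF.shape)
```

Checking big.sum() before plotting is a quick way to confirm that exactly one row passes the threshold.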
- Display the head of mydata.
- Give the shape of mydata (i.e., the number of rows and columns).
- Plot the boundary of the 1 shape which has 'Shape_Area' larger than 1 billion.
- Be sure to document your work from Question 2, using some comments and insights about your work.
Question 3 (2 pts)
Read the map data for Tippecanoe County parcels into a variable called mydata as follows:
mydata = gpd.read_file('/anvil/projects/tdm/data/tippecanoe/Parcels.shp')
Display the head of mydata.
What is the shape of mydata (i.e., how many rows and columns are there)?
This data looks really beautiful in its natural state. Go ahead and plot the data from the file first, before making any modifications.
After you plot the data in its natural blue condition, then (just like we did in Question 1) make a new column called mycolors, which you should make 'orange' by default.
For all of the rows in which the PROP_CITY column is equal to 'WEST LAFAYETTE', change the value of mycolors to 'green'.
For all of the rows in which the PROP_CITY column is equal to 'LAFAYETTE', change the value of mycolors to 'purple'.
Plot all of the boundaries in mydata using the orange, green, and purple colors that you just created, as follows: mydata.plot(color = mydata['mycolors'])
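As an alternative to assigning the colors one city at a time, the same column can be built from a dictionary lookup. This is a sketch of a different technique from the step-by-step one described above, shown on made-up rows; the data frame `toy`, the dictionary `city_colors`, and the missing value are all hypothetical.

```python
import pandas as pd

# Hypothetical stand-in for the parcel data, including one missing city.
toy = pd.DataFrame({
    "PROP_CITY": ["LAFAYETTE", "WEST LAFAYETTE", None, "BATTLE GROUND"],
})

# Colors for the two highlighted cities.
city_colors = {"WEST LAFAYETTE": "green", "LAFAYETTE": "purple"}

# map() looks each city up in the dictionary; cities that are not in the
# dictionary (and missing values) come back as missing, so fillna()
# supplies the default color for them.
toy["mycolors"] = toy["PROP_CITY"].map(city_colors).fillna("orange")

print(toy)
```

The dictionary version keeps all of the color choices in one place, which can be convenient if you later decide to highlight more cities.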
- Display the head of mydata.
- Give the shape of mydata (i.e., the number of rows and columns).
- First plot the data in its natural blue condition.
- Afterwards, plot the data again, but this time, displaying mydata using orange, green, and purple colors, as specified above.
- Be sure to document your work from Question 3, using some comments and insights about your work.
Question 4 (2 pts)
Read the map data for Tippecanoe County street centerlines into a variable called mydata as follows:
mydata = gpd.read_file('/anvil/projects/tdm/data/tippecanoe/StreetCenterlines.shp')
Display the head of mydata.
What is the shape of mydata (i.e., how many rows and columns are there)?
Just like we did in Question 1, make a new column called mycolors, which you should make 'orange' by default.
For all of the rows in which the GEOCITYRIG column is equal to 'WEST LAFAYETTE', change the value of mycolors to 'green'.
For all of the rows in which the GEOCITYRIG column is equal to 'LAFAYETTE', change the value of mycolors to 'purple'.
Plot all of the roads in mydata using the orange, green, and purple colors that you just created, as follows: mydata.plot(color = mydata['mycolors'])
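Questions 1, 3, and 4 follow the same coloring recipe and differ only in the name of the city column, so the recipe can be wrapped in a small function. The function `add_city_colors` below is a hypothetical helper written for this sketch (it is not part of Pandas or GeoPandas), and the data frame `roads` holds made-up values.

```python
import pandas as pd

def add_city_colors(df, city_column):
    """Return a copy of df with a mycolors column based on city_column.

    Hypothetical helper: orange by default, green for WEST LAFAYETTE,
    purple for LAFAYETTE.
    """
    out = df.copy()  # leave the caller's data frame unchanged
    out["mycolors"] = "orange"
    out.loc[out[city_column] == "WEST LAFAYETTE", "mycolors"] = "green"
    out.loc[out[city_column] == "LAFAYETTE", "mycolors"] = "purple"
    return out

# Hypothetical stand-in for the street centerline data.
roads = pd.DataFrame({
    "GEOCITYRIG": ["WEST LAFAYETTE", "LAFAYETTE", "OTTERBEIN"],
})

colored = add_city_colors(roads, "GEOCITYRIG")
print(colored)
```

The same helper could be called with 'GEOCITY' or 'PROP_CITY' for the data from Questions 1 and 3. Working on a copy is a design choice that keeps the original data frame available for the plain, uncolored plot.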
- Display the head of mydata.
- Give the shape of mydata (i.e., the number of rows and columns).
- Plot mydata using orange, green, and purple colors, as specified above.
- Be sure to document your work from Question 4, using some comments and insights about your work.
Question 5 (2 pts)
Go back to any of the 3 data sets from Questions 1, 3, and 4, and make a plot of your own choosing, but instead of highlighting the maps according to the cities, this time (please) highlight something about the zip codes in the maps.
In the data from Question 1, the zip codes are stored in Post_Code, ESRI_ZIP, DLGF_PRO_1, and GEOZIP.
(The data from Question 2 does not have zip codes.)
In the data from Question 3, the zip codes are stored in PROP_ZIP, DLGF_PRO_1, and ESRI_ZIP.
In the data from Question 4, the zip codes are stored in PostCode_L, PostCode_R, TIGER_ZIPL, TIGER_ZIPR, ESRI_ZIP, GEOZIPLEFT, and GEOZIPRIGH.
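One possible way to highlight zip codes is to give each distinct zip code its own color. The sketch below does this on made-up rows. It assumes the zip codes are stored as strings, which may not be true of every column listed above, and the palette and the zip values are hypothetical choices.

```python
import pandas as pd

# Hypothetical stand-in for one of the zip code columns.
toy = pd.DataFrame({
    "GEOZIP": ["47906", "47905", "47906", "47901", "47909"],
})

# The distinct zip codes, in sorted order so the color assignment is repeatable.
zips = sorted(toy["GEOZIP"].unique())

# A small palette of named colors; it wraps around if there are more zip codes than colors.
palette = ["red", "blue", "gold", "teal", "gray"]
zip_colors = {z: palette[i % len(palette)] for i, z in enumerate(zips)}

# Look up each row's zip code to get its color.
toy["mycolors"] = toy["GEOZIP"].map(zip_colors)

print(zip_colors)
print(toy)
```

On the real data it is worth checking mydata['GEOZIP'].value_counts() first, because a column with many distinct zip codes will reuse colors from a short palette. GeoPandas can also choose the colors itself through mydata.plot(column='GEOZIP', categorical=True, legend=True), which is a reasonable alternative to building the color column by hand.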
- Make a map of your own choosing, highlighting something about the zip codes from 1 of the maps listed above, and using 1 of the zip code columns.
- Be sure to document your work from Question 5, using some comments and insights about your work.
Submitting your Work
Please make sure that you added comments for each question, which explain your thinking about your method of solving each question. Please also make sure that your work is your own work, and that any outside sources (people, internet pages, generative AI, etc.) are cited properly in the project template.
Congratulations! Assuming you’ve completed all the above questions, you are learning to apply your mapping knowledge effectively!
Prior to submitting your work, you need to put your work into the project template, re-run all of the code in your Jupyter notebook, and make sure that the results of running that code are visible in your template. Please check the detailed instructions on how to ensure that your submission is formatted correctly. To download your completed project, you can right-click on the file in the file explorer and click 'download'.
Once you upload your submission to Gradescope, make sure that everything appears as you would expect to ensure that you don’t lose any points. We hope this project went well, and we look forward to continuing to learn with you on future projects!
- firstname_lastname_project8.ipynb
It is necessary to document your work, with comments about each solution. All of your work needs to be your own work, with citations to any source that you used. Please take the time to double check your work. See here for instructions on how to double check this. You will not receive full credit if your ...