19 Creating Maps Showing the Evolution of World City Populations

This practical outlines how to write a loop that produces a series of maps from historical population data. The data has been sourced from http://www.nature.com/articles/sdata201634

First we need to load the packages that this practical requires. If a package fails to load, check that it is installed and run the library() function again.

#Load Packages 

library("ggplot2")
library("ggthemes")
library("rworldmap")
#> ### Welcome to rworldmap ###
#> For a short introduction type :   vignette('rworldmap')
library("classInt")
library("gridExtra")
library("grid")
library("cowplot")
#> 
#> ********************************************************
#> Note: As of version 1.0.0, cowplot does not change the
#>   default ggplot2 theme anymore. To recover the previous
#>   behavior, execute:
#>   theme_set(theme_cowplot())
#> ********************************************************
#> 
#> Attaching package: 'cowplot'
#> The following object is masked from 'package:ggthemes':
#> 
#>     theme_map
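
If one of the library() calls fails, it can help to check which packages are missing before installing anything. The following is a minimal sketch; the helper name missing_packages is hypothetical and is not part of the practical.

```r
# Hypothetical helper: return the packages from `pkgs` that are not installed
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

pkgs <- c("ggplot2", "ggthemes", "rworldmap", "classInt",
          "gridExtra", "grid", "cowplot")
to_install <- missing_packages(pkgs)
print(to_install)

# Uncomment to install anything that is missing, then re-run library()
# if (length(to_install) > 0) install.packages(to_install)
```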

Next, let's set our working directory and load in the data.

#Set your working directory 

#Load in Population Data
cities <- read.csv("worksheet_data/world_city_pop/alldata.csv")

If we take a brief look at the data using the head() function, we can get an idea of what it looks like. Unfortunately the coordinate columns are not stored as numbers, so we need to convert them to numeric before plotting.

#Make coordinates Numeric 
cities$X <- as.numeric(as.vector(cities$Longitude))
cities$Y <- as.numeric(as.vector(cities$Latitude))
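
The as.vector() step matters if the coordinates were read in as factors, which depends on your R version and read.csv() settings. The sketch below uses made-up longitude values (not taken from the dataset) to show the difference: calling as.numeric() directly on a factor returns its internal level codes, whereas converting to a plain vector first recovers the values.

```r
# Made-up longitudes stored as a factor, for illustration only
lon <- factor(c("-0.13", "116.4", "2.35"))

codes  <- as.numeric(lon)             # internal level codes, not coordinates
values <- as.numeric(as.vector(lon))  # the coordinates themselves

print(codes)
print(values)
```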

For the maps we will create, we need a basemap, onto which we can plot the population points. To do so, we use the rworldmap package.

#Get Basemap 
world <- fortify((getMap(resolution = "low")))

#Create the basemap 
base <- ggplot() +
  geom_map(data = world, map = world, aes(x = long, y = lat, map_id = id, group = group), fill = "#CCCCCC") +
  theme_map()

We will now prepare the data so that it is in the appropriate format for the loop. In this example, we want a map showing the population of each city in each century. We therefore create a century column, which the loop uses to aggregate population counts at the century level. The century_name column lets us give each map a title with the appropriate century. We also create a vector called size_breaks, which we use when plotting each map to set the breaks in the population bubble sizes.

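#Create New Column for Centuries & Century Names 
#century: the year with its last two digits dropped (e.g. -500 becomes -5, 1950 becomes 19)
cities$century <- as.numeric(substr(cities$year,1,nchar(cities$year)-2))
#Years with fewer than three digits give NA; assign them to century 0
cities$century[is.na(cities$century)] <- 0
#century_name: the unsigned start year of the century, used in the map titles (e.g. 500, 1900)
cities$century_name <- as.numeric(paste0(substr(abs(cities$year),1,nchar(abs(cities$year))-2), "00"))
cities$century[which(cities$century == 00)] <- 0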
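
To see what these lines produce, here is the same logic applied to a handful of made-up years (these are illustrative values, not rows from the dataset). The suppressWarnings() call is an addition for this sketch only; it hides the coercion warning that appears when a short negative year leaves just a minus sign.

```r
# Made-up years for illustration
year <- c(-2250, -500, -50, 50, 800, 1950)

# Drop the last two digits to get the century; short years become NA, then 0
century <- suppressWarnings(as.numeric(substr(year, 1, nchar(year) - 2)))
century[is.na(century)] <- 0

# Unsigned start year of the century, as used in the map titles
century_name <- as.numeric(paste0(substr(abs(year), 1, nchar(abs(year)) - 2), "00"))

print(data.frame(year, century, century_name))
```

Note that both -50 and 50 end up in century 0, so years on either side of the BC/AD boundary share a single map.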

#Size Breaks 
size_breaks <- c(10000, 50000, 100000, 500000, 1000000, 5000000, 10000000)
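
The legend labels for these breaks are typed by hand later in the practical. As an optional alternative, they could be derived from the breaks themselves. This is only a sketch: pop_label is a hypothetical helper, and it does not add the "+" that the practical uses on the final label.

```r
size_breaks <- c(10000, 50000, 100000, 500000, 1000000, 5000000, 10000000)

# Hypothetical helper: abbreviate populations as thousands (K) or millions (M)
pop_label <- function(x) {
  ifelse(x >= 1e6, paste0(x / 1e6, "M"), paste0(x / 1e3, "K"))
}

print(pop_label(size_breaks))
```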

Now we can write the loop, which saves a series of maps to an output folder. Before running the loop, it is essential that an ‘outputs’ folder exists in your working directory, so that each iteration of the loop can save a new map into it. The loop plots the most recent population that we have available for each city: if there is no population data for the century in question, the last recorded population is kept and mapped.
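
If you would rather create the folder from R than by hand, a minimal sketch is shown below. It assumes your working directory is already set to where you want the maps saved.

```r
# Create the 'outputs' folder in the working directory if it is not there yet
out_dir <- "outputs"
if (!dir.exists(out_dir)) {
  dir.create(out_dir)
}
```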

count <- 1
#Loop over the centuries in chronological order so that older values are carried forward
for(i in sort(unique(cities$century))){
  
  if (count==1)
    {
      Data<-cities[which(cities$century==i),]
      Data <- aggregate(Data$pop, by=list(Data$City, Data$X, Data$Y), FUN="mean")
      names(Data)<-c("City","X","Y","pop")
    }else{
      New_Data<-cities[which(cities$century==i),]
      #replace old rows
      #Full join on longitude (X): cities with no record this century get NA in pop.y
      Data2<-merge(Data, New_Data, by="X", all=TRUE)
      
      New_Vals<-Data2[!is.na(Data2$pop.y),c("City.y","X","Y.y","pop.y")]
      names(New_Vals)<-c("City","X","Y","pop")
      
      Old_Vals<-Data2[is.na(Data2$pop.y),c("City.x","X","Y.x","pop.x")]
      names(Old_Vals)<-c("City","X","Y","pop")
      
      Data<-rbind(Old_Vals,New_Vals)
      
    }
  
  Data <- aggregate(Data$pop, by=list(Data$City, Data$X, Data$Y), FUN="mean")
  names(Data)<-c("City","X","Y","pop")
  
  #Take a single century name for the title (rows in the same century share it)
  century_name <- unique(cities$century_name[which(cities$century == i)])[1]
  
  if(i<0)
  {
    title <- paste0("Cities Population in the Year ",century_name," BC")
  }else{
    title <- paste0("Cities Population in the Year ",century_name," AD")
  }
  
  Map<-base+
    geom_point(data=Data, aes(x=X, y=Y, size=pop), colour="#9933CC",alpha=0.3, pch=20)+
    scale_size(breaks=size_breaks,range=c(2,14), limits=c(min(cities$pop), max(cities$pop)),labels=c("10K","50K","100K","500K","1M","5M","10M+"))+
    coord_map("mollweide", ylim=c(-55,80),xlim=c(-180,180))+
    theme(legend.position="bottom",
          legend.direction="horizontal",
          legend.justification="center",
          legend.title=element_blank(),
          plot.title=element_text(hjust=0.5, face="bold", family="Helvetica"))+
    ggtitle(title)+
    guides(size=guide_legend(nrow=1))
  
  png(paste0("outputs/",i,"_moll.png"), width=30,height=20, units="cm", res=150)
  print(Map)
  dev.off()
  
  count <- count + 1
}
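
The merge step is the part of the loop that carries old values forward, and it is easier to follow on a small example. The sketch below runs one update on a made-up data frame of two cities (the names, coordinates and populations are invented for illustration). City A has a new record in the second century, while city B does not and so keeps its earlier population.

```r
# Made-up data: city B has no record in century 2
toy <- data.frame(
  City    = c("A", "B", "A"),
  X       = c(10, 20, 10),
  Y       = c(50, 40, 50),
  pop     = c(1000, 5000, 3000),
  century = c(1, 1, 2)
)

Data     <- toy[toy$century == 1, c("City", "X", "Y", "pop")]
New_Data <- toy[toy$century == 2, c("City", "X", "Y", "pop")]

# Full join on X: unmatched rows from Data get NA in the .y columns
Data2 <- merge(Data, New_Data, by = "X", all = TRUE)

# Rows with a new population take the new values
New_Vals <- Data2[!is.na(Data2$pop.y), c("City.y", "X", "Y.y", "pop.y")]
names(New_Vals) <- c("City", "X", "Y", "pop")

# Rows without a new population keep the old values
Old_Vals <- Data2[is.na(Data2$pop.y), c("City.x", "X", "Y.x", "pop.x")]
names(Old_Vals) <- c("City", "X", "Y", "pop")

Data <- rbind(Old_Vals, New_Vals)
Data <- Data[order(Data$City), ]
print(Data)
```

Because the join is on longitude alone, two different cities that happen to share exactly the same X value would be matched to each other, which is worth keeping in mind when reading the loop.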

And there you go. You’ve created a loop which, with each iteration, creates a map of city populations for each century for which we have data. Make sure you understand what the loop is doing at each step. Could you draw a flow diagram for this loop? We can put all of the images that this loop produces into a GIF creator to produce an animation showing the evolution of city populations.
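
As an alternative to an online GIF creator, the animation could be assembled in R. The sketch below uses the magick package, which is not one of the packages loaded in this practical, so treat it as an assumption about your setup. The helper order_frames and the output file name city_populations.gif are hypothetical. The files need sorting by century number because an alphabetical listing would not put negative (BC) centuries in chronological order.

```r
# Hypothetical helper: sort map files by the century number at the start of the name
order_frames <- function(files) {
  centuries <- as.numeric(sub("_moll\\.png$", "", basename(files)))
  files[order(centuries)]
}

files <- order_frames(
  list.files("outputs", pattern = "_moll\\.png$", full.names = TRUE)
)

# Only build the animation if magick is installed and the loop has produced maps
if (requireNamespace("magick", quietly = TRUE) && length(files) > 0) {
  frames    <- magick::image_read(files)
  animation <- magick::image_animate(frames, fps = 2)  # two maps per second
  magick::image_write(animation, "outputs/city_populations.gif")
}
```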