Using R to create a word cloud from my PhD thesis text

Median Property Prices 2005-16 - PART 2!

A few weeks back I made a blog post with this nice little .gif below, of change over time in Median Melbourne Property Prices ($) from 2005-2016 - see my previous blog on 29 Sep 2016 :

Well I’ve just come back to looking at that data set and this time I’ve plotted the % change per annum and overall, and also absolute $ change from 2005-2016 on some interactive plots. These plots allow you to zoom in, hover over a suburb to see more info, or click on a suburb to open a new window and explore that suburb in more detail.

The R code I used to make the plots below is here.

Explore below, it’s interesting to see that SYNDAL has the greatest per annum and overall % growth, however it’s TOORAK that has by far has the highest absolute $ growth over the same period of time.

The ggiraph package gives ggplot2 nice reactivity to user input

Using the plotly package to give your ggplot2 plots simple reactivity to user input

First make up some fake revenue data for a company with a number of shops operating in each State from 2012 to 2015:

### Install/load required packages
#List of R packages required for this analysis:
required_packages <- c("ggplot2", "stringr", "plotly", "dplyr")
#Install required_packages:
new.packages <- required_packages[!(required_packages %in% installed.packages()[,"Package"])]
if(length(new.packages)) install.packages(new.packages)
#Load required_packages:
lapply(required_packages, require, character.only = TRUE)

#Set decimal points and disable scientific notation
options(digits=3, scipen=999) 

#Make up some fake data
df<-data_frame(state=rep(c("New South Wales", 
                 "Victoria", 
                 "Queensland",
                 "Western Australia",
                 "South Australia",
                 "Tasmania"), 36)) %>%
    group_by(state) %>%
    mutate(year=c(rep(2012, 9), rep(2013,9),rep(2014, 9),rep(2015, 9))) %>%
    group_by(state, year) %>%
    mutate(`store ID` = str_c("shop_#",as.character(seq_along(state)))) %>%
    group_by(state, year, `store ID`) %>%
    mutate(`Revenue ($)` =  ifelse(state=="New South Wales", sample(x=c(1000000:9000000), 1),
                            ifelse(state=="Victoria", sample(x=c(1000000:7000000), 1),
                            ifelse(state=="Queensland", sample(x=c(1000000:5000000), 1),
                            ifelse(state=="Western Australia",sample(x=c(100000:2000000), 1),
                            ifelse(state=="South Australia",sample(x=c(100000:900000), 1),       
                            ifelse(state=="Tasmania", sample(x=c(100000:2000000), 1), NA)))))))

Now visualise this data using ggplot:

ggplot(df, aes(state, `Revenue ($)`, colour=state, label = `store ID`)) +
    geom_boxplot() + 
    geom_point() +
    theme(axis.title.x =  element_blank(),
          axis.text.x  =  element_blank(), 
          axis.title.y = element_text(face="bold", size=12),
          axis.text.y  = element_text(angle=0, vjust=0.5, size=11),
          legend.title = element_text(size=12, face="bold"),
          legend.text = element_text(size = 12, face = "bold"),
          plot.title = element_text(face="bold", size=14)) + 
    ggtitle("Store Revenue per State from 2012 to 2015") +
    facet_wrap(~year)

Now make the plot reactive to the user’s mouse by wrapping plotly’s ggplotly() function around it:

p<-ggplotly(ggplot(df, aes(state, `Revenue ($)`, colour=state, label = `store ID`)) +
    geom_boxplot() + 
    geom_point() +
    theme(axis.title.x =  element_blank(),
          axis.text.x  =  element_blank(), 
          axis.title.y = element_text(face="bold", size=12),
          axis.text.y  = element_text(angle=0, vjust=0.5, size=10),
          legend.title = element_text(size=12, face="bold"),
          legend.text = element_text(size = 12, face = "bold"))+
    facet_wrap(~year))


##Publish to plotly
# plotly_POST(p, filename = "dans_plotly_example")

This type of simple plot made using plotly and ggplot2 in R are great because they have some basic “reactivity” to user input, (e.g. hover mouse over data point and lable appears with info. about data point like “store ID”” for example), but they do not need to be hosted on a server - they are simple enough to be knitted into a stand-alone HTML document.

Median Melbourne Property Prices ($) from 2005-2016

I stumbled across an interesting raw dataset from Victorian Government, Australia today, It had house and apartment prices in Melbourne, Australia from 2005-March 2016 http://www.dtpli.vic.gov.au/property-and-land-titles/property-information/property-prices

Thought it worth a look…

So I wrote some R code to import the data from excel spreadsheet, tidy it, and then make this animated plot which I think is cool because you can get some insight from it in a much faster than just looking through the raw data in an excel spreadsheet. I thought it was interesting how much faster Houses are going up in value compared to Apartments. Also interesting how most of the price increases are in on the SouthEast side of Melbourne. I will drill into this dataset further when I get time, zooming in to certain suburbs, plotting vacant land over time (also included in the raw data), etc etc.

The R code I used to make the plot below is here

Click the plot to enlarge