This tutorial was written using version 2.0.19 of the kohonen package. Some of the code will not work in the most recent version of this package. To install 2.0.19, run the following:
packageurl <- "https://cran.r-project.org/src/contrib/Archive/kohonen/kohonen_2.0.19.tar.gz"
install.packages(packageurl, repos = NULL, type = "source")
I hope to update all of the SOM tutorials to run properly on kohonen v3 in the near future.
Self-Organizing Maps (SOMs) are a tool for visualizing patterns in high-dimensional data by producing a two-dimensional representation that (hopefully) displays meaningful patterns in the higher-dimensional structure. SOMs are “trained” with the given data (or a sample of your data) in the following way:
The key feature this algorithm gives the SOM is that points that are close in the data space tend to be close in the SOM. SOMs may therefore be a good tool for representing clusters in your data.
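As a rough illustration of this kind of training, here is a minimal sketch of a standard online SOM update in base R. It is not the kohonen package's implementation: the function name train_som_sketch, the Gaussian neighborhood, and the linearly decaying learning rate and radius are all assumptions chosen for readability, and the data are random numbers.

```r
# Minimal online SOM sketch in base R (illustrative; NOT the kohonen implementation).
# train_som_sketch and all of its parameters are hypothetical names.
train_som_sketch <- function(data, xdim = 6, ydim = 4, n_iter = 2000,
                             alpha = c(0.5, 0.01)) {
  data <- as.matrix(data)
  # Grid coordinates of every cell, one row per cell.
  grid <- as.matrix(expand.grid(x = seq_len(xdim), y = seq_len(ydim)))
  n_units <- nrow(grid)
  # Initialize each cell's representative vector with a randomly chosen data row.
  init <- sample(nrow(data), n_units, replace = nrow(data) < n_units)
  codes <- data[init, , drop = FALSE]
  radius_start <- max(xdim, ydim) / 2
  for (t in seq_len(n_iter)) {
    frac <- (t - 1) / max(1, n_iter - 1)                  # 0 at the start, 1 at the end
    lr <- alpha[1] + frac * (alpha[2] - alpha[1])         # learning rate decays linearly
    radius <- radius_start + frac * (0.5 - radius_start)  # neighborhood shrinks to 0.5
    x <- data[sample(nrow(data), 1), ]                    # one random training vector
    # Best matching unit (BMU): the cell whose vector is closest to x.
    diffs <- sweep(codes, 2, x)                           # codes - x, row by row
    bmu <- which.min(rowSums(diffs^2))
    # Gaussian weight based on grid distance to the BMU.
    grid_d2 <- rowSums(sweep(grid, 2, grid[bmu, ])^2)
    h <- exp(-grid_d2 / (2 * radius^2))
    # Pull each cell's vector toward x, more strongly near the BMU.
    codes <- codes - lr * h * diffs
  }
  list(codes = codes, grid = grid)
}

set.seed(1)
toy <- scale(matrix(rnorm(300), ncol = 3))  # made-up data: 100 rows, 3 variables
fit <- train_som_sketch(toy)
dim(fit$codes)  # one representative vector per grid cell
```

Because neighboring cells are pulled toward the same training vectors, cells that sit close together on the grid end up with similar representative vectors, which is where the "close in the data space, close in the SOM" behavior comes from.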
library(kohonen)       # SOM training and plotting
library(RColorBrewer)  # color palettes
The kohonen package allows for quick creation of some basic SOMs in R. Our examples below use player statistics from the 2015/16 NBA season. We will look at player stats per 36 minutes played, so variation in playing time is somewhat controlled for. These data are available at http://www.basketball-reference.com/. We’ve already cleaned the data. The kohonen functions require numeric fields with no missing entries.
library(RCurl)
NBA <- read.csv(text = getURL("https://raw.githubusercontent.com/clarkdatalabs/soms/master/NBA_2016_player_stats_cleaned.csv"),
                sep = ",", header = TRUE, check.names = FALSE)
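Because the kohonen functions expect numeric columns with no missing values, it can be worth checking the chosen columns before training. The helper below is a hypothetical convenience function (check_som_input is not part of kohonen), shown on a made-up data frame rather than the NBA data.

```r
# Hypothetical helper (not part of kohonen): stops if any chosen column
# is absent, non-numeric, or contains missing values.
check_som_input <- function(df, cols) {
  missing_cols <- setdiff(cols, colnames(df))
  if (length(missing_cols) > 0) {
    stop("Columns not found: ", paste(missing_cols, collapse = ", "))
  }
  is_num <- vapply(df[cols], is.numeric, logical(1))
  if (!all(is_num)) {
    stop("Non-numeric columns: ", paste(cols[!is_num], collapse = ", "))
  }
  has_na <- vapply(df[cols], anyNA, logical(1))
  if (any(has_na)) {
    stop("Columns with missing values: ", paste(cols[has_na], collapse = ", "))
  }
  invisible(TRUE)
}

# Made-up example data; check.names = FALSE lets "3PA" keep its name.
toy <- data.frame(FTA = c(2.1, 4.0, NA), "3PA" = c(1.5, 0.2, 3.3),
                  Pos = c("PG", "C", "SF"), check.names = FALSE)
check_som_input(toy, "3PA")                 # passes silently
try(check_som_input(toy, c("FTA", "Pos")))  # reports the non-numeric column
```

The same call could be made on the real data with a vector of the column names you plan to pass to som().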
Before we create a SOM, we need to choose which variables we want to search for patterns in.
colnames(NBA)
## [1] "" "Player" "Pos" "Age" "Tm" "G" "GS"
## [8] "MP" "FG" "FGA" "FG%" "3P" "3PA" "3P%"
## [15] "2P" "2PA" "2P%" "FT" "FTA" "FT%" "ORB"
## [22] "DRB" "TRB" "AST" "STL" "BLK" "TOV" "PF"
## [29] "PTS"
We’ll start with some simple examples using shot attempts:
NBA.measures1 <- c("FTA", "2PA", "3PA")
NBA.SOM1 <- som(scale(NBA[NBA.measures1]), grid = somgrid(6, 4, "rectangular"))
plot(NBA.SOM1)
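The call to scale() above standardizes each column before training. As a small sketch of why that might matter, the made-up numbers below put two variables on very different scales; without standardizing, the distance between rows is dominated by the larger one. The data and variable names are invented for illustration.

```r
# Made-up data: two variables measured on very different scales.
toy <- cbind(a = c(1, 2, 3, 4), b = c(100, 300, 200, 400))

scaled <- scale(toy)          # subtract column means, divide by column sds
round(colMeans(scaled), 10)   # column means after scaling
apply(scaled, 2, sd)          # column standard deviations after scaling

# Euclidean distance between rows 1 and 2, before and after scaling.
dist(toy)[1]
dist(scaled)[1]
```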