This tutorial was written using the `kohonen` package version 2.0.19. Some of the code will not work in the most recent version of this package. To install 2.0.19, run the following:

```
packageurl <- "https://cran.r-project.org/src/contrib/Archive/kohonen/kohonen_2.0.19.tar.gz"
install.packages(packageurl, repos = NULL, type = "source")
```

I hope to update all of the SOM tutorials to run properly on `kohonen` v3 in the near future.

Self-Organizing Maps (SOMs) are a tool for visualizing patterns in high-dimensional data by producing a two-dimensional representation that (hopefully) displays meaningful patterns from the higher-dimensional structure. SOMs are “trained” with the given data (or a sample of it) in the following way:

- The size of the map grid is defined.
- Each cell in the grid is assigned an initializing vector in the data space.
  - For example, if you are creating a map of a 22-dimensional space, each grid cell is assigned a representative 22-dimensional vector.
  - Initialization can either be random or follow a specific method.
- Data are repeatedly fed into the model to train it. Each time a training vector is entered, the following process is undertaken:
  - The grid cell whose representative vector is closest to the training vector is identified.
  - The representative vectors of all grid cells near the identified one are adjusted slightly towards the training vector.
- Several convergence parameters force the adjustments to get smaller and smaller as the training vectors are fed in many times, causing the map to stabilize into a representation.
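The training steps listed above can be sketched in a few lines of base R. This is only an illustration under stated assumptions, not how `kohonen` is implemented: the toy data, the linear schedules in `alphas` and `radii`, and the hard-edged neighbourhood are all choices made for this sketch.

```r
# Minimal SOM training sketch in base R (illustrative; not the kohonen implementation).
set.seed(1)

# Toy data (assumption): 200 points in a 3-dimensional space, scaled column-wise.
X <- scale(matrix(rnorm(200 * 3), ncol = 3))

# 1. Define the grid: a 6 x 4 rectangular map has 24 cells.
grid_xy <- as.matrix(expand.grid(x = 1:6, y = 1:4))
n_cells <- nrow(grid_xy)

# 2. Initialize one representative vector per cell by sampling rows of the data.
codes <- X[sample(nrow(X), n_cells), , drop = FALSE]

# 3. Feed the data in repeatedly while the learning rate and radius shrink.
n_epochs <- 50
alphas <- seq(0.05, 0.01, length.out = n_epochs)  # learning rate schedule (assumed)
radii  <- seq(3, 0.5, length.out = n_epochs)      # neighbourhood radius schedule (assumed)

for (epoch in seq_len(n_epochs)) {
  for (i in sample(nrow(X))) {
    x <- X[i, ]

    # Identify the cell whose representative vector is closest to x.
    d2  <- rowSums(sweep(codes, 2, x)^2)
    bmu <- which.min(d2)

    # Find the cells within the current radius of that cell on the grid.
    grid_d <- sqrt(rowSums(sweep(grid_xy, 2, grid_xy[bmu, ])^2))
    near   <- grid_d <= radii[epoch]

    # Move those cells' vectors a small step towards x.
    nb <- codes[near, , drop = FALSE]
    target <- matrix(x, nrow(nb), ncol(nb), byrow = TRUE)
    codes[near, ] <- nb + alphas[epoch] * (target - nb)
  }
}
```

After the loop, `codes` holds one trained representative vector per grid cell, in the same row order as `grid_xy`.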

The key feature this algorithm gives to the SOM is that points that are close in the data space are also close in the SOM. Thus SOMs may be a good tool for representing spatial clusters in your data.

```
library(kohonen)
library(RColorBrewer)
```

The `kohonen` package allows for quick creation of some basic SOMs in R. Our examples below will use player statistics from the 2015/16 NBA season. We will look at player stats per 36 minutes played, so variation in playing time is somewhat controlled for. These data are available at http://www.basketball-reference.com/. We’ve already cleaned the data. `kohonen` functions require numeric fields with no missing entries.
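If you are working with your own data, a quick check along these lines may help confirm that the input is numeric and complete. The `toy` data frame and its values are made up for this sketch; it is not the cleaning procedure used for the NBA file.

```r
# Hypothetical toy table standing in for a player data set.
toy <- data.frame(
  Player = c("A", "B", "C"),
  FTA    = c(4.1, NA, 2.3),
  `3PA`  = c(5.0, 1.2, 0.4),
  check.names = FALSE
)

# Keep only the numeric columns.
numeric_cols <- names(toy)[sapply(toy, is.numeric)]

# Drop rows with a missing value in any numeric column.
toy_clean <- toy[complete.cases(toy[numeric_cols]), ]

# The matrix handed to a SOM function should be numeric with no NAs.
som_input <- as.matrix(toy_clean[numeric_cols])
stopifnot(is.numeric(som_input), !anyNA(som_input))
```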

```
library(RCurl)
NBA <- read.csv(text = getURL("https://raw.githubusercontent.com/clarkdatalabs/soms/master/NBA_2016_player_stats_cleaned.csv"),
                sep = ",", header = TRUE, check.names = FALSE)
```

Before we create a SOM, we need to choose which variables we want to search for patterns in.

`colnames(NBA)`

```
## [1] "" "Player" "Pos" "Age" "Tm" "G" "GS"
## [8] "MP" "FG" "FGA" "FG%" "3P" "3PA" "3P%"
## [15] "2P" "2PA" "2P%" "FT" "FTA" "FT%" "ORB"
## [22] "DRB" "TRB" "AST" "STL" "BLK" "TOV" "PF"
## [29] "PTS"
```

We’ll start with some simple examples using shot attempts:

```
NBA.measures1 <- c("FTA", "2PA", "3PA")
NBA.SOM1 <- som(scale(NBA[NBA.measures1]), grid = somgrid(6, 4, "rectangular"))
plot(NBA.SOM1)
```
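The call above passes the selected columns through `scale()` before training. Because a SOM compares vectors by distance, a variable with a larger range would otherwise dominate the others. Here is a small sketch of what `scale()` does by default, using made-up numbers rather than the NBA data:

```r
# Hypothetical shot-attempt values on different scales.
shots <- data.frame(
  FTA   = c(2, 4, 6),
  `3PA` = c(0.5, 1.0, 1.5),
  check.names = FALSE
)

# scale() centers each column on its mean and divides by its standard deviation.
scaled <- scale(shots)

col_means <- colMeans(scaled)      # approximately 0 for every column
col_sds   <- apply(scaled, 2, sd)  # approximately 1 for every column
```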