I recently started playing Pokémon again - “Pokémon Let’s Go Eevee” on the Nintendo Switch to be specific. In the classic Pokémon games, you have a team of 6 Pokémon that you use to battle against other trainers. In battles, type match-ups are very important, as some types of moves are “super effective” against other types. For example, fire moves are super effective against grass Pokémon, which means they do double the damage they normally would. If you can set your team up so that you’re always optimally matched, you’re going to have a much easier time.

But there are 18 types and you only get 6 Pokémon on your team. This leads to the question - what are the combinations of 6 types that make you super effective against the most types of Pokémon? [1] It turns out this is a question a lot of people have asked.

I knew there was a chart out there that matches up every attacking type against every defending type and tells you whether the attack is super effective, normal, not very effective, or has no effect. So I decided to use my R skills to answer this question (many thanks to my brother David Robinson for his guidance at various points). Along the way, we’ll do a quick exploratory analysis, learn about combinatorics, and leave the tidyverse to use matrices and some base functions.

Data Exploration

I found a CSV of the Pokémon type chart on GitHub. Using read_csv() on the URL didn’t work, and rather than try to debug it, I decided to “cheat” and use the magic datapasta package. On GitHub, I clicked to edit the file, copied everything in it, and then used tribble_paste(), which turned the contents of my clipboard into code that creates a tibble, which I called type_comparisons.

[Added 8/26]: As Jim Hester kindly pointed out, read_csv() will work if I use it on the raw link, generated by clicking the “Raw” button.

library(tidyverse)
# this didn't work
# type_comparisons <- read_csv("https://github.com/robinsones/pokemon-chart/blob/master/chart.csv")
library(datapasta)
# use tribble_paste()
type_comparisons <- tibble::tribble(
     ~Attacking, ~Normal, ~Fire, ~Water, ~Electric, ~Grass, ~Ice, ~Fighting, ~Poison, ~Ground, ~Flying, ~Psychic, ~Bug, ~Rock, ~Ghost, ~Dragon, ~Dark, ~Steel, ~Fairy,
       "Normal",       1,     1,      1,         1,      1,    1,         1,       1,       1,       1,        1,    1,   0.5,      0,       1,     1,    0.5,      1,
         "Fire",       1,   0.5,    0.5,         1,      2,    2,         1,       1,       1,       1,        1,    2,   0.5,      1,     0.5,     1,      2,      1,
        "Water",       1,     2,    0.5,         1,    0.5,    1,         1,       1,       2,       1,        1,    1,     2,      1,     0.5,     1,      1,      1,
     "Electric",       1,     1,      2,       0.5,    0.5,    1,         1,       1,       0,       2,        1,    1,     1,      1,     0.5,     1,      1,      1,
        "Grass",       1,   0.5,      2,         1,    0.5,    1,         1,     0.5,       2,     0.5,        1,  0.5,     2,      1,     0.5,     1,    0.5,      1,
          "Ice",       1,   0.5,    0.5,         1,      2,  0.5,         1,       1,       2,       2,        1,    1,     1,      1,       2,     1,    0.5,      1,
     "Fighting",       2,     1,      1,         1,      1,    2,         1,     0.5,       1,     0.5,      0.5,  0.5,     2,      0,       1,     2,      2,    0.5,
       "Poison",       1,     1,      1,         1,      2,    1,         1,     0.5,     0.5,       1,        1,    1,   0.5,    0.5,       1,     1,      0,      2,
       "Ground",       1,     2,      1,         2,    0.5,    1,         1,       2,       1,       0,        1,  0.5,     2,      1,       1,     1,      2,      1,
       "Flying",       1,     1,      1,       0.5,      2,    1,         2,       1,       1,       1,        1,    2,   0.5,      1,       1,     1,    0.5,      1,
      "Psychic",       1,     1,      1,         1,      1,    1,         2,       2,       1,       1,      0.5,    1,     1,      1,       1,     0,    0.5,      1,
          "Bug",       1,   0.5,      1,         1,      2,    1,       0.5,     0.5,       1,     0.5,        2,    1,     1,    0.5,       1,     2,    0.5,    0.5,
         "Rock",       1,     2,      1,         1,      1,    2,       0.5,       1,     0.5,       2,        1,    2,     1,      1,       1,     1,    0.5,      1,
        "Ghost",       0,     1,      1,         1,      1,    1,         1,       1,       1,       1,        2,    1,     1,      2,       1,   0.5,      1,      1,
       "Dragon",       1,     1,      1,         1,      1,    1,         1,       1,       1,       1,        1,    1,     1,      1,       2,     1,    0.5,      0,
         "Dark",       1,     1,      1,         1,      1,    1,       0.5,       1,       1,       1,        2,    1,     1,      2,       1,   0.5,      1,    0.5,
        "Steel",       1,   0.5,    0.5,       0.5,      1,    2,         1,       1,       1,       1,        1,    1,     2,      1,       1,     1,    0.5,      2,
        "Fairy",       1,   0.5,      1,         1,      1,    1,         2,     0.5,       1,       1,        1,    1,     1,      1,       2,     2,    0.5,      1
     )

To make it easier to explore the data, I’m going to start by tidying it.

tidied_comparison <- type_comparisons %>%
  gather(Defending, outcome, -Attacking)

tidied_comparison %>%
  slice(103:109) %>%
  knitr::kable()
|Attacking |Defending | outcome|
|:---------|:---------|-------:|
|Rock      |Ice       |       2|
|Ghost     |Ice       |       1|
|Dragon    |Ice       |       1|
|Dark      |Ice       |       1|
|Steel     |Ice       |       2|
|Fairy     |Ice       |       1|
|Normal    |Fighting  |       1|

We now have a dataset of 324 rows (18 attacking types times 18 defending types), one for each Attacking-Defending combination and its outcome. Here, outcome is 2 if the attack is super effective (what we’re interested in), 1 if normal, 0.5 if not very effective, and 0 if it has no effect.
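As an aside, gather() still works but has been superseded in tidyr by pivot_longer(). Here is a minimal sketch of the same reshaping on a tiny two-type chart; the toy_chart data is a hypothetical cut-down example, not the full table. One difference to keep in mind: pivot_longer() returns rows grouped by the original row rather than by column, so positional calls like slice() would pick out different rows.

```r
library(tidyr)

# Hypothetical two-type chart, in the same wide layout as type_comparisons
toy_chart <- tibble::tribble(
  ~Attacking, ~Fire, ~Water,
  "Fire",       0.5,    0.5,
  "Water",        2,    0.5
)

# One row per Attacking-Defending pair
long <- pivot_longer(
  toy_chart,
  cols = -Attacking,
  names_to = "Defending",
  values_to = "outcome"
)

nrow(long)  # 2 attacking types x 2 defending types
long
```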

What types are super effective against the most other types?

tidied_comparison %>%
  group_by(Attacking) %>%
  summarize(nb_super_effective = sum(outcome == 2)) %>%
  arrange(desc(nb_super_effective)) %>%
  knitr::kable()
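|Attacking | nb_super_effective|
|:---------|------------------:|
|Fighting  |                  5|
|Ground    |                  5|
|Fire      |                  4|
|Ice       |                  4|
|Rock      |                  4|
|Bug       |                  3|
|Fairy     |                  3|
|Flying    |                  3|
|Grass     |                  3|
|Steel     |                  3|
|Water     |                  3|
|Dark      |                  2|
|Electric  |                  2|
|Ghost     |                  2|
|Poison    |                  2|
|Psychic   |                  2|
|Dragon    |                  1|
|Normal    |                  0|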

Fighting and Ground are both super effective against 5 different types, while Normal isn’t super effective against any.
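The counting step relies on R treating TRUE as 1 in arithmetic, so summing a comparison counts the matches. Here is a small base R sketch of that idea; the toy data frame is made up for illustration, and tapply() is swapped in for group_by() and summarize() just to show the same logic outside the tidyverse.

```r
# Hypothetical toy data: one row per attacking/defending pair
toy <- data.frame(
  Attacking = c("Fire", "Fire", "Water", "Water"),
  Defending = c("Grass", "Water", "Fire", "Grass"),
  outcome   = c(2, 0.5, 2, 0.5)
)

# outcome == 2 is a logical vector; summing it counts the TRUE values
total_super_effective <- sum(toy$outcome == 2)

# The same count per attacking type, using base R's tapply()
nb_super_effective <- tapply(toy$outcome == 2, toy$Attacking, sum)
nb_super_effective
```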

Are there any defending types against which only one attacking type is super effective?

tidied_comparison %>%
  filter(outcome == 2) %>%
  add_count(Defending) %>%
  arrange(n) %>%
  head(4) %>%
  knitr::kable()
|Attacking |Defending | outcome|  n|
|:---------|:---------|-------:|--:|
|Fighting  |Normal    |       2|  1|
|Ground    |Electric  |       2|  1|
|Electric  |Water     |       2|  2|
|Grass     |Water     |       2|  2|

Yes - if we want to be super effective against Normal and Electric types, we need Fighting and Ground types respectively.

Building Pokémon Teams

The first step is to build out all the hypothetical teams of 6. If you remember your introduction to statistics days, this is a combinatorial problem: we have 18 options (although we don’t expect “Normal” to show up since it’s not super effective against anything), need to choose 6, and the order doesn’t matter (e.g. 1 to 6 is the same as 6 to 1). We can do this in R with the function combn:

all_combinations <- combn(18, 6)
dim(all_combinations)
## [1]     6 18564
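As a quick sanity check, the number of columns should match the binomial coefficient "18 choose 6". The sketch below computes it two ways; the variable names are just for illustration.

```r
n_types <- 18
team_size <- 6

# Binomial coefficient: the number of unordered teams of 6 drawn from 18 types
n_teams <- choose(n_types, team_size)
n_teams

# The same value from the factorial formula n! / (k! * (n - k)!)
n_teams_formula <- factorial(n_types) /
  (factorial(team_size) * factorial(n_types - team_size))
n_teams_formula
```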

all_combinations is a 6 by 18,564 matrix: each column is a different combination of types. For example, let’s look at the first two columns:

all_combinations[, 1:2]
##      [,1] [,2]
## [1,]    1    1
## [2,]    2    2
## [3,]    3    3
## [4,]    4    4
## [5,]    5    5
## [6,]    6    7

The first column is one team with the types 1 through 6, while the second is a team with 1 through 5 and 7.
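If the layout of the combn() result is hard to picture, a smaller case may help. This sketch passes a made-up vector of four type names instead of a number, so each column shows a team by name rather than by index.

```r
# Hypothetical pool of four types, choosing teams of two
types <- c("Fire", "Water", "Grass", "Electric")
pairs <- combn(types, 2)

dim(pairs)   # rows are team slots, columns are teams
pairs[, 1]   # the first team
pairs
```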

Now we need to take this and understand how many types each team is super effective against.

Matrix Magic

I originally was thinking of calling this post “Going back to the Base[ics],” since I’m moving out of the tidyverse and into the world of matrices, but there’s really nothing basic about this. Let’s walk through it step by step.

First, we’re going to take our table and make it a matrix. We can’t just do as.matrix() directly, as it will make the Attacking column the first column, while we want that to be the rownames, so we’ll do it in two steps.

m <- as.matrix(type_comparisons[, -1])
rownames(m) <- type_comparisons$Attacking

Next, because we only care about whether the entry is 2 or not, we’ll change every entry that’s a 2 to be 1 and every entry that’s not to be 0 (multiplying by 1L makes it 1 or 0 instead of TRUE or FALSE).

super_effective_m <- (m == 2) * 1L
super_effective_m
##          Normal Fire Water Electric Grass Ice Fighting Poison Ground
## Normal        0    0     0        0     0   0        0      0      0
## Fire          0    0     0        0     1   1        0      0      0
## Water         0    1     0        0     0   0        0      0      1
## Electric      0    0     1        0     0   0        0      0      0
## Grass         0    0     1        0     0   0        0      0      1
## Ice           0    0     0        0     1   0        0      0      1
## Fighting      1    0     0        0     0   1        0      0      0
## Poison        0    0     0        0     1   0        0      0      0
## Ground        0    1     0        1     0   0        0      1      0
## Flying        0    0     0        0     1   0        1      0      0
## Psychic       0    0     0        0     0   0        1      1      0
## Bug           0    0     0        0     1   0        0      0      0
## Rock          0    1     0        0     0   1        0      0      0
## Ghost         0    0     0        0     0   0        0      0      0
## Dragon        0    0     0        0     0   0        0      0      0
## Dark          0    0     0        0     0   0        0      0      0
## Steel         0    0     0        0     0   1        0      0      0
## Fairy         0    0     0        0     0   0        1      0      0
##          Flying Psychic Bug Rock Ghost Dragon Dark Steel Fairy
## Normal        0       0   0    0     0      0    0     0     0
## Fire          0       0   1    0     0      0    0     1     0
## Water         0       0   0    1     0      0    0     0     0
## Electric      1       0   0    0     0      0    0     0     0
## Grass         0       0   0    1     0      0    0     0     0
## Ice           1       0   0    0     0      1    0     0     0
## Fighting      0       0   0    1     0      0    1     1     0
## Poison        0       0   0    0     0      0    0     0     1
## Ground        0       0   0    1     0      0    0     1     0
## Flying        0       0   1    0     0      0    0     0     0
## Psychic       0       0   0    0     0      0    0     0     0
## Bug           0       1   0    0     0      0    1     0     0
## Rock          1       0   1    0     0      0    0     0     0
## Ghost         0       1   0    0     1      0    0     0     0
## Dragon        0       0   0    0     0      1    0     0     0
## Dark          0       1   0    0     1      0    0     0     0
## Steel         0       0   0    1     0      0    0     0     1
## Fairy         0       0   0    0     0      1    1     0     0
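To see what the multiplication by 1L is doing, here is a sketch on a tiny matrix. The toy_m values are a hypothetical two-by-two slice, not the full chart.

```r
# Hypothetical 2 x 2 chart: rows attack, columns defend
toy_m <- matrix(
  c(0.5, 2,     # column "Fire":  Fire -> 0.5, Water -> 2
    2,   0.5),  # column "Grass": Fire -> 2,   Water -> 0.5
  nrow = 2,
  dimnames = list(c("Fire", "Water"), c("Fire", "Grass"))
)

is_super <- toy_m == 2          # logical matrix of TRUE/FALSE
is_super_int <- is_super * 1L   # integer matrix of 1/0, dimnames kept

storage.mode(is_super)
storage.mode(is_super_int)
is_super_int
```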

The all_combinations matrix we created before is essentially a set of indices for the super_effective_m matrix. For example, column 1 of all_combinations contains the numbers 1 through 6, which means we want to get rows 1 through 6 of super_effective_m. Remember, each row of super_effective_m is an attacking type on our team, and each column is a defending type. We then want to take the sum of each column and count how many columns have a sum of more than 0, meaning at least one of our attacking types is super effective against that defending type. We’ll make a function, super_effective_nb:

super_effective_nb <- function(indices) {
  sum(colSums(super_effective_m[indices, ]) > 0)
}
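Here is the same idea worked through on a chart small enough to check by hand. This is a sketch under assumptions: toy_m is a hypothetical three-type chart, and coverage() is an illustrative variant that takes the matrix as an argument and adds drop = FALSE, so that selecting a single row still gives a matrix colSums() can handle.

```r
# Hypothetical 3-type chart: 1 means the row type is super effective
# against the column type
toy_m <- matrix(
  c(0, 1, 0,    # column "Fire":  only Water is super effective
    0, 0, 1,    # column "Water": only Grass is super effective
    1, 0, 0),   # column "Grass": only Fire is super effective
  nrow = 3,
  dimnames = list(c("Fire", "Water", "Grass"), c("Fire", "Water", "Grass"))
)

coverage <- function(indices, m) {
  # drop = FALSE keeps a one-row selection as a matrix
  sum(colSums(m[indices, , drop = FALSE]) > 0)
}

cover_two <- coverage(c(1, 2), toy_m)  # Fire + Water cover Grass and Fire
cover_one <- coverage(1, toy_m)        # Fire alone covers Grass
cover_all <- coverage(1:3, toy_m)      # all three types cover everything
c(cover_one, cover_two, cover_all)
```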

Now we can use apply() to get a vector, for all 18k+ teams, of how many types they’re super effective against. If you’re not familiar with apply(), the first argument is what we’re applying our function to, the second is whether it should apply to the rows or columns (we choose 2 for column, since each column is the team), and the third is the function.

super_effective_results <- apply(all_combinations, 2, super_effective_nb)
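If apply() with a margin of 2 is new to you, this small sketch may help. It uses a toy set of teams and plain sum() in place of our function, and also shows vapply() as one possible alternative that states the expected return type up front.

```r
# Toy example: all teams of 2 from 3 types, as index columns
teams <- combn(3, 2)   # columns are (1,2), (1,3), (2,3)

# MARGIN = 2 calls the function once per column
team_totals <- apply(teams, 2, sum)
team_totals

# Alternative: vapply() over column numbers, declaring an integer result
team_totals_checked <- vapply(
  seq_len(ncol(teams)),
  function(j) sum(teams[, j]),
  integer(1)
)
team_totals_checked
```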

What are the combinations that are super effective against the maximum number of types possible?

which(super_effective_results == max(super_effective_results))
##  [1] 14323 14325 15610 15612 16454 16459 16852 16854 16989 16994

We see there are 10 possible combinations of six types. Let’s take a look at them by getting those columns from all_combinations.

best_combos <- all_combinations[, super_effective_results == max(super_effective_results)]
best_combos
##      [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
## [1,]    4    4    5    5    5    5    6    6    6     6
## [2,]    6    6    6    6    8    8    7    7    7     7
## [3,]    7    7    7    7    9    9    8    8    9     9
## [4,]    9    9    9    9   13   13    9    9   10    10
## [5,]   10   10   10   10   14   16   10   10   14    16
## [6,]   14   16   14   16   18   18   14   16   17    17
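The two selection steps, which() for positions and a logical vector for columns, can be sketched on made-up numbers. The scores below are hypothetical. The drop = FALSE is an extra precaution I'd suggest if there might be only one winning team, since a single selected column would otherwise collapse into a plain vector.

```r
teams <- combn(4, 2)               # 6 hypothetical teams of 2
scores <- c(1, 3, 2, 3, 0, 1)      # made-up score for each team

# Positions of the teams that tie for the top score
best <- which(scores == max(scores))
best

# A logical vector selects the matching columns directly
best_teams <- teams[, scores == max(scores), drop = FALSE]
best_teams
```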

We now have a matrix, best_combos, where each column is a team. For example, we see that a team of types 4, 6, 7, 9, 10, and 14 covers the maximum number of defending types. But what is type 4? To answer that, we take the row names from super_effective_m and index them by best_combos.

rownames(super_effective_m)[best_combos]
##  [1] "Electric" "Ice"      "Fighting" "Ground"   "Flying"   "Ghost"   
##  [7] "Electric" "Ice"      "Fighting" "Ground"   "Flying"   "Dark"    
## [13] "Grass"    "Ice"      "Fighting" "Ground"   "Flying"   "Ghost"   
## [19] "Grass"    "Ice"      "Fighting" "Ground"   "Flying"   "Dark"    
## [25] "Grass"    "Poison"   "Ground"   "Rock"     "Ghost"    "Fairy"   
## [31] "Grass"    "Poison"   "Ground"   "Rock"     "Dark"     "Fairy"   
## [37] "Ice"      "Fighting" "Poison"   "Ground"   "Flying"   "Ghost"   
## [43] "Ice"      "Fighting" "Poison"   "Ground"   "Flying"   "Dark"    
## [49] "Ice"      "Fighting" "Ground"   "Flying"   "Ghost"    "Steel"   
## [55] "Ice"      "Fighting" "Ground"   "Flying"   "Dark"     "Steel"

This gets us a character vector, though. It’s in order, so we know that the first six are team 1, the second six team 2, etc., but it’s not displayed very well. We can use matrix() to turn this into a matrix instead, specifying that we want 6 rows.

strongest_teams <- matrix(rownames(super_effective_m)[best_combos], nrow = 6)
strongest_teams
##      [,1]       [,2]       [,3]       [,4]       [,5]     [,6]    
## [1,] "Electric" "Electric" "Grass"    "Grass"    "Grass"  "Grass" 
## [2,] "Ice"      "Ice"      "Ice"      "Ice"      "Poison" "Poison"
## [3,] "Fighting" "Fighting" "Fighting" "Fighting" "Ground" "Ground"
## [4,] "Ground"   "Ground"   "Ground"   "Ground"   "Rock"   "Rock"  
## [5,] "Flying"   "Flying"   "Flying"   "Flying"   "Ghost"  "Dark"  
## [6,] "Ghost"    "Dark"     "Ghost"    "Dark"     "Fairy"  "Fairy" 
##      [,7]       [,8]       [,9]       [,10]     
## [1,] "Ice"      "Ice"      "Ice"      "Ice"     
## [2,] "Fighting" "Fighting" "Fighting" "Fighting"
## [3,] "Poison"   "Poison"   "Ground"   "Ground"  
## [4,] "Ground"   "Ground"   "Flying"   "Flying"  
## [5,] "Flying"   "Flying"   "Ghost"    "Dark"    
## [6,] "Ghost"    "Dark"     "Steel"    "Steel"
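The round trip works because R stores matrices column by column: indexing a plain vector with a matrix reads the indices in column order, and matrix() fills its result in column order too. A minimal sketch with made-up names and indices:

```r
type_names <- c("Fire", "Water", "Grass", "Electric")

# Hypothetical index matrix: column 1 is team (1, 2), column 2 is team (3, 4)
idx <- matrix(c(1, 2, 3, 4), nrow = 2)

# Indexing a vector by a matrix returns a flat vector, read column by column
flat <- type_names[idx]
flat

# matrix() refills column by column, so each team is a column again
teams <- matrix(flat, nrow = 2)
teams
```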

For our final step, we’re actually going to make this a tibble, so I can look at which types appear the most often across the different team possibilities.

strongest_teams %>%
  as_tibble() %>%
  gather(team, type) %>%
  count(type, sort = TRUE) %>%
  knitr::kable()
|type     |  n|
|:--------|--:|
|Ground   | 10|
|Fighting |  8|
|Flying   |  8|
|Ice      |  8|
|Dark     |  5|
|Ghost    |  5|
|Grass    |  4|
|Poison   |  4|
|Electric |  2|
|Fairy    |  2|
|Rock     |  2|
|Steel    |  2|

We see that all 10 of the teams need a Ground type, while Fighting, Flying, and Ice each appear on 8. On the other hand, Electric, Fairy, Rock, and Steel are each used by only two teams.
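If you would rather stay in base R for this last count, table() can tally the entries of a character matrix directly. This is a sketch on a hypothetical matrix of three small teams, not the real results.

```r
# Hypothetical matrix of three 2-type teams (one team per column)
toy_teams <- matrix(
  c("Ground", "Ice",
    "Ground", "Dark",
    "Ground", "Ice"),
  nrow = 2
)

# table() counts every entry; sort() puts the most common type first
type_counts <- sort(table(toy_teams), decreasing = TRUE)
type_counts
```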

Conclusion

While this is a bit of a silly use case, the code we walked through and lessons learned could be applied to a lot of different projects. When I advise people to make a portfolio of data science projects if they’re looking for a job, I sometimes get asked, “But how do I find something to work on?” I recommend looking in your own life and interests for places where you could use data science. If you’re a runner and use an activity tracker, graph how your run distances and times are related to the weather. If you’re active on a subreddit, you could use the Reddit API to get the last 500 posts and do a text analysis. The possibilities are limitless!

I also played around with getting this code even faster and trying to do everything in a tidy way instead. Including those methods made the post run a little too long, so I may follow up with a part 2 of this post. To make this analysis more useful, I could also take into account how common types are - for example, only a few Pokémon have a Dragon type, so it’s less important to have types that are super effective against Dragon.

[1] Pokémon players will know that you can have more than 6 types on your team, both because some Pokémon have two types and because Pokémon can learn moves of other types (e.g. a Normal type Pokémon may be able to learn a Dark move). But for the purposes of this analysis I simplified it.