Yu-Gi-Oh!, the card game of strategy, mind tricks, spells and traps.
Data Source - Yu-Gi-Oh! Cards
To start, we prepare our work area by loading the libraries and the data we will be using.
library(tidyverse)
Registered S3 methods overwritten by 'dbplyr':
method from
print.tbl_lazy
print.tbl_sql
── Attaching packages ─────────────── tidyverse 1.3.1 ──
✓ ggplot2 3.3.3 ✓ purrr 0.3.4
✓ tibble 3.1.1 ✓ dplyr 1.0.5
✓ tidyr 1.1.3 ✓ stringr 1.4.0
✓ readr 1.4.0 ✓ forcats 0.5.1
── Conflicts ────────────────── tidyverse_conflicts() ──
x dplyr::filter() masks stats::filter()
x dplyr::lag() masks stats::lag()
data <- read.csv('cards.csv')
head(data)
summary(data)
       id                name
 Min.   :    10000   Length:11183
 1st Qu.: 24466676   Class :character
 Median : 49511705   Mode  :character
 Mean   : 51356349
 3rd Qu.: 74892018
 Max.   :501000007

     type               desc                atk
 Length:11183       Length:11183       Min.   :   0
 Class :character   Class :character   1st Qu.: 800
 Mode  :character   Mode  :character   Median :1500
                                       Mean   :1464
                                       3rd Qu.:2100
                                       Max.   :5000
                                       NA's   :3797

      def           level            race
 Min.   :   0   Min.   : 0.000   Length:11183
 1st Qu.: 500   1st Qu.: 3.000   Class :character
 Median :1200   Median : 4.000   Mode  :character
 Mean   :1225   Mean   : 4.501
 3rd Qu.:1800   3rd Qu.: 6.000
 Max.   :5000   Max.   :13.000
 NA's   :4126   NA's   :4127

  attribute             scale          archetype
 Length:11183       Min.   : 0.000   Length:11183
 Class :character   1st Qu.: 2.000   Class :character
 Mode  :character   Median : 4.000   Mode  :character
                    Mean   : 4.381
                    3rd Qu.: 7.000
                    Max.   :13.000
                    NA's   :10905

    linkval      linkmarkers         image_url
 Min.   :1.000   Length:11183       Length:11183
 1st Qu.:2.000   Class :character   Class :character
 Median :2.000   Mode  :character   Mode  :character
 Mean   :2.383
 3rd Qu.:3.000
 Max.   :6.000
 NA's   :10854

 image_url_small      ban_tcg
 Length:11183       Length:11183
 Class :character   Class :character
 Mode  :character   Mode  :character

   ban_ocg            ban_goat
 Length:11183       Length:11183
 Class :character   Class :character
 Mode  :character   Mode  :character
A quick glance at the summary shows that the dataset contains 11,183 cards in total, covering both the TCG and the OCG.
unique(data$type)
[1] "Spell Card"
[2] "Effect Monster"
[3] "Normal Monster"
[4] "Flip Effect Monster"
[5] "Trap Card"
[6] "Union Effect Monster"
[7] "Fusion Monster"
[8] "Pendulum Effect Monster"
[9] "Link Monster"
[10] "XYZ Monster"
[11] "Synchro Tuner Monster"
[12] "Tuner Monster"
[13] "Synchro Monster"
[14] "Gemini Monster"
[15] "Normal Tuner Monster"
[16] "Spirit Monster"
[17] "Ritual Effect Monster"
[18] "Token"
[19] "Skill Card"
[20] "Ritual Monster"
[21] "Toon Monster"
[22] "Pendulum Normal Monster"
[23] "Synchro Pendulum Effect Monster"
[24] "Pendulum Tuner Effect Monster"
[25] "XYZ Pendulum Effect Monster"
[26] "Pendulum Effect Fusion Monster"
[27] "Pendulum Flip Effect Monster"
There are 27 distinct card types in the dataset.
unique(data$attribute)
[1] "" "EARTH" "WATER" "WIND" "DARK"
[6] "LIGHT" "FIRE" "DIVINE"
Monster cards have these attributes, while other types of cards do not.
data$attribute[data$attribute == ""] <- "NONE"
Cards with no attribute are stored as an empty string. For visualization purposes, I replaced those blanks with the readable label "NONE".
ggplot(data = data) +
  geom_bar(mapping = aes(type), stat = "count", fill = '#606c38', colour = '#283618') +
  theme(axis.text.x = element_text(angle = 50, vjust = 1, hjust = 1))
Effect Monsters are the most common card type by a large margin, with almost double the number of Spell Cards, the second most common type.
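The comparison read off the bar chart can also be checked numerically. The following is a minimal sketch, assuming a made-up `toy` data frame in place of the real `data`; the names `toy`, `type_counts` and `top_ratio` are illustrative and not part of the original analysis.

```r
library(dplyr)

# Made-up stand-in for the `data` frame loaded from cards.csv
toy <- data.frame(
  type = c("Effect Monster", "Effect Monster", "Effect Monster",
           "Spell Card", "Spell Card", "Trap Card")
)

# Count cards per type, most common first
type_counts <- count(toy, type, sort = TRUE)
print(type_counts)

# Ratio of the most common type to the runner-up
top_ratio <- type_counts$n[1] / type_counts$n[2]
print(top_ratio)
```

Running `count(data, type, sort = TRUE)` on the real data would give the exact counts behind each bar.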
ggplot(data = data) +
  geom_bar(mapping = aes(attribute), stat = "count", fill = '#dda15e', colour = '#bc6c25') +
  theme(axis.text.x = element_text(angle = 50, vjust = 1, hjust = 1))
Not counting spells, traps and tokens, the most common attribute in the game is DARK, followed by EARTH and then LIGHT. The remaining attributes are used less often, at roughly similar rates, and DIVINE appears only on a few special cards.
data$desc_length <- str_length(str_replace_all(data$desc, " ", ""))
Here we count the characters in each description, not including spaces.
mean(data$desc_length)
[1] 225.0753
Each card's description averages 225.08 characters, not counting spaces. English words average about 4.7 letters in length, so the typical card has roughly 225.08 / 4.7 ≈ 47.9 words. This is only an estimate, since the character count also includes punctuation and digits.
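The word estimate can be compared against a direct word count. The sketch below uses two made-up descriptions in place of `data$desc`; the 4.7 figure is the average word length assumed in the text, and the variable names are illustrative only.

```r
library(stringr)

# Made-up descriptions standing in for data$desc
desc <- c("Draw 2 cards.", "Destroy all monsters your opponent controls.")

# Estimate: characters without spaces, divided by an assumed 4.7 characters per word
chars_no_space <- str_length(str_replace_all(desc, " ", ""))
estimated_words <- chars_no_space / 4.7

# Direct count: each run of non-whitespace characters is treated as one word
counted_words <- str_count(desc, "\\S+")

print(data.frame(chars_no_space, estimated_words, counted_words))
```

On the real data, `mean(str_count(data$desc, "\\S+"))` would give the average word count without relying on an assumed word length.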
subset(data, desc_length == max(desc_length))
The Yu-Gi-Oh! card with the longest description has 73 words in its top portion and 107 words in its bottom text, adding up to 180 words in total.
Logically, we would expect a monster's Atk and Def to increase with its level.
ggplot(data = data, aes(y = level)) +
  geom_smooth(aes(x = atk, colour = "Attack")) +
  geom_smooth(aes(x = def, colour = "Defense")) +
  labs(
    x = "Atk and Def (0 - 5000)",
    y = "Monster's Level (0 - 13)",
    colour = "Legend"
  ) +
  ylim(0, NA)
The chart shows that higher-level monsters generally have higher Attack and Defense. This may seem obvious, but it is a balance that a card game can easily lose over many years of new releases.
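The trend in the chart can also be summarized as a single number. As a minimal sketch, the code below computes a Spearman rank correlation on a made-up `toy` data frame that stands in for the `level`, `atk` and `def` columns; the choice of Spearman is an assumption of this example, not something the original analysis used.

```r
# Made-up stand-in for the level, atk and def columns of `data`
toy <- data.frame(
  level = c(1, 3, 4, 4, 7, 8, NA),
  atk   = c(300, 1200, 1500, 1800, 2500, 3000, NA),
  def   = c(200, 1000, 1200, 1000, 2100, 2500, NA)
)

# Spearman rank correlation, dropping rows with missing values
cor_atk <- cor(toy$level, toy$atk, method = "spearman", use = "complete.obs")
cor_def <- cor(toy$level, toy$def, method = "spearman", use = "complete.obs")

print(c(attack = cor_atk, defense = cor_def))
```

A value close to 1 means that Atk or Def rises consistently with level. The `use = "complete.obs"` argument matters on the real data, where several thousand cards have no level, Atk or Def.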
In the end, despite all the years and all the cards created, Yu-Gi-Oh! appears to have stayed balanced, with only a few banned cards here and there.
One problem that has emerged, though, is that the growing number of Effect Monsters, combined with the large amount of text on the average card, could deter newer players from picking up the game.
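The remark about banned cards could be checked against the `ban_tcg`, `ban_ocg` and `ban_goat` columns. The sketch below tallies a made-up vector standing in for `data$ban_tcg`. The status labels ("Banned", "Limited") and the use of an empty string for unrestricted cards are assumptions, since the actual values in the dataset are not shown here.

```r
# Made-up stand-in for data$ban_tcg; the status labels are assumptions
ban_tcg <- c("", "", "Banned", "", "Limited", "", "", "Banned")

# Tally each status; blanks are cards with no restriction in this toy data
ban_counts <- table(ban_tcg)
print(ban_counts)

# Share of cards that carry any status at all
restricted_share <- mean(ban_tcg != "")
print(restricted_share)
```

On the real data, `table(data$ban_tcg)` would show which labels actually occur and how many cards carry each one.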