Video Game Sales Data Analysis


As a video game enthusiast who lived in Japan for five years, I've always wanted to analyze worldwide video game sales trends.


David @ WiseData · July 10, 2023


Video Game Console

Introduction

Video games are not just a form of entertainment; they are also a huge and growing industry that generates billions of dollars in revenue every year. According to Statista, the global video game market revenue was estimated at almost 347 billion U.S. dollars in 2022, with the mobile gaming segment accounting for more than 70% of the total. The gaming industry is constantly evolving, with new technologies, platforms, genres, and business models emerging and changing the landscape.

In this post, I will analyze video game sales data from Kaggle, which covers over 16,000 games released from 1980 to 2016 across various regions and platforms, with the help of WiseData in a Jupyter notebook.

You can download the Jupyter notebook from HERE.

What is WiseData?

WiseData is a Python library that lets you transform and visualize data with natural language. It uses ChatGPT and cutting-edge natural language processing (NLP) and natural language understanding (NLU) techniques to understand your queries and generate charts that match your intent. You can use WiseData to transform data and create many kinds of charts, such as bar charts, line charts, scatter plots, histograms, box plots, and more.

Installing WiseData

1. Obtain an API Key

To use WiseData, you need an API key. Simply visit https://www.wisedata.app/ and enter your email address. The API key for the Python package will be sent to your email.

2. Installation

Using WiseData is super easy and fun. First, install the library using pip.

We will also install the additional libraries we need.

pip install wisedata
pip install pandas numpy matplotlib seaborn

3. Instantiation

Then you can import WiseData and instantiate the WiseData class with your API key:

from wisedata import WiseData

# TODO: Paste the API key you received by email here
wd = WiseData(api_key="YOUR_API_KEY")
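If you would rather not paste the key directly into a notebook that might get shared, one option is to read it from an environment variable. The sketch below assumes a variable named `WISEDATA_API_KEY`; that name and the `load_api_key` helper are my own choices for illustration, not something WiseData defines. The `WiseData(api_key=...)` call in the comment is the same one shown above.

```python
import os


def load_api_key(var_name: str = "WISEDATA_API_KEY") -> str:
    """Return the API key stored in an environment variable.

    The variable name is an assumption made for this sketch.
    """
    key = os.environ.get(var_name, "").strip()
    if not key:
        raise RuntimeError(f"Set the {var_name} environment variable first.")
    return key


# Usage in the notebook (uncomment once the variable is set):
# from wisedata import WiseData
# wd = WiseData(api_key=load_api_key())
```

This keeps the key out of the notebook file itself, which helps if you plan to publish or share it.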

4. Read Video Game Sales data

Download the data from HERE, save it in the same folder as your notebook, and name it video_game_sales.csv.

Let’s load the data.

import pandas as pd

df = pd.read_csv("video_game_sales.csv")

Cleaning Data

Let's see how clean the data is.

print(wd.transform("Percentage of null values for each column", { "df": df }))
print(f"Number of games: {df.shape[0]}")

Running this code prints the following:

Percentage of Null Values
Rank 0
Name 0
Platform 0
Year 1.63273
Genre 0
Publisher 0.34944
NA_Sales 0
EU_Sales 0
JP_Sales 0
Other_Sales 0
Global_Sales 0

Number of games: 16598

Great! The data is fairly clean. Let's start the analysis!

One thing to note is that columns such as NA_Sales, EU_Sales, and JP_Sales contain the number of copies sold, in millions.
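If you want to double-check the null percentages without the natural-language layer, a plain pandas version is short. This is only a sketch of one way to compute the same kind of figure, not the code WiseData generates; the `null_percentages` helper and the tiny sample frame are made up for illustration.

```python
import pandas as pd


def null_percentages(frame: pd.DataFrame) -> pd.Series:
    """Percentage of missing values in each column."""
    # isna() gives a True/False table; the column mean is the missing fraction.
    return frame.isna().mean() * 100


# Tiny made-up frame for illustration; in the notebook, pass `df` instead.
sample = pd.DataFrame(
    {
        "Name": ["Game A", "Game B", "Game C", "Game D"],
        "Year": [2006.0, None, 2008.0, 2009.0],
        "Publisher": ["Pub X", "Pub Y", None, None],
    }
)
null_report = null_percentages(sample)
print(null_report)
```

In the notebook, `null_percentages(df)` should give values comparable to the table printed above.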

Best-Selling Titles

We will first look at the best-selling titles.

wd.transform("What are the top 10 best-selling video games of all time? Include platform and year.", { "df": df })
Name                       Platform  Year  Global_Sales
Wii Sports                 Wii       2006  82.74
Super Mario Bros.          NES       1985  40.24
Mario Kart Wii             Wii       2008  35.82
Wii Sports Resort          Wii       2009  33
Pokemon Red/Pokemon Blue   GB        1996  31.37
Tetris                     GB        1989  30.26
New Super Mario Bros.      DS        2006  30.01
Wii Play                   Wii       2006  29.02
New Super Mario Bros. Wii  Wii       2009  28.62
Duck Hunt                  NES       1984  28.31

Surprisingly, Super Mario Bros., which was released in 1985, took 2nd place! Once a legend, always a legend.

What about the best-selling title for each genre?
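For comparison, a top-N ranking like this can also be written by hand in pandas. The sketch below is my own equivalent, not what WiseData runs internally; `top_sellers` and the sample rows are hypothetical, while the column names match the dataset.

```python
import pandas as pd


def top_sellers(frame: pd.DataFrame, n: int = 10) -> pd.DataFrame:
    """Return the n rows with the highest Global_Sales, best first."""
    columns = ["Name", "Platform", "Year", "Global_Sales"]
    return frame.nlargest(n, "Global_Sales")[columns].reset_index(drop=True)


# Made-up rows for illustration; in the notebook, pass `df` instead.
sample = pd.DataFrame(
    {
        "Name": ["Game A", "Game B", "Game C", "Game D"],
        "Platform": ["Wii", "NES", "GB", "DS"],
        "Year": [2006, 1985, 1996, 2006],
        "Global_Sales": [5.0, 12.5, 8.0, 1.2],
    }
)
top_two = top_sellers(sample, n=2)
print(top_two)
```

Calling `top_sellers(df)` in the notebook is a quick way to sanity-check the table above.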

wd.transform("What is the best-selling video game for each genre?", { "df": df })
Genre         Best-selling Game                   Global_Sales
Action        Grand Theft Auto V                  21.4
Adventure     Super Mario Land 2: 6 Golden Coins  11.18
Fighting      Super Smash Bros. Brawl             13.04
Misc          Wii Play                            29.02
Platform      Super Mario Bros.                   40.24
Puzzle        Tetris                              30.26
Racing        Mario Kart Wii                      35.82
Role-Playing  Pokemon Red/Pokemon Blue            31.37
Shooter       Duck Hunt                           28.31
Simulation    Nintendogs                          24.76
Sports        Wii Sports                          82.74
Strategy      Pokemon Stadium                     5.45

It seems that genres such as Adventure, Fighting, and Strategy are less popular, given that even their best-sellers have lower global sales.
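A per-genre best-seller lookup is a classic "row with the maximum value in each group" problem. Here is a minimal pandas sketch of that idea, assuming the same column names as the dataset; the helper name and sample rows are hypothetical, and this is not necessarily how WiseData answers the query.

```python
import pandas as pd


def best_seller_per_genre(frame: pd.DataFrame) -> pd.DataFrame:
    """Return the row with the highest Global_Sales within each genre."""
    # idxmax() yields, per genre, the index label of the top-selling row.
    best_index = frame.groupby("Genre")["Global_Sales"].idxmax()
    best_rows = frame.loc[best_index, ["Genre", "Name", "Global_Sales"]]
    return best_rows.sort_values("Genre").reset_index(drop=True)


# Made-up rows for illustration; in the notebook, pass `df` instead.
sample = pd.DataFrame(
    {
        "Name": ["Game A", "Game B", "Game C", "Game D", "Game E"],
        "Genre": ["Action", "Action", "Puzzle", "Sports", "Sports"],
        "Global_Sales": [5.0, 3.0, 8.0, 1.2, 9.9],
    }
)
best = best_seller_per_genre(sample)
print(best)
```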

Best-Selling Genre

Which genre is the best-seller?

wd.transform("Which genre has the highest global sales?", { "df": df })
Genre Global_Sales
Action 1751.18

Action is the most popular. As the next query shows, it is also the genre that publishers release most often.

wd.transform("What are the 5 most common genres?", { "df": df })
Genre Count
Action 3316
Sports 2346
Misc 1739
Role-Playing 1488
Shooter 1310
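The two questions above measure different things: total copies sold per genre versus number of titles per genre. A hand-written pandas sketch makes the distinction explicit. The helper names and sample rows are made up for illustration, and the sample is deliberately built so the two rankings disagree.

```python
import pandas as pd


def sales_by_genre(frame: pd.DataFrame) -> pd.Series:
    """Total Global_Sales per genre, highest first."""
    return frame.groupby("Genre")["Global_Sales"].sum().sort_values(ascending=False)


def most_common_genres(frame: pd.DataFrame, n: int = 5) -> pd.Series:
    """Number of titles per genre, most frequent first."""
    return frame["Genre"].value_counts().head(n)


# Made-up rows: Action sells the most, but Sports has the most titles.
sample = pd.DataFrame(
    {
        "Genre": ["Action", "Action", "Sports", "Sports", "Sports", "Puzzle"],
        "Global_Sales": [10.0, 5.0, 2.0, 3.0, 4.0, 8.0],
    }
)
totals = sales_by_genre(sample)
counts = most_common_genres(sample, n=2)
print(totals)
print(counts)
```

In the real data, both rankings put Action first, as the tables above show.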

Popularity of video game genres by region

wd.viz("Heatmap of how does the popularity of video games vary by region (NA, EU, JP, Other)? Do not annotate heatmap with number.", { "df": df })

Genre vs Region using WiseData

The Role-Playing genre is really popular in Japan, and I definitely agree with this!
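To see the numbers behind a heatmap like this, you can build the genre-by-region table yourself. The sketch below expresses each genre as a percentage of its region's total sales, so regions of very different sizes stay comparable. That normalization is my own assumption; the chart WiseData draws may use raw totals instead. The helper name and sample rows are hypothetical.

```python
import pandas as pd

REGION_COLUMNS = ["NA_Sales", "EU_Sales", "JP_Sales", "Other_Sales"]


def genre_share_by_region(frame: pd.DataFrame) -> pd.DataFrame:
    """Each genre's share (in %) of total sales within every region."""
    totals = frame.groupby("Genre")[REGION_COLUMNS].sum()
    # Dividing by the column sums makes every region column add up to 100.
    return totals / totals.sum() * 100


# Made-up rows for illustration; in the notebook, pass `df` instead.
sample = pd.DataFrame(
    {
        "Genre": ["Action", "Role-Playing", "Action", "Role-Playing"],
        "NA_Sales": [3.0, 1.0, 1.0, 0.0],
        "EU_Sales": [2.0, 1.0, 1.0, 0.0],
        "JP_Sales": [0.5, 2.0, 0.5, 1.0],
        "Other_Sales": [0.5, 0.5, 0.5, 0.5],
    }
)
shares = genre_share_by_region(sample)
print(shares.round(1))
```

With seaborn installed, the resulting table can be passed to `seaborn.heatmap(shares, annot=False)` to get a comparable picture.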

Trend of video game sales

How has the number of video games released changed over time?

wd.viz("How has the number of video games released changed over time?", { "df": df })

Video Games released using WiseData

wd.transform("Which year had the highest global sales of video games?", { "df": df })
Year Global_Sales
2008 678.9

We can see that video game sales peaked in 2008 and declined afterward.

There are several possible factors that contributed to the decline of video game sales. According to some sources, one of the main reasons was the decreasing popularity of the Wii console and its games, which had dominated the market for several years before. Many casual gamers who had bought the Wii shifted to playing games on their smartphones and tablets instead. Another reason was the aging of the PS3 and Xbox 360 consoles, which had been on the market for six and seven years respectively by 2012. Some analysts blamed the timidity of Sony and Microsoft for not introducing new consoles sooner to revitalize the industry. Additionally, the economic recession and the competition from other forms of entertainment may have also affected the demand for video games.

However, it is important to note that these factors only explain the decline in new retail sales of video games, which is what the market research firm NPD measures. They do not account for the growing market for downloadable games, including free-to-play titles, that are available on PCs and mobile devices. These games are not tracked by NPD, but they represent a significant portion of the video game industry’s revenue and audience. Therefore, the decline in NPD’s reported numbers may not reflect the true state of the video game market as a whole.
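Both yearly views, titles released and copies sold, come down to grouping by the Year column. Here is a minimal pandas sketch of that, with hypothetical helper names and made-up sample rows. Rows with a missing Year (about 1.6% of the real data, per the null check earlier) are simply left out of both results.

```python
import pandas as pd


def releases_per_year(frame: pd.DataFrame) -> pd.Series:
    """Number of titles released in each year, in chronological order."""
    # value_counts() skips missing years by default.
    return frame["Year"].value_counts().sort_index()


def peak_sales_year(frame: pd.DataFrame) -> int:
    """Year with the highest total Global_Sales."""
    yearly_sales = frame.groupby("Year")["Global_Sales"].sum()
    return int(yearly_sales.idxmax())


# Made-up rows; the row with no Year is ignored by both helpers.
sample = pd.DataFrame(
    {
        "Name": ["Game A", "Game B", "Game C", "Game D", "Game E"],
        "Year": [2007.0, 2008.0, 2008.0, None, 2009.0],
        "Global_Sales": [5.0, 4.0, 3.0, 100.0, 6.0],
    }
)
per_year = releases_per_year(sample)
peak = peak_sales_year(sample)
print(per_year)
print(peak)
```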

How has the number of video games released changed over time for each genre?

wd.viz("How has the number of video games released changed over time for each genre?", { "df": df })

Video Games released using WiseData

Before 2004, Sports games were released the most, but after 2004, Action games dominated the market!
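One way to check a claim like this numerically is to count releases per year and genre, then pick the leading genre for each year. The sketch below uses `pd.crosstab` for the counting; the helper names and the sample rows are invented for illustration and only mimic the Sports-to-Action shift described above.

```python
import pandas as pd


def releases_by_year_and_genre(frame: pd.DataFrame) -> pd.DataFrame:
    """Table of release counts with one row per year and one column per genre."""
    return pd.crosstab(frame["Year"], frame["Genre"])


def leading_genre_per_year(frame: pd.DataFrame) -> pd.Series:
    """Genre with the most releases in each year."""
    return releases_by_year_and_genre(frame).idxmax(axis=1)


# Made-up rows for illustration; in the notebook, pass `df` instead.
sample = pd.DataFrame(
    {
        "Year": [2003, 2003, 2003, 2005, 2005, 2005],
        "Genre": ["Sports", "Sports", "Action", "Action", "Action", "Sports"],
    }
)
counts_table = releases_by_year_and_genre(sample)
leaders = leading_genre_per_year(sample)
print(counts_table)
print(leaders)
```

With matplotlib installed, `releases_by_year_and_genre(df).plot()` draws one line per genre, which is roughly the shape of the chart above.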

Conclusion

I hope you learned something new along the way. WiseData has many benefits for data transformation and visualization. Some of them are:

  • It saves you time and effort by automating the chart creation process, so you gain insights faster.
  • It allows you to express your transformation and visualization needs in natural language.
  • It allows you to write more concise and readable code. People reading the code can interpret the visualization easily.

If you're looking for a way to streamline your data analysis workflow, give WiseData a try!

Happy analyzing! 📊