by Tracy Hoffmann, Peter Mühleder, Florian Rämisch on behalf of the diggr team
The diggr team has recently published a mapping, reference and grouping for platform strings of various video game databases. It is meant to ease mapping, matching and interlinking entities of various game databases.
Problem Statement
In various video game databases (community or institution-driven) different paradigms and ideas are followed for naming platforms. Especially digital re-releases of older games pose a huge problem, as they are sometimes assigned to their initial platform, various new platforms, or some kind of new hybrid indicating that it is a re-release. This makes research across databases, especially interlinking of entities, difficult and sometimes almost impossible.
Because most of the existing databases have grown over a long time, some new platforms (e.g. digital re-releases) were probably hard to fit into existing data models. The database administrators made different decisions about how to cope with the new platforms. As every database has a different purpose and audience, some aspects were valued higher than others in each case. As a result, the coping strategies differ slightly from one database to another. We propose a first draft (a request for comments) to unify these compromises.
Solution
This is meant as a first step towards unification of platform identification. We provide a “master” platform list and a mapping for seven popular video game databases to this master list. It is not exhaustive.
Paradigms
Here are our paradigms:
- any platform has a standardized spelling
- the standardized spelling is always the common long version of the platform name
- if there is an equivalent entry in the GAMECIP Platform Vocabulary, link it
Shortcomings
- At this time, every specific arcade or PC platform is mapped to a more general term (e.g. Arcade – Namco System 246 -> Arcade).
Covered Platforms
Currently (as of May 2018) we cover seven popular video game databases: ESRB (USA), GameFAQs (USA/intl.), MediaArt (J), Mobygames (USA/intl.), OGDB (USA/intl.), PEGI (EU) and USK (D).
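The generalization described above can be pictured as a plain lookup from a database-specific platform string to the standardized term. The following is only a minimal sketch: the first entry is the example from the text, the second entry and the names EXAMPLE_MAPPING and standardize are hypothetical and not taken from the published mapping files.

```python
from typing import Dict, Optional

# Illustrative entries only. The first pair is the example given in the text;
# the second is a hypothetical entry added to show several specific
# arcade boards collapsing into the same general term.
EXAMPLE_MAPPING: Dict[str, str] = {
    "Arcade – Namco System 246": "Arcade",
    "Arcade – Example Board X": "Arcade",  # hypothetical
}


def standardize(raw: str, mapping: Dict[str, str] = EXAMPLE_MAPPING) -> Optional[str]:
    """Return the standardized platform name, or None if the string is unmapped."""
    return mapping.get(raw.strip())


result_known = standardize("  Arcade – Namco System 246 ")
result_unknown = standardize("Some Unlisted Platform")
```

Returning None for unmapped strings, rather than guessing, keeps gaps in the mapping visible so they can be reported or added later.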
Outlook and Call for Collaboration
The master list we developed follows our use cases and applications. There probably is no “everything-fits-all” master list, which can be used for all possible use cases. Therefore, we strongly encourage you to put it to use and expand on it in your own contexts. This will help to identify shortcomings of our master list and allow us to develop it further together. Feel free to contact us about mistakes in the list or ideas to provide better access to those mappings: team@diggr.link
Access
The data can be accessed in two forms: either by cloning the repository on GitHub[1] and using the provided platform mapping files, or by using the GitHub Pages site, which is served from the dist subdirectory as a static REST API[2]. Using the static REST API is preferable, as it always serves the up-to-date version, whereas a cloned git repository can leave you working with an out-of-date version, e.g. if you forget to pull the latest updates before using it.
In some data sources there are platform groups (like “Game Archives”) which include individual platforms. A mapping is located at platform_groups.tsv in the tabular_data folder. The file diggr_vocab.tsv contains the entire standardized vocabulary we used for internal mappings. It also contains links to the GAMECIP Platform Vocabulary.
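As a rough sketch of how a platform group file could be consumed from a cloned repository, the snippet below reads tab-separated rows into a dictionary of groups. The column names group and platform, the sample rows and the function name load_groups are assumptions made for illustration; the actual layout of platform_groups.tsv may differ and should be checked first.

```python
import csv
import io
from collections import defaultdict
from typing import Dict, List, TextIO

# Hypothetical sample data; column names and rows are assumptions,
# not the real contents of platform_groups.tsv.
SAMPLE_TSV = (
    "group\tplatform\n"
    "Game Archives\tExample Platform A\n"
    "Game Archives\tExample Platform B\n"
)


def load_groups(handle: TextIO) -> Dict[str, List[str]]:
    """Collect the platforms listed for each group from a tab-separated file."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for row in csv.DictReader(handle, delimiter="\t"):
        groups[row["group"]].append(row["platform"])
    return dict(groups)


groups = load_groups(io.StringIO(SAMPLE_TSV))
```

With a local clone, the same function could be given a file opened with open("tabular_data/platform_groups.tsv", newline="", encoding="utf-8"), after adjusting the column names to whatever the file actually uses.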
License and Copyright
The JSON exports contain information about the author, license and copyright, as well as a version and a date element. We claim no copyright on the data; it is therefore released as Creative Commons CC0 1.0 Universal. The software which generates the mapping files is free software under the GNU General Public License v3.
The model that Oregami uses is an attempt to be more specific about certain revisions of platforms, OSs, and so on. You can see their data model in the post “Sorting out the platform mess” on their website. Larger groups split into larger branching trees which is more than a typical user needs to see on the front end but is useful for specification.
A lot of platform specifications on places like Mobygames are wrong due to issues of inherited datasets with no clear origin. I’ve run across computer games that say they are on every platform yet little evidence exists that they were even released at all. Archive is fixing that with the existence of game versions and attempts to find specific mentions of games in contemporaneous magazine coverage. I believe that notification of a hardware platform – and many other details in a database quite frankly – should require some sort of source for verification. The additional benefit of this would be that if a database doesn’t implement an Oregami-level backend a person looking into the game could have the ability to deduce what versions of a platform/OS could support it. My belief is that databases are equal parts useful consumer tool and research assistant.
Thank you for your feedback! The Oregami blog post describes the problems with platforms really well, thank you for that hint.
We built our mapping to merge and/or deduplicate information from different video game data sources in order to analyze it. So, the first step was to normalize, standardize and unify the different terms for platforms in each database. It is not our aim to build a new game database; we are more interested in linking and analyzing the existing ones. Although we would love to see a more specific (and maybe verified) platform specification in video game databases, we need to handle what is out there. In that respect we agree with you.