love-saver: LÖVE saver

love-saver is a module which implements methods to create saves for a game.

There are two methods: single-file saves and repository (multi-file) saves.

Beyond the saving model, the module aims to be an asynchronous interface to not disrupt the main thread with IO.

Introduction

Correctly saving data is important and not trivial. Even if game data are in general less precious than other kind of data, because they are produced in the context of a fictitious, virtual, simulated space instead of a more concrete reality with consequences, they still matter a lot to those who want to immerse themselves in those spaces.

In my methodology, my first choice would be to use SQLite to store data, but there are reasons to explore an alternative:

Most importantly, distributing SQLite for each platform that LÖVE supports is work I want to avoid.
The way I think about game states and saves is much simpler than the formalism and power that SQL brings. In my search for simplicity, I need to use the simplest solution, not the most powerful. Also, the mismatch between the SQL data model and the Lua data model can increase complexity.
There are features that are more specific to games, like having snapshots of different instants of an adventure.
A method based on file as data units can easily introduce integrity checks and compression.

Fundamental Principles

The idea is to find a method which is simple, robust enough and more aligned with the needs related to creating games with LÖVE.

To reach this goal, the method follows some principles:

File-based: The smallest unit of data which can be read/written is the file.
Immutability: Files are immutable, they can be created and deleted, but not modified afterwards.
Integrity: Files are checked using hash functions to detect accidental corruption (not about tampering attacks).
Snapshot: Doing a save creates a snapshot that stores a complete consistent state.
Versioning: Each snapshot has a chronological version which allows to rollback to a previous snapshot in the event of data corruption.

The method requires from the filesystem to be able to create, read, write, delete and list files. Modifying a file in-place or moving a file is avoided on purpose.

In the context of this method, a save (or save slot) is a virtual entry made of chronological snapshots.

Single-file Saves

A single-file snapshot is a file with a name in the form of <prefix>-<version>.save.

The prefix is the save identifier and version is a chronological version of the snapshot. Because single-file saves can be created anywhere, the extension allows to quickly filter them from other files.

Example of the content of a snapshot:

love-saver-snapshot-1.0
md5-68b329da9893e34099c7d8ad5cb9c940 lz4
<payload>

The first line is a header to identify the format and the second line is the hash of the payload followed by the compression method.

When loading a save, the most recent snapshot is used (if valid). The others can be pruned in function of the amount of remaining snapshots desired.

If a save failed to persist to disk correctly (e.g. power outage), it should rollback to the previous snapshot.

Repository Saves

Repository saves can be made of multiple files. A snapshot is decomposed into multiple parts (files) which can be read and written individually, allowing to deal with much bigger states and saves. A new snapshot can be created from another one and will reuse the parts which didn't change.

The repository is a directory. This directory contains snapshot manifests and data files. Snapshot manifests are key-file stores mapping a key to a data file (with some metadata). Data files are the parts shared by the snapshots.

Snapshot file names are like single-save snapshots, but without the extension.

Example repository:

NewWorld-451 (snapshot)
NewWorld-452
AnotherWorld-48
data/md5-3448076e597f2393b0af8343a0335601 (data file)
data/md5-2c9e69dc4d986547590812aa5f287b62
data/md5-9ef67efe7269c71dc7f346f96f891749
...

Example snapshot NewWorld-451:

love-saver-snapshot-manifest-1.0
md5-e45cb09491161dda981d41ad8b27ba2d raw
chunk-1-1
md5-3448076e597f2393b0af8343a0335601 lz4
chunk-2-1
md5-2c9e69dc4d986547590812aa5f287b62 lz4
player-inventory
md5-9ef67efe7269c71dc7f346f96f891749 lz4

The first line is a header to identify the format and the second line is the hash of the rest of the content and its compression method. Then, similarly, every two lines is a key followed by the hash of the associated data file and its compression method.

As with single-file saves, data is verified and the most recent valid snapshot can be loaded.

Different kind of saves can be created in the same repository and it is possible to create a new save from an existing one as a form of checkpointing or branching.

Notes and Recommendations

Use Cases

A use case of a single-file save could be for game application settings; for game state saves, I would recommend to directly use a repository, even to save a single file entry, because they are more scalable and the amount of things to save may increase in the future.

File Size

A file on most file systems seems to have a minimum space overhead, for example related to the page size (e.g. 4 KiB). It is probably best to have files over 10 KB and under 5 MB, but it depends on the amount of data and the decomposition used.

For example, with huge amount of data it could be necessary to use bigger files to reduce their number. Conversely, smaller files could improve data deduplication in the case of many snapshots. It is also about the smallest amount of data which can be manipulated on disk and in memory, e.g. to reduce latency.

Robustness

Filesystems metadata (names, directories, etc.) seem to be more robust than file content¹. By using files as data units, the behavior and robustness of the filesystem is reused while content corruption is detected by integrity checks and recovered through snapshots.

Although any kind of content corruption should be detected, the case that is dealt with by this method is the failure to correctly persist data to disk, which is mostly about the interruption of a save (e.g. application crash, power outage, etc.). Once the data on disk has been verified to be correct, it is assumed that it will stay that way. This is important for repository saves where many snapshots can share the same data files; if one file became corrupted, all snapshots would be. Snapshots are not about redundancy in the event of random data corruption, but about the ability to rollback on save failure.

Concurrency

Concurrency is not a goal of the project; it assumes that only one application will use (read or write) a single-file save or a repository at the same time.

However, runtime checks may detect abnormal activity caused by concurrent changes, which can happen when accidentally running two instances of a game.

File Manipulations

For single-file saves, copying or moving them can be accomplished by copying or moving all of their snapshot files at once.

For repository saves, it is more complicated because snapshot files contain references to data files. The whole repository can be moved or copied, but individual saves cannot unless each associated data file is identified and copied or moved as well.

However, merging a repository into another one should be relatively easy by moving or copying the repository directory and ignoring the already existing files, which is a common feature of file managers. This works because the data files are content-addressed².

^{^} https://en.wikipedia.org/wiki/Journaling_file_system
^{^} Each file is named after the hash of its content.