File www/remizstd.md — part of check-in [9a0995ca4b] at 2022-03-10 00:24:16 on branch trunk — Finish web documentation. (user: alexa size: 20540)

title: RemiZstd Manual
author: Remilia Scarlet
date: March 2022
...

Introduction

RemiZstd is a set of Nim bindings for the ZStandard C library. They are essentially a port of the Crystal bindings by Didactic Drunk. The original Crystal bindings can be found here: https://github.com/didactic-drunk/zstd.cr

If you want an alternative set of bindings, check out https://github.com/wltsmrz/nim_zstd

Features

Some features include:

Two APIs: a simple context-based one, and one for (de)compressing to/from streams.
API calls allow reusable buffers to reduce GC overhead.
export ZSTD_CLEVEL=1 sets the default compression level just like the zstd command line utilities.
Probably pretty fast
Custom dictionaries (not well tested)

Getting Set Up

You will need the ZStandard development files on your system. Something like one of the following should work on Linux:

slackpkg install zstd
pacman -S zstd
apt-get install libzstd-dev
yum install libzstd-devel

Once you have the ZStandard development files installed, you will then need to prepare the repository for use. Note that I use neither Git nor Mercurial, so the process here may seem a bit out of the ordinary. However, in the end, you can still use Nimble just like with other Nim packages.

Clone this repository manually using Fossil.
Change into the cloned directory.
Run nimble install to install the library.

Usage

To import all of the library (minus the low-level FFI stuff), you can just use import remizstd. Or, you can import just what you need:

remizstd/common: Custom dictionaries and some common types/methods/values
remizstd/compress: Context-based compression
remizstd/decompress: Context-based decompression
remizstd/compstream: Compressing streams
remizstd/decompstream: Decompression streams
remizstd/libffi: Low-level C bindings

Full example:

import std/streams
import remizstd/[compstream, decompstream]

# Generate the test file
let filename = "/tmp/somefile.txt"
writeFile(filename, "This is a test file for remizstd :D")

# Compress to a file stream
var outFile = newFileStream("/tmp/somefile.txt.zst", fmWrite)
assert(not isNil(outFile))

withCompressStream(outFile, 9, cio): # Compression level 9
  cio.syncClose = true
  cio.write(readFile(filename))
  # We set cio.syncClose to true, so outFile is automatically closed as well

# Decompress from one file stream to another file stream
let cfilename = "/tmp/somefile.txt.zst"
let dfilename = "/tmp/somefile.txt.orig"
var inFile = newFileStream(cfilename, fmRead)
outFile = newFileStream(dfilename, fmWrite)
assert(not isNil(inFile))
assert(not isNil(outFile))

withDecompressStream(inFile, dio):
  outFile.write(dio.readAll())
# Note we didn't set dio.syncClose to true, so inFile is not closed for us.
# We have to close both files manually.
inFile.close()
outFile.close()

assert(readFile(filename) == readFile(dfilename))

Basic Streaming API Usage

The streaming API is used to compress data to a stream, or decompress data from a stream. It also includes a few templates that provide useful constructs that will automatically handle closing of the (de)compression stream for you.

Example usage of the streaming API:

import std/streams
import remizstd/[compstream]

# Compression to a string stream
var dest = newStringStream()
withCompressStream(dest, cio):
  cio.write("This is some test data")

Basic Context-based API Usage

The context-based API uses wrappers around the lower-level ZstdCCtx and ZstdDCtx pointers. These allow you to allocate a (de)compression context once, then reuse it for successive operations. This will generally be friendlier to your system's memory.

Note: Contexts are mostly just an optimization for speed and resource usage. They do not change the compression ratio.

Note: In multi-threaded environments, use a separate context for each thread for parallel execution.

Example usage of the context-based API:

import remizstd/[compress, decompress]

let buf = "this is a test buffer"

# Compression using a context
var
  cctx = newCompressCtx(1) # Compression level of 1
  cbuf = cctx.compress(buf)

# Decompression using a context
var
  dctx = newDecompressCtx()
  dbuf = dctx.decompress(cbuf)

assert(dbuf == buf)

API Reference

remizstd Module

[]{#libversion}

func libVersion(): string

Returns the current version of libzstd.
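A quick way to confirm that the bindings can find libzstd is to print the reported version at startup. This is a minimal sketch that uses only libVersion() as documented above:

```nim
import remizstd

# Ask the bindings which libzstd version they are linked against
let version = libVersion()
echo "libzstd version: ", version
```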

remizstd/common Module

Types

[]{#compressionlevel}

type CompressionLevel* = cint

A compression level.

[]{#zstderr}

type ZStdError = object of CatchableError

All errors in remizstd are of this type, or a subtype.

[]{#compressdictobj}

type CompressDictObj = object

A wrapper for a ZstdCDict pointer. You should always use CompressDict instead.

[]{#compressdict}

type CompressDict = ref CompressDictObj

A managed wrapper for a ZstdCDict pointer.

[]{#decompressdictobj}

type DecompressDictObj = object

A wrapper for a ZstdDDict pointer. You should always use DecompressDict instead.

[]{#decompressdict}

type DecompressDict = ref DecompressDictObj

A managed wrapper for a ZstdDDict pointer.

[]{#dict}

type Dict = ref object of RootObj

An in-memory representation of a custom dictionary. It has no exported fields.

Globals and Constants

[]{#defaultcomplevel}

var DefaultCompLevel: CompressionLevel

The default compression level. When the bindings are first loaded, this will check to see if the ZSTD_CLEVEL environment variable is set. If it is, it uses the integer value from that variable. Otherwise this defaults to 3.

[]{#mincomplevel}

var MinCompLevel*: CompressionLevel

The lowest supported compression level. This will be set when the program is started using a call to the lower level min_c_level().

[]{#maxcomplevel}

var MaxCompLevel*: CompressionLevel

The highest supported compression level. This will be set when the program is started using a call to the lower level max_c_level().
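The three globals above can be used to validate a user-supplied level before handing it to the library. The sketch below assumes a hypothetical helper named clampLevel, which is not part of remizstd and only wraps the standard clamp() from Nim's system module:

```nim
import remizstd/common

# Hypothetical helper (not part of remizstd): force a requested level
# into the range that libzstd reports as supported.
proc clampLevel(requested: CompressionLevel): CompressionLevel =
  clamp(requested, MinCompLevel, MaxCompLevel)

echo "default level:   ", DefaultCompLevel
echo "supported range: ", MinCompLevel, " .. ", MaxCompLevel
echo "100 clamps to:   ", clampLevel(100)
```

Clamping rather than raising is just one possible design; an application may prefer to reject out-of-range levels outright.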

Procedures and Functions

[]{#newdict}

proc newDict*(buf: string, level: CompressionLevel = DefaultCompLevel): Dict

Creates a new custom dictionary using buf as the dictionary data.

[]{#dictid}

proc dictID*(d: Dict): cuint

Returns the ID of the Dict.

remizstd/compress Module

Types

[]{#compressctxobj}

type CompressCtxObj* = object

A wrapper and some state around ZstdCCtx used to compress data. You should use CompressCtx instead.

[]{#compressctx}

type CompressCtx* = ref CompressCtxObj

A managed context used to compress data.

Procedures and Functions

[]{#setdict-compressctx}

proc setDict*(ctx: CompressCtx, d: Dict)

Sets the custom dictionary on a CompressCtx.

[]{#newcompressctx-level}

proc newCompressCtx*(level: CompressionLevel, dict: Dict = nil): CompressCtx

Creates a new context for compressing data at the given compression level. You can optionally pass a custom dictionary to this to use for compression.

Example showing a custom dictionary:

import remizstd/[common, compress]

let myCompLevel: CompressionLevel = 7

# Load and create custom dictionary with compression level 7
let dictData = readFile("/path/to/dict/file.dat")
var dictionary = newDict(dictData, myCompLevel)

# Compression using a context
var
  cctx = newCompressCtx(myCompLevel, dictionary)
  cbuf = cctx.compress("Just some test data to compress, could be binary instead")

[]{#newcompressctx}

proc newCompressCtx*(dict: Dict = nil): CompressCtx

Creates a new context for compressing data using the DefaultCompLevel. You can optionally pass a custom dictionary to this to use for compression.

[]{#compress}

proc compress*(ctx: CompressCtx, src: string): string

Compresses src using the given context, then returns the compressed data.

Example:

import remizstd/compress

# Compression using a context
var
  cctx = newCompressCtx(1) # Compression level of 1
  cbuf = cctx.compress("this is some test data")

# cbuf now holds the compressed data

[]{#level-compressctx}

func level*(ctx: CompressCtx): CompressionLevel

Returns the current compression level for the context.

[]{#setlevel-compressctx}

proc setLevel*(ctx: CompressCtx, level: CompressionLevel)

Changes the context's compression level.

[]{#threads-compressctx}

func threads*(ctx: CompressCtx): int

Returns the number of threads the context will use for compression.

[]{#setthreads-compressctx}

proc setThreads*(ctx: CompressCtx, num: int)

Sets the number of threads the context will use for compression.

[]{#checksum-compressctx}

func checksum*(ctx: CompressCtx): bool {.inline.}

Returns true if the checksum flag is set in the context, or false otherwise.

[]{#setchecksum-compressctx}

proc setChecksum*(ctx: CompressCtx, val: bool) {.inline.}

Changes whether or not the checksum flag is set in the context.
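The getters and setters above can be combined to reconfigure a context after it has been created. The following is a minimal sketch; it assumes that each getter reflects the value most recently passed to the matching setter. setThreads() is left out because its effect may depend on how libzstd was built.

```nim
import remizstd/compress

# Start at level 3, then adjust the context before compressing
var cctx = newCompressCtx(3)
cctx.setLevel(9)
cctx.setChecksum(true)

echo "level:    ", cctx.level
echo "checksum: ", cctx.checksum
echo "threads:  ", cctx.threads

let cbuf = cctx.compress("some data that will be compressed at level 9")
echo "compressed to ", cbuf.len, " bytes"
```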

[]{#compressbound}

func compressBound*(ctx: CompressCtx, size: int): int

Given data of length size, this returns the maximum compressed size in the worst-case single-pass scenario.
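As an illustration, the bound can be compared with the size of an actual compressed buffer. This sketch uses only the procs documented in this module:

```nim
import remizstd/compress

let src = "some input data that we want to compress"
var cctx = newCompressCtx(1)

# Worst-case size for src.len bytes of input
let bound = cctx.compressBound(src.len)
let cbuf = cctx.compress(src)

echo "bound: ", bound, " bytes, actual: ", cbuf.len, " bytes"
```

The bound is mainly useful when sizing a destination buffer ahead of time. For small or incompressible inputs it may be larger than the input itself.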

[]{#memsize-compressctx}

func memsize*(ctx: CompressCtx): csize_t

Returns the size of the wrapped ZstdCCtx in memory.

remizstd/decompress Module

Types

[]{#decompressctxobj}

type DecompressCtxObj* = object

A wrapper and some state around ZstdDCtx used to decompress data. You should use DecompressCtx instead.

[]{#decompressctx}

type DecompressCtx* = ref DecompressCtxObj

A managed context used to decompress data.

[]{#framesizeunknownerror}

type FrameSizeUnknownError* = object of ZstdError

Indicates that the library cannot automatically determine the destination size. Instead, you should use the streaming API in remizstd/decompstream.

See decompress().

Procedures and Functions

[]{#setdict-decompressctx}

proc setDict*(ctx: DecompressCtx, d: Dict)

Sets the custom dictionary on a decompression context.

[]{#newdecompressctx}

proc newDecompressCtx*(dict: Dict = nil): DecompressCtx

Creates a new context for decompressing data. You can optionally pass a custom dictionary to this to use for decompression.
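When a custom dictionary is used, the same dictionary has to be given to both the compression and the decompression context. The sketch below assumes that newDict() accepts arbitrary bytes as raw dictionary content; in practice a dictionary would usually be trained first, for example with `zstd --train`. The dictionary text and message are made up for illustration.

```nim
import remizstd/[common, compress, decompress]

# Raw dictionary content (assumption: untrained content is accepted)
let dictionary = newDict("common prefix shared by many small messages: ")

# Both sides must use the same dictionary
var cctx = newCompressCtx(3, dictionary)
var dctx = newDecompressCtx(dictionary)

let msg = "common prefix shared by many small messages: hello"
let restored = dctx.decompress(cctx.compress(msg))
```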

[]{#framecontentsize}

func frameContentSize*(ctx: DecompressCtx, src: string): (uint64, bool)

Attempts to determine the size of src after decompression. If successful, this returns the size of src after decompression and true. If this cannot determine the size, this returns zero and false.
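The returned tuple can be unpacked directly. This is a minimal sketch that checks the flag before trusting the size:

```nim
import remizstd/[compress, decompress]

let original = "frame content size example"
var cctx = newCompressCtx(1)
let cbuf = cctx.compress(original)

var dctx = newDecompressCtx()
# The second element tells us whether the first one is meaningful
let (size, known) = dctx.frameContentSize(cbuf)
if known:
  echo "decompressed size will be ", size, " bytes"
else:
  echo "the decompressed size could not be determined"
```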

[]{#decompress}

proc decompress*(ctx: DecompressCtx, src: string, size = 0): string

Decompresses src and returns the decompressed data. If size is zero, then this attempts to determine the correct size for the return value by calling frameContentSize(). If size is zero and this cannot determine the size of the return value automatically, this will raise a FrameSizeUnknownError.

Example showing compression, then decompression:

import remizstd/[compress, decompress]

let buf = "this is a test buffer"

# Compression using a context
var
  cctx = newCompressCtx(1) # Compression level of 1
  cbuf = cctx.compress(buf)

# Decompression using a context
var
  dctx = newDecompressCtx()
  dbuf = dctx.decompress(cbuf)

assert(dbuf == buf)
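If the caller already tracks the decompressed size, it can be passed explicitly as a fallback when the size cannot be determined automatically. The following sketch shows one way to handle FrameSizeUnknownError. The fallback size comes from the caller's own bookkeeping, which is an assumption of this example.

```nim
import remizstd/[compress, decompress]

let original = "data whose size we already know"
var cctx = newCompressCtx(1)
let cbuf = cctx.compress(original)

var dctx = newDecompressCtx()
let restored =
  try:
    # Let the library work out the size on its own
    dctx.decompress(cbuf)
  except FrameSizeUnknownError:
    # Fall back to a size tracked by the caller
    dctx.decompress(cbuf, original.len)
```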

[]{#memsize-decompressctx}

func memsize*(ctx: DecompressCtx): csize_t

Returns the size of the wrapped ZstdDCtx in memory.

remizstd/compstream Module

Types

[]{#compressstream}

type
  CompressStream* = ref object of RootObj
    syncClose*: bool

Represents a stream that uncompressed data can be written to in order to be compressed. The syncClose field controls whether or not the underlying stream is closed when the close() proc is called.

A CompressStream essentially wraps CompressCtx to provide a stream-like API.

Note: This does not currently implement the full API as found in std/streams.

[]{#level-compressstream}

func level*(strm: CompressStream): CompressionLevel {.inline.}

Returns the current compression level for this stream.

[]{#setlevel-compressstream}

proc setLevel*(strm: CompressStream, level: CompressionLevel) {.inline.}

Sets the compression level for this stream.

[]{#compressstream-threads}

func threads*(strm: CompressStream): int {.inline.}

Returns the number of threads this stream will use for compression.

[]{#compressstream-setthreads}

proc setThreads*(strm: CompressStream, num: int) {.inline.}

Sets the number of threads this stream will use for compression.

[]{#compressstream-checksum}

func checksum*(strm: CompressStream): bool {.inline.}

Returns true if the checksum flag is used by the underlying compression context, or false otherwise.

[]{#compressstream-setchecksum}

proc setChecksum*(strm: CompressStream, val: bool) {.inline.}

Sets whether or not the underlying context uses the checksum flag.

[]{#compressstream-closed}

func closed*(strm: CompressStream): bool {.inline.}

Returns true if the stream is closed, or false otherwise. This does not consider the underlying stream.

[]{#newcompressstream-level}

proc newCompressStream*(io: Stream, level: CompressionLevel, outputBufSize = 0): CompressStream

Creates a new stream for compression using the given compression level. Data written to this stream will be compressed and output to io. If outputBufSize is zero, then the default output buffer size as reported by libzstd will be used.

[]{#newcompressstream}

proc newCompressStream*(io: Stream): CompressStream

Creates a new stream for compression using the DefaultCompLevel. Data written to this stream will be compressed and output to io.

[]{#write}

proc write*(strm: CompressStream, buf: string) {.inline.}

Compresses buf and writes the compressed data to the underlying stream.

[]{#compressstream-close}

proc close*(strm: CompressStream)

Closes the compression stream. If the syncClose field is true, then the underlying stream is also closed, otherwise it remains open.
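For cases where the templates are not a good fit, a CompressStream can be managed by hand. The sketch below sets syncClose to false explicitly so that the destination StringStream stays open and its contents can be inspected afterwards.

```nim
import std/streams
import remizstd/compstream

var dest = newStringStream()
var cs = newCompressStream(dest, 5) # Compression level 5
cs.syncClose = false                # Keep dest open after cs.close()

try:
  cs.write("first chunk, ")
  cs.write("second chunk")
finally:
  # Always close, so that any buffered data is written out to dest
  cs.close()

echo "compressed size: ", dest.data.len, " bytes"
```

The try/finally mirrors what the templates below are documented to guarantee, namely that close() is called when the block exits.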

Templates

[]{#withcompressstream}

template withCompressStream*(io, strmVar, forms: untyped)

Creates a CompressStream bound to strmVar that will write compressed data to io, then executes forms inside of a block. The compression stream will use the DefaultCompLevel. This ensures that close() is called when the block exits.

Example:

import std/streams
import remizstd/[compstream]

# Compression to a string stream
var dest = newStringStream()
withCompressStream(dest, cio):
  cio.write("This is some test data")

template withCompressStream*(io, level, strmVar, forms: untyped)

Creates a CompressStream bound to strmVar that will write compressed data to io using the given compression level, then executes forms inside of a block. This ensures that close() is called when the block exits.

[]{#withcompressstream-level-buffer}

template withCompressStream*(io, level, outputBufSize, strmVar, forms: untyped)

Creates a CompressStream bound to strmVar that will write compressed data to io using the given compression level, then executes forms inside of a block. This also sets the size of the output buffer. This ensures that close() is called when the block exits.
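A short sketch of the five-argument form follows. The level and the 64 KiB buffer size are arbitrary values chosen for illustration, not recommendations.

```nim
import std/streams
import remizstd/compstream

var dest = newStringStream()

# Level 19 with a 64 KiB output buffer
withCompressStream(dest, 19, 64 * 1024, cio):
  cio.syncClose = false # Keep dest open so its data can be read below
  cio.write("data compressed at a high level")

echo "compressed size: ", dest.data.len, " bytes"
```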

remizstd/decompstream Module

Types

[]{#decompressctx}

type
  DecompressStream* = ref object of RootObj
    syncClose*: bool

Represents a stream that decompressed data can be read from. The compressed input is read from an underlying stream. The syncClose field controls whether or not the underlying stream is closed when the close() proc is called.

A DecompressStream essentially wraps DecompressCtx to provide a stream-like API.

Note: This does not currently implement the full API as found in std/streams.

Procedures and Functions

[]{#dict-decompressstream}

func dict*(strm: DecompressStream): Dict {.inline.}

Returns the custom dictionary assigned to the underlying decompression context.

[]{#setdict-decompressstream}

proc setDict*(strm: DecompressStream, d: Dict) {.inline.}

Sets the custom dictionary assigned to the underlying decompression context.

[]{#newdecompressstream}

proc newDecompressStream*(io: Stream, dict: Dict = nil, inputBufSize = 0): DecompressStream

Creates a new DecompressStream. Compressed data is read from io and decompressed as this stream is read. If inputBufSize is zero, then the default input buffer size as reported by libzstd will be used. dict may optionally be a custom decompression dictionary.

[]{#setio}

proc setIO*(strm: DecompressStream, newStream: Stream)

Sets a new underlying stream to read compressed data from. This re-opens the DecompressStream.

[]{#read-num}

proc read*(strm: DecompressStream, num: Natural): (string, int)

Decompresses up to num bytes of data, then returns the decompressed data and the actual number of bytes that were decompressed.

This will return an empty string and zero when the end of the compressed data is reached.
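The end-of-data behavior described above lends itself to a simple read loop. This sketch compresses a string in memory, then reads it back 16 bytes at a time. It slices each chunk to the reported length, which is a defensive assumption in case the returned string is longer than the byte count.

```nim
import std/streams
import remizstd/[compstream, decompstream]

let original = "chunked reading example, chunked reading example"

# Build some compressed data in memory first
var packed = newStringStream()
withCompressStream(packed, cio):
  cio.syncClose = false # Keep packed open so its data can be read below
  cio.write(original)

# Read it back in small chunks until read() reports zero bytes
var src = newStringStream(packed.data)
var ds = newDecompressStream(src)
var restored = ""
while true:
  let (chunk, n) = ds.read(16)
  if n == 0:
    break
  restored.add(chunk[0 ..< n])
ds.close()
```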

[]{#read-dest}

proc read*(strm: DecompressStream, dest: var string): (string, int)

Reads up to len(dest) bytes of data into dest, then returns the actual number of bytes that were decompressed into dest.

This will return zero when the end of the compressed data is reached.

[]{#readall}

proc readAll*(strm: DecompressStream, bufferSize: Natural = 1024 * 1024): string

Decompresses all data from the stream and returns it.

[]{#close-decompressstream}

proc close*(strm: DecompressStream)

Closes the decompression stream. If the syncClose field is true, then the underlying stream is also closed, otherwise it will remain open.

Templates

[]{#withdecompressstream}

template withDecompressStream*(io, strmVar, forms: untyped)

Creates a DecompressStream bound to strmVar that will read compressed data from io, then executes forms inside of a block. This ensures that close() is called when the block exits.

Example, decompressing between streams:

import std/streams
import remizstd/[decompstream]

# Decompress a file stream into a string stream
var src = newFileStream("/path/to/compressed.zstd", fmRead)
var dest = newStringStream()
withDecompressStream(src, dio):
  dest.write(dio.readAll())

Example, decompressing to a string:

import std/streams
import remizstd/[decompstream]

# Decompress a file stream and print the result
var src = newFileStream("/path/to/compressed.zstd", fmRead)
withDecompressStream(src, dio):
  echo "Decompressed data:"
  echo dio.readAll()

[]{#withdecompressstream-buffer}

template withDecompressStream*(io, inputBufSize, strmVar, forms: untyped)

Creates a DecompressStream with a custom buffer size bound to strmVar that will read compressed data from io, then executes forms inside of a block. This ensures that close() is called when the block exits.
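A minimal in-memory round trip using the four-argument form might look like the following. The 4096-byte input buffer is an arbitrary size chosen for illustration.

```nim
import std/streams
import remizstd/[compstream, decompstream]

let original = "custom buffer size example"

# Compress into an in-memory stream
var packed = newStringStream()
withCompressStream(packed, cio):
  cio.syncClose = false # Keep packed open so its data can be read below
  cio.write(original)

# Decompress it again with a 4096-byte input buffer
var src = newStringStream(packed.data)
var restored = ""
withDecompressStream(src, 4096, dio):
  restored = dio.readAll()
```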

[]{#withdecompressstream-buffer}

template withDecompressStream*(io, dict, inputBufSize, strmVar, forms: untyped)

Creates a DecompressStream with a custom dictionary and buffer size bound to strmVar that will read compressed data from io, then executes forms inside of a block. This ensures that close() is called when the block exits.