RSConf - A data format somewhere between JSON and hjson

Author: Remilia Scarlet
Version: 1.1
Last Updated: 25 September 2025

Introduction

RSConf has a few more features than JSON that make it nicer for config files, such as comments, not needing commas, and unquoted strings. But it's less strict and has fewer features than hjson to keep parsing simple.

It was created to fill a niche and scratch an itch for a nice, easy-to-use config format that doubles as a data serialization format. JSON is nice and (usually) easy to visually parse as a human, but is too strict with its syntax, and doesn't have comments. YAML is nice, but has far too many bells and whistles, leading to all sorts of strange edge cases, and a large space for possible security problems. TOML is ugly, and I just don't like it. Hjson is nice, but has a few too many unnecessary features for my liking, and is a bit too flexible. Thus, RSConf was born to fill the hole left by these other formats.

As a high-level overview, RSConf is like JSON, but:

It is also a lot like hjson, except:

Some well-formatted example RSConf data:

;;;;
;;;; Example Document
;;;;

some-object: {
  values: [ 1, 2, 3] ;; Recommended syntax for short arrays

  ;; Better syntax for larger arrays, or ones with long values.
  values-2: [
    "foo"
    "bar"
    "baz"
  ]

  name: "test"
  sub-obj: {
    id: 1, ;; Comma here is optional
    enabled: false ;; or "true"
    something-else: nil ;; Null values are represented with "nil"
  }
}

Notation Used

This spec document uses Common Lisp-style hexadecimal/octal/binary numbers. This means #xA7 instead of 0xA7, #o25 instead of 025, and #b1011 instead of 0b1011.

Specification

The name of the format is RSConf, with that capitalization. It is pronounced "Arr-Ess-Konf" and stands for Remilia Scarlet's Config Format.

Encoding

Files MUST be UTF-8 encoded without a byte order mark (BOM). No other encoding is valid. The two recommended file extensions are .rsconf and .rsc.

The following UTF-8 codepoints are considered whitespace (note the absence of #x0D, Return): #x0A (newline), #x20 (space), #x09 (tab), #x200B (zero width space), #xA0 (non-breaking space), #x2007 (figure space), #x1680 (ogham space mark), #x2000 (en quad), #x2001 (em quad), #x2002 (en space), #x2003 (em space), #x2004 (three-per-em space), #x2005 (four-per-em space), #x2006 (five-per-em space), #x2008 (punctuation space), #x2009 (thin space), #x200A (hair space), #x202F (narrow no-break space), #x205F (medium mathematical space), #x3000 (ideographic space).
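As a rough illustration, the whitespace list above can be captured as a lookup set. The names RSCONF_WHITESPACE and is_rsconf_whitespace are hypothetical and not part of the specification.

```python
# Code points the spec lists as whitespace. #x0D (return) is
# deliberately absent, as the spec points out.
RSCONF_WHITESPACE = frozenset({
    0x0A, 0x20, 0x09, 0x200B, 0xA0, 0x2007, 0x1680,
    0x2000, 0x2001, 0x2002, 0x2003, 0x2004, 0x2005, 0x2006,
    0x2008, 0x2009, 0x200A, 0x202F, 0x205F, 0x3000,
})


def is_rsconf_whitespace(ch: str) -> bool:
    """Return True if the single character ch is RSConf whitespace."""
    return ord(ch) in RSCONF_WHITESPACE
```

A hand-written set is used here instead of Python's str.isspace(), since that method also accepts #x0D and #x0C and does not accept #x200B.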

Line Endings

Line endings must be "UNIX-Style", meaning only the newline character (UTF-8 character #x0A). Any return character (#x0D) or page character (#x0C) outside of a string must raise an error and must not be counted as whitespace. All other whitespace characters in UTF-8 are counted as whitespace and ignored outside of strings.
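One possible way to enforce the line-ending rule is a pre-scan that tracks whether the current position is inside a string. This is only a sketch: check_line_endings is a hypothetical name, and treating every semicolon outside a string as the start of a comment is an assumption made to keep the example short.

```python
def check_line_endings(text: str) -> None:
    """Raise ValueError if #x0D or #x0C appears outside of a string."""
    in_string = False
    in_comment = False
    i = 0
    while i < len(text):
        ch = text[i]
        if in_string:
            if ch == "\\":
                i += 2  # skip the backslash and the character it escapes
                continue
            if ch == '"':
                in_string = False
        elif ch in "\r\f":
            # Comments are outside of strings, so they are checked too.
            raise ValueError(f"forbidden character #x{ord(ch):02X} at offset {i}")
        elif in_comment:
            if ch == "\n":
                in_comment = False
        elif ch == ";":  # assumption: any ';' outside a string starts a comment
            in_comment = True
        elif ch == '"':
            in_string = True
        i += 1
```

With this sketch, a return character inside a quoted string passes, while a document saved with Windows-style line endings is rejected at the first #x0D.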

Comments

There are only single-line comments, which always start with one or more semicolons and continue to a newline (#x0A). Lisp-style comments, where the number of semicolons varies with the depth, are recommended. Comments MUST NOT contain parsing directives or similar. Conforming implementations MUST NOT allow parsing directives, or they cannot claim to be conforming to the RSConf specification, nor claim to support RSConf. RSConf explicitly disallows any sort of data or metadata within comments that can be used during or after parsing. Comments are there for human eyes only.
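Because comments carry no data, a parser is free to discard them before doing anything else. The sketch below shows one way that could look; strip_comments is a hypothetical name, and it assumes that any semicolon outside a string begins a comment.

```python
def strip_comments(text: str) -> str:
    """Remove ';' comments that occur outside of strings, keeping newlines."""
    out = []
    in_string = False
    i, n = 0, len(text)
    while i < n:
        ch = text[i]
        if in_string:
            out.append(ch)
            if ch == "\\" and i + 1 < n:
                out.append(text[i + 1])  # keep the escaped character as-is
                i += 2
                continue
            if ch == '"':
                in_string = False
        elif ch == '"':
            in_string = True
            out.append(ch)
        elif ch == ";":
            # Skip to the newline, but leave the newline itself in place.
            while i < n and text[i] != "\n":
                i += 1
            continue
        else:
            out.append(ch)
        i += 1
    return "".join(out)
```

Semicolons inside strings are left alone, so a value such as "x;y" survives unchanged.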

Scalar Values

There are five types of scalar values: integers, floats, strings, booleans, and the null value.

Integer values can be in decimal (12345), hexadecimal (#xA7, #xDEADBEEF), octal (#o64, #o23), or binary (#b1100101, #b1011). They start with a digit, or with a # to indicate a specific radix, and continue to the end of the line, a comma (#x2C), a comment character, or other whitespace.

Implementing parsers must support integer values that can be represented as 64-bit signed integers. Greater limits are allowed.
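A minimal sketch of integer parsing under these rules follows. The function name parse_integer is hypothetical. The sketch assumes the radix letter is lowercase (as in the examples above), that hexadecimal digits may be in either case, and that no sign is accepted, since the text above does not mention one.

```python
_RADIX = {"#x": 16, "#o": 8, "#b": 2}


def parse_integer(token: str) -> int:
    """Parse an already-delimited RSConf integer token."""
    if token[:2] in _RADIX:
        base, digits = _RADIX[token[:2]], token[2:]
    else:
        base, digits = 10, token
    # int() on its own also accepts signs, underscores and surrounding
    # whitespace, so the digits are validated explicitly first.
    allowed = "0123456789abcdef"[:base]
    if not digits or any(c.lower() not in allowed for c in digits):
        raise ValueError(f"invalid integer: {token!r}")
    return int(digits, base)
```

Python integers are arbitrary precision, so the 64-bit signed minimum is met without extra work; a port to a fixed-width language would need an explicit range check.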

Floats can only use decimal or "scientific notation". Decimal format starts with a digit and continues until a period (character #x2E) is reached, which may be followed by additional decimal digits. The float continues to the end of the line, a comma, a comment character, or other whitespace.

Floats can use "scientific notation" in the form X.Ye+Z or X.Ye-Z or X.YeZ if desired. The letter d can be used in place of e as well, and they are case-insensitive. Implementing parsers must support values that can be represented with at least 50 bits of precision and an 8-bit exponent; larger limits are allowed.
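The float forms described above can be sketched with a regular expression. The name parse_float is hypothetical, and the sketch assumes that the period is mandatory (so 1e5 would not be a float) and that digits after the period are optional.

```python
import re

# digits, a period, optional digits, then an optional e/d exponent
_FLOAT_RE = re.compile(r"([0-9]+)\.([0-9]*)(?:[eEdD]([+-]?[0-9]+))?")


def parse_float(token: str) -> float:
    """Parse an already-delimited RSConf float token."""
    m = _FLOAT_RE.fullmatch(token)
    if m is None:
        raise ValueError(f"invalid float: {token!r}")
    whole, frac, exponent = m.groups()
    text = f"{whole}.{frac or '0'}"
    if exponent is not None:
        text += "e" + exponent  # normalise a 'd' exponent marker to 'e'
    return float(text)
```

Python's float is an IEEE 754 double, which exceeds the minimum precision and exponent width required above.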

Deserialization of RSConf data can still place stricter or looser restrictions on integer and float values as needed. For example, deserialization code can place a constraint on an integer key so that it must contain an 8-bit unsigned integer. But parsers must allow for values up to at least the minimum limits described above.

String values must be quoted. They start with a double quote (character #x22) and continue to another double quote. This means all strings are "multi-line" strings.

The backslash character (\, #x5C) is used for escaping characters in a string. Double quotes that appear within strings must be escaped, e.g. the string "hello \"world\"".

Unicode characters may optionally be escaped with \u{X}, where X is always in hexadecimal and is the Unicode code point. It must be at least one hex digit long.

Given this, the only special escapes that do anything within strings are for double quotes (\") and for Unicode characters (\u{X}). All other escapes simply produce the character they escape, e.g. "foo\abar" results in the same string as "fooabar".
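The escape rules can be sketched as a single substitution over the text between the quotes. The name unescape is hypothetical, and treating a \u that is not followed by {X} as a plain u is an assumption that follows from the "all other escapes" rule.

```python
import re

# Either \u{HEX} or a backslash followed by any single character.
_ESCAPE_RE = re.compile(r"\\(?:u\{([0-9A-Fa-f]+)\}|(.))", re.DOTALL)


def unescape(body: str) -> str:
    """Resolve escapes in the text between a string's double quotes."""
    def replace(m: re.Match) -> str:
        if m.group(1) is not None:
            # chr() raises for values beyond the Unicode range.
            return chr(int(m.group(1), 16))
        return m.group(2)  # every other escape yields the escaped character
    return _ESCAPE_RE.sub(replace, body)
```

Under these rules there is no \n or \t escape; a newline in a string is written as a literal newline, or as \u{A}.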

Booleans are the symbols true or false; these are case-sensitive and must be lower-case.

The null value is the symbol nil. This is also case-sensitive and must be lower-case.
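The three symbols map directly onto native values in most languages. A small sketch, using the hypothetical name parse_symbol:

```python
_SYMBOLS = {"true": True, "false": False, "nil": None}


def parse_symbol(token: str):
    """Map the case-sensitive symbols true, false and nil to Python values."""
    if token not in _SYMBOLS:
        raise ValueError(f"unknown symbol: {token!r}")
    return _SYMBOLS[token]
```

Because the lookup is exact, spellings such as True, NIL, or null are rejected.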

Composite Values

There are two types of composite values in RSConf: objects and arrays.

An object starts with an open brace ({, character #x7B) and ends with a close brace (}, character #x7D). Within these braces are zero or more key-value pairs.

Arrays start with an open bracket ([, character #x5B) and end with a close bracket (], character #x5D). Within these brackets are zero or more values (never key-value pairs).

Keys and Values

Keys can be quoted or unquoted. For an unquoted key, the key name starts with a non-whitespace character and continues until whitespace or a colon (character #x3A) is reached. A newline (#x0A) is not permitted in an unquoted key between the key name and a colon; a colon must appear on the same line as the unquoted key name it terminates. All whitespace before an unquoted key name is ignored. All whitespace except for the newline (#x0A) after the key name and before the colon is ignored.

Quoted keys are treated like string scalars, and thus all characters within them are accepted. All whitespace before the opening double quote is ignored. All whitespace except for newlines (#x0A) after the closing double quote is ignored. A quoted key cannot have a newline character between its closing quote and the colon.

Key names cannot be an empty string (""). Key names are always case-sensitive.
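To illustrate the unquoted-key rules, here is a sketch that reads one unquoted key and its colon from a given position. The name read_unquoted_key is hypothetical, quoted keys are not handled, and the sketch assumes that a key beginning with a double quote is always a quoted key.

```python
import re

# The whitespace code points from the Encoding section.
_WS = ("\n \t\u200b\u00a0\u2007\u1680\u2000\u2001\u2002\u2003\u2004"
       "\u2005\u2006\u2008\u2009\u200a\u202f\u205f\u3000")
_WS_NO_NL = _WS.replace("\n", "")

_KEY_RE = re.compile(
    "[" + _WS + "]*"                      # leading whitespace, newlines allowed
    "([^" + _WS + ':"][^' + _WS + ":]*)"  # the key name itself
    "[" + _WS_NO_NL + "]*:"               # same-line whitespace, then the colon
)


def read_unquoted_key(text: str, pos: int = 0) -> tuple[str, int]:
    """Return (key, index just past the colon) or raise ValueError."""
    m = _KEY_RE.match(text, pos)
    if m is None:
        raise ValueError(f"no unquoted key at offset {pos}")
    return m.group(1), m.end()
```

The key group needs at least one character, so an empty key name can never match, and a newline between the name and the colon makes the match fail.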

It is recommended that you use unquoted keys whenever possible. Lowercase skewer-case names (e.g. key-name, this-is-a-key) that don't need quoting are highly recommended, but not required. Underscores in key names are allowed, but highly discouraged.

The value associated with the key comes after the colon character. All whitespace after the colon and before the value is ignored.

Values, whether as part of a key-value pair or as a value within an array, may optionally be terminated with a comma (character #x2C). This comma MUST be on the same line as the value it terminates; all whitespace except for a newline (character #x0A) is ignored between the value and a comma.

As an alternative, multiple key-value pairs and values may exist on the same line in both objects and arrays, but MUST be separated by commas. If no additional keys come after the final value, then the trailing comma (character #x2C) is optional.

Empty values, and multiple commas in succession (regardless of whitespace between them), are not allowed and must signal an error.

Documents

The toplevel is called the "document". A document must consist of a single object or a single array. When a document consists of an object, then the toplevel braces ({ and }) may optionally be omitted and it is assumed all key-value pairs are part of the toplevel object.

Duplicate keys (more than one key with the same name at the same level) are allowed, but later keys override earlier keys. It's highly recommended that a warning be presented in some way when duplicate keys at the same level are detected, but this is not required.
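The duplicate-key behaviour could be implemented along these lines when an object is assembled from its parsed pairs. build_object is a hypothetical name, and using Python's warnings module is just one way to surface the recommended warning.

```python
import warnings


def build_object(pairs):
    """Build a dict from (key, value) pairs; later duplicates win."""
    result = {}
    for key, value in pairs:
        if key in result:
            # The warning is recommended, not required.
            warnings.warn(f"duplicate key {key!r}: later value overrides earlier one")
        result[key] = value
    return result
```

Note that an overridden key keeps its original position in the resulting dict; only its value changes.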

Changelog