RSConf - A data format somewhere between JSON and hjson
Author: Remilia Scarlet
Version: 1.1
Last Updated: 25 September 2025
Introduction
RSConf has a few more features than JSON that make it nicer for config files, such as comments, not needing commas, and unquoted keys. But it's less strict and has fewer features than hjson to keep parsing simple.
It was created to fill a niche and scratch an itch for a nice, easy-to-use config format that doubles as a data serialization format. JSON is nice and (usually) easy to visually parse as a human, but is too strict with its syntax, and doesn't have comments. YAML is nice, but has far too many bells and whistles, leading to all sorts of strange edge cases, and a large space for possible security problems. TOML is ugly, and I just don't like it. Hjson is nice, but has a few too many unnecessary features for my liking, and is a bit too flexible. Thus, RSConf was born to fill the hole left by these other formats.
As a high-level overview, RSConf is like JSON, but:
- Keys don't need quoting if they do not contain spaces.
- Commas don't need to come after values/key-value pairs unless there are multiple values/key-value pairs on the same line.
- Toplevel braces ({ and }) can be omitted if the toplevel is an object.
- Integers and floats are differentiated.
- Integers can be in hex, octal, or binary. Floats can be in decimal or "scientific notation" (and accept either e or d for their character).
- The null value is called nil.
- Comments are allowed and use semicolons (;).
- Explicit minimum limits for integers and floats.
- Stricter whitespace.
It is also a lot like hjson, except:
- All strings are multi-line strings; no separate syntax.
- No quoteless strings.
- Integers can be in hex, octal, or binary. Floats can be in decimal or "scientific notation" (and accept either e or d for their character).
- The null value is called nil.
- Comments are allowed and use semicolons (;).
- No multi-line comments.
- Explicit minimum limits for integers and floats.
- Slightly stricter whitespace.
Some well-formatted example RSConf data:
    ;;;;
    ;;;; Example Document
    ;;;;
    some-object: {
      values: [ 1, 2, 3] ;; Recommended syntax for short arrays
      ;; Better syntax for larger arrays, or ones with long values.
      values-2: [
        "foo"
        "bar"
        "baz"
      ]
      name: "test"
      sub-obj: {
        id: 1, ;; Comma here is optional
        enabled: false ;; or "true"
        something-else: nil ;; Null values are represented with "nil"
      }
    }
Notation Used
This spec document uses Common Lisp-style hexadecimal/octal/binary numbers. This means #xA7 instead of 0xA7, #o25 instead of 025, and #b1011 instead of 0b1011.
Specification
The name of the format is RSConf, with that capitalization. It is pronounced "Arr-Ess-Konf" and stands for Remilia Scarlet's Config Format.
Encoding
Files MUST be UTF-8 encoded without a byte-order mark. No other encoding is valid. The two recommended file extensions are .rsconf and .rsc.
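As an illustration of the encoding rule, a loader could reject a byte-order mark and invalid UTF-8 before any parsing starts. This is only a sketch; the function name decode_rsconf is hypothetical and not part of the specification.

```python
UTF8_BOM = b"\xef\xbb\xbf"


def decode_rsconf(data: bytes) -> str:
    """Decode raw file bytes, rejecting a byte-order mark and invalid UTF-8.

    Hypothetical helper: the spec only states the rule, not an API.
    """
    if data.startswith(UTF8_BOM):
        raise ValueError("RSConf files must not start with a byte-order mark")
    # errors="strict" raises UnicodeDecodeError on any invalid UTF-8 sequence.
    return data.decode("utf-8", errors="strict")
```

UnicodeDecodeError is a subclass of ValueError, so a caller can catch both failure modes with a single except clause.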
The following Unicode code points are considered whitespace (note the absence of #x0D, Return): #x0A (newline), #x20 (space), #x09 (tab), #x200B (zero width space), #xA0 (non-breaking space), #x2007 (figure space), #x1680 (ogham space mark), #x2000 (en quad), #x2001 (em quad), #x2002 (en space), #x2003 (em space), #x2004 (three-per-em space), #x2005 (four-per-em space), #x2006 (five-per-em space), #x2008 (punctuation space), #x2009 (thin space), #x200A (hair space), #x202F (narrow no-break space), #x205F (medium mathematical space), #x3000 (ideographic space).
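The whitespace list above could be kept as a lookup set. The following sketch uses Python notation (0x0A rather than #x0A); the names RSCONF_WHITESPACE and is_rsconf_whitespace are illustrative only.

```python
# Code points the spec lists as whitespace. #x0D (return) is deliberately absent.
RSCONF_WHITESPACE = frozenset({
    0x0A, 0x20, 0x09, 0x200B, 0xA0, 0x2007, 0x1680,
    0x2000, 0x2001, 0x2002, 0x2003, 0x2004, 0x2005,
    0x2006, 0x2008, 0x2009, 0x200A, 0x202F, 0x205F, 0x3000,
})


def is_rsconf_whitespace(ch: str) -> bool:
    """Return True if the single character ch is RSConf whitespace."""
    return ord(ch) in RSCONF_WHITESPACE
```

Python's own str.isspace() is not a substitute here: it treats the return character as whitespace and does not treat the zero width space as whitespace, which is the opposite of what the list above says.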
Line Endings
Line endings must be "UNIX-Style", meaning only the newline character (UTF-8 character #x0A). Any return character (#x0D) or page character (#x0C) outside of a string must raise an error and must not be counted as whitespace. All other whitespace characters in UTF-8 are counted as whitespace and ignored outside of strings.
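A possible pre-scan for this rule is sketched below. It assumes that a semicolon outside a string always starts a comment and that a backslash inside a string escapes the next character; the function name is hypothetical.

```python
def check_control_characters(text: str) -> None:
    """Raise ValueError if #x0D or #x0C appears anywhere outside a string."""
    in_string = escaped = in_comment = False
    for offset, ch in enumerate(text):
        if in_string:
            # Inside a string anything goes; only track where the string ends.
            if escaped:
                escaped = False
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                in_string = False
            continue
        if ch in "\r\f":
            raise ValueError(
                f"forbidden character U+{ord(ch):04X} at offset {offset}"
            )
        if in_comment:
            # A comment runs to the next newline; quotes inside it mean nothing.
            in_comment = ch != "\n"
        elif ch == ";":
            in_comment = True
        elif ch == '"':
            in_string = True
```

Comments count as "outside of a string" in this sketch, so a return character inside a comment is also rejected.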
Comments
There are only single-line comments, which always start with one or more semicolons and continue to a newline (#x0A). Lisp-style comments, where the number of semicolons varies with the depth, are recommended. Comments MUST NOT contain parsing directives or similar. Conforming implementations MUST NOT allow parsing directives, or they cannot claim to conform to the RSConf specification, nor claim to support RSConf. RSConf explicitly disallows any sort of data or metadata within comments that can be used during or after parsing. Comments are there for human eyes only.
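Since comments carry no data, a parser is free to drop them before doing anything else. The sketch below shows one way to do that; it assumes a semicolon outside a string always starts a comment, and strip_comments is a hypothetical name.

```python
def strip_comments(text: str) -> str:
    """Return text with every comment removed; strings are left untouched."""
    out = []
    in_string = escaped = in_comment = False
    for ch in text:
        if in_comment:
            if ch == "\n":
                in_comment = False
                out.append(ch)  # keep the newline, it still ends the line
            continue
        if in_string:
            out.append(ch)
            if escaped:
                escaped = False
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                in_string = False
            continue
        if ch == ";":
            in_comment = True
            continue
        if ch == '"':
            in_string = True
        out.append(ch)
    return "".join(out)
```

Keeping the newline that ends each comment matters, because the rules for commas and keys depend on what sits on the same line.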
Scalar Values
There are five types of scalar values: integers, floats, strings, booleans, and the null value.
Integer values can be in decimal (12345), hexadecimal (#xA7, #xDEADBEEF), octal (#o64, #o23), or binary (#b1100101, #b1011). They start with a digit, or a # to indicate a specific radix, and continue to the end of the line, a comma (#x2C), a comment character, or other whitespace.
Implementing parsers must support integer values that can be represented as 64-bit signed integers. Greater limits are allowed.
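Converting an already-isolated integer token might look like the sketch below. It assumes lowercase radix prefixes only and does not handle a leading sign, since the text above does not describe one; parse_integer is a hypothetical name.

```python
import re

# Allowed digits and numeric base for each radix prefix.
_RADIX = {
    "#x": (re.compile(r"[0-9A-Fa-f]+"), 16),
    "#o": (re.compile(r"[0-7]+"), 8),
    "#b": (re.compile(r"[01]+"), 2),
}
_DECIMAL = re.compile(r"[0-9]+")


def parse_integer(token: str) -> int:
    """Convert one RSConf integer token to an int."""
    prefix = token[:2]
    if prefix in _RADIX:
        pattern, base = _RADIX[prefix]
        digits = token[2:]
    else:
        pattern, base = _DECIMAL, 10
        digits = token
    # Validate first: int() alone would also accept "_", signs and spaces.
    if not pattern.fullmatch(digits):
        raise ValueError(f"invalid integer: {token!r}")
    return int(digits, base)
```

Python integers are arbitrary precision, so this sketch exceeds the 64-bit signed minimum without extra work.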
Floats can only use decimal or "scientific notation". Decimal format starts with one or more digits, then continues until a period (character #x2E) is reached. After that may come additional decimal digits. The float continues to the end of the line, a comma, a comment character, or other whitespace.
Floats can use "scientific notation" in the form X.Ye+Z or X.Ye-Z or X.YeZ
if desired. The letter d can be used in place of e as well, and they are
case-insensitive. Implementing parsers must support values that can be
represented with at least 50 bits of precision and an 8-bit exponent; larger
limits are allowed.
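A float token could be converted along these lines. The sketch assumes the period is mandatory (so 1e5 without a period is rejected) and that no leading sign is allowed, neither of which the text above settles explicitly; parse_float is a hypothetical name.

```python
import re

# Digits, a period, optional digits, then an optional e/d exponent.
_FLOAT = re.compile(r"[0-9]+\.[0-9]*(?:[eEdD][+-]?[0-9]+)?")


def parse_float(token: str) -> float:
    """Convert one RSConf float token to a Python float."""
    if not _FLOAT.fullmatch(token):
        raise ValueError(f"invalid float: {token!r}")
    # Python's float() only understands "e", so map "d"/"D" onto it.
    return float(token.replace("d", "e").replace("D", "e"))
```

A Python float is an IEEE 754 double with a 53-bit significand and an 11-bit exponent, which is above the minimum limits stated above.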
Deserialization of RSConf data can still place stricter or looser restrictions on the integer and float values as needed. For example, deserialization code can place a constraint on an integer key so that it must contain an 8-bit unsigned integer. But parsers must allow for values up to at least the minimum limits described above.
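Such a constraint belongs to the code that consumes the parsed data, not to the parser. A minimal sketch of the 8-bit unsigned example, with a hypothetical helper name:

```python
def require_uint8(value: object) -> int:
    """Check that an already-parsed value fits an 8-bit unsigned integer."""
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(value, bool) or not isinstance(value, int):
        raise TypeError(f"expected an integer, got {type(value).__name__}")
    if not 0 <= value <= 0xFF:
        raise ValueError(f"{value} does not fit in 8 unsigned bits")
    return value
```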
String values must be quoted. They start with a double quote (character #x22) and continue to the next unescaped double quote. Newlines are allowed between the quotes, which means all strings are "multi-line" strings.
The backslash character (\, #x5C) is used for escaping characters in a string.
Double quotes that appear within strings must be escaped, e.g. the string
"hello \"world\"".
Unicode characters may optionally be escaped with \u{X}, where X is always
in hexadecimal and is the Unicode code point. It must be at least one hex digit
long.
Given this, the only special escapes that do anything within strings are for
double-quotes (\") and for Unicode characters (\u{X}). All other escapes
simply produce the character they escape, e.g. "foo\abar" results in the same
string as "fooabar".
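The escape rules can be sketched as a single substitution over the text between the quotes. This assumes the surrounding double quotes were already removed, and that a \u not followed by {hex digits} falls under the "all other escapes" rule and yields a plain u; the spec text does not say whether that case should be an error instead. The name unescape is hypothetical.

```python
import re

# Either \u{HEX}, or a backslash followed by any one character.
_ESCAPE = re.compile(r"\\(?:u\{([0-9A-Fa-f]+)\}|(.))", re.DOTALL)


def unescape(body: str) -> str:
    """Resolve escapes in the text between a string's double quotes."""
    def replace(match: re.Match) -> str:
        if match.group(1) is not None:
            # chr() raises ValueError for values above #x10FFFF.
            return chr(int(match.group(1), 16))
        return match.group(2)

    return _ESCAPE.sub(replace, body)
```

For example, unescape applied to the characters foo\abar gives fooabar, and \u{41} gives A.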
Booleans are the symbols true or false; these are case-sensitive and must be
lower-case.
The null value is the symbol nil. This is also case-sensitive and must be
lower-case.
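Because these symbols are case-sensitive, an exact dictionary lookup is enough to recognize them. A small sketch with hypothetical names, mapping nil to Python's None:

```python
_SYMBOLS = {"true": True, "false": False, "nil": None}


def parse_symbol(token: str):
    """Map the symbols true, false and nil to Python values."""
    if token not in _SYMBOLS:
        # "True", "NIL" and friends land here: the match is case-sensitive.
        raise ValueError(f"unknown symbol: {token!r}")
    return _SYMBOLS[token]
```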
Composite Values
There are two types of composite values in RSConf: objects and arrays.
An object starts with an open brace ({, character #x7B) and ends with a close
brace (}, character #x7D). Within these braces are zero or more key-value
pairs.
Arrays start with an open bracket ([, character #x5B) and end with a close
bracket (], character #x5D). Within these brackets are zero or more values
(never key-value pairs).
Keys and Values
Keys can be quoted or unquoted. For an unquoted key, the key name starts with a non-whitespace character and continues until whitespace or a colon (character
#x3A) is reached. A newline (#x0A) is not permitted in an unquoted key between the key name and a colon; a colon must appear on the same line as the unquoted key name it terminates. All whitespace before an unquoted key name is ignored. All whitespace except for the newline (#x0A) after the key name and before the colon is ignored.
Quoted keys are treated like string scalars, and thus all characters within them are accepted. All whitespace before the opening double-quote is ignored. All whitespace except for newlines (#x0A) after the closing double-quote is ignored. A quoted key cannot have a newline character between its closing quote and the colon.
Key names cannot be an empty string (""). Key names are always
case-sensitive.
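The rules for unquoted keys can be sketched as a small scanner. It handles unquoted keys only (a key starting with a double quote would go through string parsing instead), and the names _WS and read_unquoted_key are hypothetical.

```python
# RSConf whitespace code points from the Encoding section (#x0D is not one).
_WS = {0x0A, 0x20, 0x09, 0x200B, 0xA0, 0x1680, 0x202F, 0x205F, 0x3000,
       *range(0x2000, 0x200B)}


def read_unquoted_key(text: str, pos: int = 0) -> tuple[str, int]:
    """Return (key, index just past the colon) for an unquoted key."""
    n = len(text)
    # Whitespace before the key name, newlines included, is ignored.
    while pos < n and ord(text[pos]) in _WS:
        pos += 1
    start = pos
    # The key name runs until whitespace or a colon.
    while pos < n and text[pos] != ":" and ord(text[pos]) not in _WS:
        pos += 1
    key = text[start:pos]
    if not key:
        raise ValueError("key names cannot be empty")
    # Between the name and the colon, whitespace other than a newline is ignored.
    while pos < n and text[pos] != "\n" and ord(text[pos]) in _WS:
        pos += 1
    if pos >= n or text[pos] != ":":
        raise ValueError(f"expected ':' on the same line as key {key!r}")
    return key, pos + 1
```

The returned index points at the first character after the colon, which is where value parsing would resume.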
It is recommended that you use unquoted keys whenever possible. Lowercase
skewer-case names (e.g. key-name, this-is-a-key) that don't need quoting are
highly recommended, but not required. Underscores in key names are allowed, but
highly discouraged.
The value associated with the key comes after the colon character. All whitespace after the colon and before the value is ignored.
Values, whether as part of a key-value pair or as a value within an array, may optionally be terminated with a comma (character #x2C). This comma MUST be on the same line as the value it terminates; all whitespace except for a newline (character #x0A) is ignored between the value and a comma.
As an alternative, multiple key-value pairs and values may exist on the same line in both objects and arrays, but MUST be separated by commas. If no additional keys come after the final value, then the trailing comma (character
#x2C) is optional.
Empty values, and multiple commas in succession (regardless of whitespace between them), are not allowed and must signal an error.
Documents
The toplevel is called the "document". A document must consist of a single
object or a single array. When a document consists of an object, then the
toplevel braces ({ and }) may optionally be omitted and it is assumed all
key-value pairs are part of the toplevel object.
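One way a parser might decide which form a document takes is to look at the first character that is neither whitespace nor part of a comment. In this sketch an empty document is treated as an implicit (empty) object, which is an assumption the text above does not confirm; all names are hypothetical.

```python
# RSConf whitespace code points from the Encoding section (#x0D is not one).
_WS = {0x0A, 0x20, 0x09, 0x200B, 0xA0, 0x1680, 0x202F, 0x205F, 0x3000,
       *range(0x2000, 0x200B)}


def toplevel_kind(text: str) -> str:
    """Classify a document as "object", "array" or "implicit-object"."""
    pos, n = 0, len(text)
    while pos < n:
        ch = text[pos]
        if ch == ";":
            # Skip the comment up to and including its newline.
            newline = text.find("\n", pos)
            pos = n if newline == -1 else newline + 1
        elif ord(ch) in _WS:
            pos += 1
        else:
            break
    if pos < n and text[pos] == "{":
        return "object"
    if pos < n and text[pos] == "[":
        return "array"
    # Anything else is read as key-value pairs of an object without braces.
    return "implicit-object"
```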
Duplicate keys (more than one key with the same name at the same level) are allowed, but later keys override earlier keys. It's highly recommended that a warning be presented in some way when duplicate keys at the same level are detected, but this is not required.
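The duplicate-key behaviour could be implemented when an object is assembled from its parsed pairs. The sketch below uses Python's warnings module for the recommended (but optional) warning; build_object is a hypothetical name.

```python
import warnings


def build_object(pairs):
    """Build a dict from (key, value) pairs parsed at one level."""
    result = {}
    for key, value in pairs:
        if key in result:
            # The warning is recommended by the spec, not required.
            warnings.warn(
                f"duplicate key {key!r}: the later value replaces the earlier one",
                stacklevel=2,
            )
        result[key] = value  # later keys override earlier ones
    return result
```

Because each nested object is built by its own call, a key that repeats at a different depth does not trigger the warning.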
Changelog
- v1.1, 25 Sep 2025: Changed how escapes work so that they match other languages.