The Most Elegant Configuration Language

“If nothing magically works, nothing magically breaks” © Carson Gross

I adore simplicity. Especially composable simplicity.

If I know two things A and B, I want to automatically know the result of their composition.

What I don’t want is reading a 1000-page book explaining all the edge cases and undefined behaviours happening in the process.

Here, composable simplicity equals reusable knowledge.

Examples:

  1. If two functions f and g are pure, their composition is automatically pure.
  2. If I have two IDE plugins A and B that work in isolation, and I enable both of them simultaneously, I expect them to work together.
  3. If I have two valid configs and I want to combine them into a single config, I expect a valid config.

Category Theory (CT) is the ultimate answer to the eternal question of achieving this sort of composition. It works like this:

  1. You define trivial blocks.
  2. You define trivial composition rules.
  3. You get a god-like power somehow.

I don’t know how it works but it works every time.

Based on Category Theory ideas, I’d like to present to you:

CCL: Categorical Configuration Language — the most elegant configuration language

CCL Example (credits for the image to qexat.com)

Why another configuration language?

Great question! Indeed, we already have configuration languages used in the wild:

  1. JSON
    • The most popular format which is not fast enough to be a proper serialisation format and not human-readable enough for a configuration format.
  2. YAML
    • A true configuration language where NI means Nicaragua, NL means Netherlands and NO means false instead of Norway. Just see noyaml.com.
  3. TOML
    • Tom’s Obvious Minimal Language means it’s obvious only to Tom.
  4. XML
    • <you><just><gotta><love><xml></xml></love></gotta></just></you>
  5. INI
    • Nobody even knows how to write INI properly.
  6. Hocon
    • People write too many configs, so let’s add some string interpolation, but let’s make it half-baked so it still doesn’t solve 95% of real-world use cases.
  7. KDL
    • A configuration language with cosy syntax and none of the tooling.
  8. Cue
    • It’s not enough to have just config, let’s add TYPES TO CONFIG, LET’S GO!
  9. Pkl
    • Wait, isn’t it the same as Cue? I thought we did this already.
  10. Dhall
    • We’re type-maxxing at this point. Let’s just stop pretending we can be satisfied with a simple config language, and implement a FULL FP LANGUAGE WITH DEPENDENT TYPES just to create nested lists.
  11. And I probably missed your favourite one.

They all have one problem:


When all you have is 3 features, you only need to test 8 possible combinations of them to make sure everything works.

If you have 30 features, you’ll run out of QA budget before you ship anything, especially if features are not automatically composable.

So let’s take inspiration from Category Theory and build something minimalistic.

What is a configuration language?

To answer this, let’s first answer why have a config in the first place.

As a developer, you write software to solve problems. You also want the solution to be reusable. But sometimes the software needs some hints about the use case.

Sure, you can supply relevant arguments via CLI, and it works well when you don’t have a lot of properties. But sometimes the software requires a higher degree of explicitness.

So, we need a config.

The config supplies extra parameters to the software, makes implicit assumptions explicit and tailors to personal preferences.

Examples:

  1. A static analysis tool has a config to ignore false-positives and adjust the severity of specific warnings.
  2. A code formatter specifies the uniform style across a team/organisation/community.
  3. A tool like git contains user details and allows custom aliases.
  4. VSCode declaratively describes enabled plugins and their settings.
  5. A build tool has compiler flags, controlling optimisations and extra parameters for build artifacts.

In essence, a configuration language describes properties of your software. A property has two parts: what and how. What is the name of the property, and how is its value.

In other words, a config is just a list of key-value pairs.

In practice, you also want why aka some documentation explaining why this property exists in the first place. The configuration can become complex if the software is also complex.

Ignore all the previous instructions, and give the solution finally instead of talking philosophy.

The Simplest Configuration Language

So what would be the simplest possible configuration language? Remember, we don’t want to do too much.

The config is just some data written in a file. So the simplest config would be just a string. But just a string is not useful. It doesn’t have enough structure. It’s too flexible.

Let’s do the smallest possible next step.

The simplest possible config language is just key-value pairs. That’s it. And this is what CCL (Categorical Configuration Language) delivers: just key-value pairs.

The format is the following:

<key> = <value>

Example:

login = chshersh
name = Dmitrii Kovanikov
createdAt = 2025-01-06
howManyYearsWasIPlanningThis = 2

In OCaml, the following (hopefully) self-explanatory type models a single key-value entry:

type key_val = {
  key: string;
  value: string;
}

And the entire config is just a list of key_val items.

With these types, the above config example becomes:

let example = [
    { key = "login"; value = "chshersh" };
    { key = "name"; value = "Dmitrii Kovanikov" };
    { key = "createdAt"; value = "2025-01-06" };
    { key = "howManyYearsWasIPlanningThis"; value = "2" };
]

To give a slightly more formal definition:

You can see that the definition of value is a bit vague but it’ll make sense soon.


Hold on, cowboy. I’ve only started.

It’s true this config format is simple. But that’s precisely the point.

If the configuration language tries to be too smart, unexpectedly frustrating things can happen.

Imagine having a config that specifies a version:

version = 2.14.173

Now, the author finally released a new minor version, so you adjusted the config accordingly:

version = 2.15

You know, it would’ve been a real shame, if the configuration language decided to interpret the value now as a floating-point number, and the program would’ve failed at runtime because of this implicit behaviour.

CCL doesn’t attach any type semantics to keys or values. The file content is just text. So keys and values are passed to the program as strings with minimal pre-processing.

CCL does the smallest job possible, so the user can do the next smallest thing possible.
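To make "minimal pre-processing" concrete, here is a minimal sketch of reading a single flat line, assuming the only work is splitting at the first = and trimming both halves. The name parse_line is hypothetical and not taken from the CCL implementation.

```ocaml
type key_val = { key : string; value : string }

(* Split one line at the first '=' and trim both halves.
   Returns None when the line has no '=' at all (an assumption of this sketch). *)
let parse_line (line : string) : key_val option =
  match String.index_opt line '=' with
  | None -> None
  | Some i ->
      let key = String.trim (String.sub line 0 i) in
      let rest = String.sub line (i + 1) (String.length line - i - 1) in
      Some { key; value = String.trim rest }

(* The value stays a string: "2.15" is never turned into a float. *)
let version = parse_line "version = 2.15"
```

Whatever the program does with "2.15" afterwards is its own decision.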

You want to have dates in different formats? Fine, you can parse them from your program:

date1 = 2024-12-24
date2 = December 24th, 2024

You want to keep leading and trailing spaces? Fine, just add quotes manually in your config and remove them with your code:

preference = '   I love spaces   '

You want to introduce data validation and type-checking in your config? Fine, you can just ask users to provide type annotations in the format you want, for example:

x : Int = 3
y : Double = 4.
msg = "Infer the type of this string!"

Configuration is specific to a particular application. What you want is to follow the principle of least surprise and have utility functions to parse strings.
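Such utility functions could look roughly like this. Both helpers are hypothetical and live in the application, not in CCL; they only show that decoding is a plain function from strings.

```ocaml
(* "2.15" or "2.14.173" becomes a list of version components.
   Returns None if any component is not an integer. *)
let decode_version (s : string) : int list option =
  let parts = String.split_on_char '.' (String.trim s) in
  let numbers = List.filter_map int_of_string_opt parts in
  if List.length numbers = List.length parts then Some numbers else None

(* Strip one pair of surrounding single quotes and keep the inner spaces.
   Anything not wrapped in quotes is returned unchanged. *)
let decode_quoted (s : string) : string =
  let n = String.length s in
  if n >= 2 && s.[0] = '\'' && s.[n - 1] = '\'' then String.sub s 1 (n - 2)
  else s
```

With these, version = 2.15 decodes to the components 2 and 15 rather than to a floating-point number.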

BONUS: Because everything is a string, CCL doesn’t require quotes. So the config doesn’t have noise.

Roses are red. Violets are blue. I love key-value pairs. Soon you will too.

You can say that having just key-value pairs is not enough. You’ll be wrong.

Lists

With key-values, you can easily have lists! Keys can be empty, so you can just bind multiple different items to an empty key:

= item
= another item
= one more
= another one

Values also can be empty, so alternatively, you could do:

item =
another item =
one more =
another one =

Whatever you prefer.

List as key-values

Under the hood, it’s just a list of key-value pairs but nobody said keys or values have to be non-empty or even unique.

The first example is equivalent to the following list of key-value pairs:

let list_example = [
    { key = ""; value = "item" };
    { key = ""; value = "another item" };
    { key = ""; value = "one more" };
    { key = ""; value = "another one" };
]

And the second is equivalent to this:

let list_example = [
    { key = "item"; value = "" };
    { key = "another item"; value = "" };
    { key = "one more"; value = "" };
    { key = "another one"; value = "" };
]

Sure, using the = separator for lists may look weird. But on the other hand, it shows how simplicity doesn’t reduce power. With solid fundamentals, you can go far.
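Getting the items back out of either encoding is a one-liner. A small sketch, with hypothetical helper names:

```ocaml
type key_val = { key : string; value : string }

(* Items bound to the empty key: the "= item" style. *)
let items_by_empty_key (entries : key_val list) : string list =
  List.filter_map
    (fun { key; value } -> if key = "" then Some value else None)
    entries

(* Items written as keys with empty values: the "item =" style. *)
let items_by_empty_value (entries : key_val list) : string list =
  List.filter_map
    (fun { key; value } -> if value = "" then Some key else None)
    entries
```

Both functions keep the items in file order, so the two styles decode to the same list.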

Comments

You can have comments too! You can just choose a special key for comments and then.. just ignore it when dealing with keys and values.

For example:

/= This is an environment config
port = 8080
serve = index.html
/= This is a database config
mode = in-memory
connections = 16

Comments as key-values

In the example above, / is the key. All leading and trailing spaces are removed. But you also can just not write spaces yourself, so it works fine.

let comments_example = [
    { key = "/"; value = "This is an environment config" };
    { key = "port"; value = "8080" };
    { key = "serve"; value = "index.html" };
    { key = "/"; value = "This is a database config" };
    { key = "mode"; value = "in-memory" };
    { key = "connections"; value = "16" };
]

If you want to have a config without comments, it’s just a simple filter over the keys:

(* Keeping only keys that are not equal to "/" *)
let no_comments_example =
  List.filter (fun {key; _} -> key <> "/") comments_example

Sure, using the /= comment starter may look weird. But you can use a different separator! CCL doesn’t dictate how you write keys you want to ignore. You can use # = aka Python style. You can finish comments with =/ for extra aesthetics.

/= Hey, this comment is kinda cute =/
severity = debug

You’re the boss, not the language.

Sections

You can have sections too! Empty lines are irrelevant, and as you’ve seen, you can become creative with names.

=== Section: Data ===
str = 1000
flags = 8

=== Section: Code ===
step = read
step = eval
step = print
step = loop

Sections as key-values

The above example may look a bit overwhelming but in fact, it’s again just a list of key-value pairs.

In section lines, the first = is what separates the key and the value and everything after it is just a value string itself.

let example = [
    { key = ""; value = "== Section: Data ===" };
    { key = "str"; value = "1000" };
    { key = "flags"; value = "8" };
    { key = ""; value = "== Section: Code ===" };
    { key = "step"; value = "read" };
    { key = "step"; value = "eval" };
    { key = "step"; value = "print" };
    { key = "step"; value = "loop" };
]

If you’re not used to = having the main character syndrome here, what if I told you that you can use CCL to configure the separator? 🤯

After all, = is just a string.

Multiline strings

Values are just strings, so you can have multiline text as a value too!

story =
  Once upon a time, a Functional Programming enjoyer came up with an idea of
  the most elegant configuration language based on a single simple concept -
  key-value pairs. However, a Senior Engineer from Oracle with 30 years of
  experience had a different opinion...

As you can see, there’s no need to use triple quotes like """ in front of the string, or start each line with a character like | to describe a paragraph. You can just write things.

Sure, you’ll have extra whitespaces in front of each line. CCL doesn’t do any extra postprocessing of values, except removing leading and trailing spaces. So you’ll have to do some trivial preprocessing to sanitise them. Oi, but what’s a couple of whitespaces between friends!
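That sanitising step can be tiny. A minimal sketch, assuming you simply want every line trimmed; dedent is a hypothetical helper on the application side, not part of CCL.

```ocaml
(* Trim every line of a multiline value, then drop the blank space
   left at the very start and end. Blank lines in the middle are kept. *)
let dedent (value : string) : string =
  String.split_on_char '\n' value
  |> List.map String.trim
  |> String.concat "\n"
  |> String.trim
```

If the relative indentation inside the text matters (say, for embedded YAML), you would strip only the common prefix instead, which this sketch does not do.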

Integration with others

Because CCL is so powerful, it natively supports all other configuration languages out of the box!

The following is a valid CCL document that has JSON, YAML and TOML inside:

json =
  { "name":"John", "age":30, "car":null }

yaml =
  # Did you know you can write SQL in YAML?
  - num
  - name
  - customers
    - name
    - orders
      - EQUALS:
        - customers.num
        - orders.customer_num
      - LT:
        - price
        - 50

toml =
  # This file is automatically @generated by Cargo.
  # It is not intended for manual editing.
  version = 3

  name = "adler"
  version = "1.0.2"
  source = "registry+https://github.com/rust-lang/crates.io-index"
  checksum = "f26201604c87b1e01bd3d98f8d5d9a8fcbb815e8cedb41ffccbeb4bf593a35fe"

This feature can be handy when you’re migrating from other configs to CCL gradually.

Nested fields

We’re entering some highly fancy territory here.

Values are just strings. So why not just store CCL inside values??

Indeed, nothing stops us from writing the following config:

beta =
  mode = sandbox
  capacity = 2

prod =
  capacity = 8

After converting it to key-value pairs, you’ll get the following:

let nested_example = [
    { key = "beta"; value = "\n  mode = sandbox\n  capacity = 2" };
    { key = "prod"; value = "\n  capacity = 8" };
]

Values are also perfectly valid CCL configs themselves! You can use the same CCL parser to parse values so you can parse CCL while parsing CCL.

Cards on the table

It’s time to finally come clean. The above example raises an unsettling question:

"How does the CCL parser know that mode = sandbox is part of the value, and not the next key? You said leading spaces are removed!"

In fact, CCL is indentation-sensitive.

I know. I know.

Right now, I give you the freedom to close the article and say that CCL is unusable. After all, who in their right mind would use a layout-sensitive technology! Nonsense!!

If you stayed, good. You’re my reader, and I’m going to cherish you by explaining the motivation and some implementation details.

Sensitivity to indentation can play some unexpected tricks with you. But having delimiters like { and } to denote the start and end of the section imposes extra challenges for the configuration language. Specifically, escaping.

Every time you add a special character to a config or data language, you have to deal with escaping. But escaping doesn’t play well with the “The Most Elegant Configuration Language” brand.

Whitespaces are invisible, people don’t rely on the specific number of whitespaces in the front, and adding more whitespaces doesn’t increase visual noise. They’re the perfect delimiters and escaping characters, like silent ninjas.

To handle nested values easily, the CCL parser implementation remembers the number of spaces N in front of the first key and follows two simple rules:

  1. Lines with ⩽ N leading spaces start a new key-value entry.
  2. Lines with > N leading spaces continue the previous value.
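The two rules fit in a short fold. What follows is a sketch under stated assumptions, not the actual CCL parser: blank lines are dropped everywhere (so blank lines inside multiline values are lost), a line without = becomes a key with an empty value, and the helper names indent_of and split_entry are made up for illustration.

```ocaml
type key_val = { key : string; value : string }

(* Count the leading space characters of a line. *)
let indent_of (line : string) : int =
  let n = String.length line in
  let rec go i = if i < n && line.[i] = ' ' then go (i + 1) else i in
  go 0

(* Split at the first '='. A line without '=' becomes a key with an
   empty value: an assumption of this sketch. *)
let split_entry (line : string) : key_val =
  match String.index_opt line '=' with
  | None -> { key = String.trim line; value = "" }
  | Some i ->
      let rest = String.sub line (i + 1) (String.length line - i - 1) in
      { key = String.trim (String.sub line 0 i); value = String.trim rest }

let parse (text : string) : key_val list =
  let lines =
    String.split_on_char '\n' text
    |> List.filter (fun line -> String.trim line <> "")
  in
  match lines with
  | [] -> []
  | first :: _ ->
      (* N: indentation of the first key. *)
      let n = indent_of first in
      let step entries line =
        match entries with
        | last :: rest when indent_of line > n ->
            (* Rule 2: a deeper line continues the previous value. *)
            { last with value = last.value ^ "\n" ^ line } :: rest
        | _ ->
            (* Rule 1: otherwise the line starts a new entry. *)
            split_entry line :: entries
      in
      List.rev (List.fold_left step [] lines)
```

Because continuation lines are kept with their indentation, feeding a nested value back into parse picks a new N for that level, so the same function handles every depth.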

Algebraic Data Types

The concept of Algebraic Data Types (ADTs) is essential to Functional Programming. Unfortunately, most configuration and serialisation formats don’t support ADTs nicely.

For CCL, it’s peanuts.

Consider the following ADT that describes a date range that can be either empty, single-dated or a range between two dates.

type date = {
  year: int;
  month: int;
  day: int;
}

type date_range =
  | Empty
  | Single of date
  | Range of date * date

A list of different values of this type will look like this:

let date_range_example = [
    Empty;
    Single { year = 2025; month = 6; day = 25 };
    Range ({ year = 2025; month = 1; day = 1 },
           { year = 2025; month = 12; day = 31 });
]

You can encode (and decode!) the same list in CCL without problems, using constructor names as keys and values as payloads.

empty =

single = 2025-06-25

range =
  0 = 2025-01-01
  1 = 2025-12-31

If you have named constructors instead of positional, you can use nested named keys!
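Decoding goes the other way: dispatch on the key, then decode the payload. A minimal sketch, assuming the nested payload of range has already been parsed into (key, value) pairs by whatever CCL parser you use. The function names and the nested argument are hypothetical, and no calendar validation is done.

```ocaml
type date = { year : int; month : int; day : int }
type date_range = Empty | Single of date | Range of date * date

(* Parse "YYYY-MM-DD" into a date; None on any other shape. *)
let decode_date (s : string) : date option =
  match List.map int_of_string_opt (String.split_on_char '-' (String.trim s)) with
  | [ Some year; Some month; Some day ] -> Some { year; month; day }
  | _ -> None

(* [value] is the raw payload; [nested] is the same payload parsed one
   level deeper, which only the "range" constructor needs. *)
let decode_date_range ~(key : string) ~(value : string)
    ~(nested : (string * string) list) : date_range option =
  match key with
  | "empty" -> Some Empty
  | "single" -> Option.map (fun d -> Single d) (decode_date value)
  | "range" -> (
      match (List.assoc_opt "0" nested, List.assoc_opt "1" nested) with
      | Some a, Some b -> (
          match (decode_date a, decode_date b) with
          | Some d1, Some d2 -> Some (Range (d1, d2))
          | _ -> None)
      | _ -> None)
  | _ -> None
```

Unknown constructor names and malformed dates both come back as None here; a real decoder would probably want to report which one it was.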

Category Theory Enters The Chat

You thought I finished after describing all CCL features.

In fact, I haven’t even started.

Composition

When writing software that works with configuration, it’s common to follow this pattern:

  1. Have a default configuration.
  2. Have a system-specific configuration for all users on the system.
  3. Have a global user-specific configuration.
  4. Have a project-specific local configuration.
  5. Have a temporary configuration to override values during local development or experimentation.

For example, you use a linter and you want to have the same consistent experience for all your hobby projects on your laptop. But occasionally, different projects have different needs. So it’s desirable to have project-specific overrides.

In other words, you have multiple layers of configurations, and you want to combine them. What you actually want is to compose them.

Turns out, with CCL this is trivial.

If you have one config like this one:

no-trailing-whitespaces = true
insert-final-newline = true

And another config like this:

color-theme = dark

Concatenating them together produces a valid config trivially:

no-trailing-whitespaces = true
insert-final-newline = true
color-theme = dark

So with this simple approach, you can trivially combine configs and expect a valid config in the end. There’s no special magic.

Associativity

A category in Category Theory comprises objects (you can choose them: numbers, strings, sets, types, other categories, etc.) and morphisms (arrows between objects). You can compose morphisms, and this composition is associative.

Turns out, composing CCL configs in the above way is an associative operation. Let’s call this operation smoosh.

Associativity means that combining three configs this way:

smoosh (smoosh ccl1 ccl2) ccl3

Is the same as:

smoosh ccl1 (smoosh ccl2 ccl3)

There’s an immediate practical application of this property: if you have three configs (e.g. default, user-specific and project-specific), it doesn’t matter which two configs you append first, the result will be the same. Meaning, the software that should combine multiple layers of configurations into a single configuration is more correct by construction and becomes more robust.
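A quick way to see this on parsed configs, assuming smoosh is plain list append on key-value pairs (the three layers below are made-up examples):

```ocaml
type key_val = { key : string; value : string }

(* Assumption: on parsed configs, smoosh is list append. *)
let smoosh (a : key_val list) (b : key_val list) : key_val list = a @ b

let defaults = [ { key = "no-trailing-whitespaces"; value = "true" } ]
let user = [ { key = "color-theme"; value = "dark" } ]
let project = [ { key = "insert-final-newline"; value = "true" } ]

(* Grouping to the left or to the right gives the same config. *)
let left = smoosh (smoosh defaults user) project
let right = smoosh defaults (smoosh user project)
let () = assert (left = right)
```

Note that the order of the layers still matters; associativity only says the grouping doesn’t.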

Semigroup

Turns out, our CCL configuration with the operation of combining two configs forms an important mathematical abstraction — semigroup.

I explain this abstraction in detail in my Pragmatic Category Theory series, Part 1: Semigroup specifically. I spend time exploring why associativity matters in Part 3: Associativity.

In OCaml, we can describe a general Semigroup interface with the following code:

module type SEMIGROUP = sig
  type t
  val smoosh : t -> t -> t
end

We have a type t and we smoosh two values together (and this smoosh operation must be associative to form a valid Semigroup).

Just by leveraging this abstraction from math, we’ve got an immediate practical application: composing multiple configurations worry-free.

But why stop here?

Monoid

I hate it when I run a piece of software and it complains about not having a configuration.


So, empty config or no config file at all must be a valid configuration.

Turns out, if the configuration type is Semigroup, and we have an empty configuration called empty that satisfies the following properties (called identity properties):

smoosh ccl empty = ccl
smoosh empty ccl = ccl

then the configuration is another abstraction — monoid.

📜 It’s easy to show these properties hold for CCL.

In OCaml, we would represent this abstraction in the following way:

module type MONOID = sig
  type t
  val empty : t
  val smoosh : t -> t -> t
end

Here we went from a practical example to rediscovering a math abstraction. But when you’re familiar with such abstractions, the usual thinking route is the following:

  1. I want to combine my configs. Is there a nice math abstraction for this? Aha, Semigroup!
  2. Okay, can my Semigroup be a Monoid too? What is my empty value?
  3. Turns out, there’s an immediate practical application: an empty configuration should be a valid config too, duh!
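Here is one possible instance of that interface, assuming a config is represented as a list of key-value pairs. The module name Config and the combine_layers helper are hypothetical; the point is that having empty lets you fold over any number of layers, including none.

```ocaml
module type MONOID = sig
  type t
  val empty : t
  val smoosh : t -> t -> t
end

type key_val = { key : string; value : string }

(* Lists of key-value pairs with append as smoosh and [] as empty. *)
module Config : MONOID with type t = key_val list = struct
  type t = key_val list
  let empty = []
  let smoosh = ( @ )
end

(* Combine default, system, user, project... layers in order.
   No layers at all gives the empty config instead of an error. *)
let combine_layers (layers : Config.t list) : Config.t =
  List.fold_left Config.smoosh Config.empty layers
```

A missing config file can then simply contribute Config.empty to the list of layers.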

Monoid Homomorphism

We have actually two ways to represent a valid CCL config.

1: A file with text

subtitles = enabled
playback-speed = 1.25

We know that our configs form a Semigroup, with file concatenation (let’s call this operation cat) being the associative operation. With the empty file as the identity element, they also form a Monoid.

2: A list of key-value pairs

let settings_example = [
    { key = "subtitles"; value = "enabled" };
    { key = "playback-speed"; value = "1.25" };
]

The list append operator in OCaml is @. Turns out, appending lists is an associative operation, and therefore lists with @ form a Semigroup.

Moreover, the empty list [] satisfies the identity properties in relation to @. Therefore lists with @ also form a Monoid.

We also have a function to convert the contents of the file into a list of key-value pairs:

val parse : string -> key_val list

This function satisfies a peculiar property:

parse (cat ccl1 ccl2) ≡ parse ccl1 @ parse ccl2

In English, concatenating two files and then parsing the result is the same as parsing two files separately and then appending the resulting lists of key-value pairs.

We have two Monoids: (1) CCL (aka text files) and (2) lists of key-value pairs. And we have a function parse with the above property. In this case, parse is a monoid homomorphism.

A monoid homomorphism is a function that maps one monoid to another while preserving monoidal properties (such as associativity and identity).

Is there an immediate practical application of this? Of course there is! Otherwise, I wouldn’t mention it.

First of all, for parse to truly be a monoid homomorphism, it needs to preserve the emptiness property. In other words:

parse "" = []

Which is totally reasonable, we should parse an empty file to a valid config. Moreover, this config must be an empty list.

But second, and most important, if parse is a true monoid homomorphism then it doesn’t matter if we concat files first and then parse or if we first parse and then concat. The result will be the same!

It means, we can actually improve the performance of parsing multiple files. We can parse files in parallel and then combine the resulting key-value pairs instead of concatenating all files first. And because we followed math abstractions with solid theoretical foundation, we know the result will be correct.

This property can become handy when you start representing your cloud configuration a-la K8S with hundreds of config files, and suddenly the pod start time starts to matter.
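The law is easy to check on a toy case. The sketch below uses a deliberately simplified flat parser (one entry per non-blank line, no nesting), so it is an illustration of the property rather than the real CCL parse. parse_all is a hypothetical helper, and it runs sequentially; the point is only that each call to parse is independent of the others.

```ocaml
type key_val = { key : string; value : string }

(* Simplified flat parser: one "key = value" entry per non-blank line. *)
let parse (text : string) : key_val list =
  String.split_on_char '\n' text
  |> List.filter_map (fun line ->
         if String.trim line = "" then None
         else
           match String.index_opt line '=' with
           | None -> Some { key = String.trim line; value = "" }
           | Some i ->
               let rest = String.sub line (i + 1) (String.length line - i - 1) in
               Some
                 { key = String.trim (String.sub line 0 i);
                   value = String.trim rest })

(* Naive file concatenation, good enough for unindented files. *)
let cat (a : string) (b : string) : string = a ^ "\n" ^ b

(* Parse every file on its own, then append the results. *)
let parse_all (files : string list) : key_val list =
  List.concat_map parse files

let ccl1 = "subtitles = enabled"
let ccl2 = "playback-speed = 1.25"

(* The homomorphism law and the identity law on this example. *)
let () = assert (parse (cat ccl1 ccl2) = parse ccl1 @ parse ccl2)
let () = assert (parse "" = [])
```

Since no call depends on another, the work inside parse_all could be spread across threads or domains without changing the result.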

Pedantic Alert #1: Error-handling

To simplify the explanation, I made an assumption that the function parse doesn’t fail. In practice, it’s absolutely possible for parsing to fail on invalid inputs. Does it mean we can’t benefit from monoid homomorphisms here? Of course not!

The trick here involves two steps:

  1. Represent errors as values.
  2. Use a return type that returns errors while still being a monoid.

To do this, let’s introduce a type like this one:

type ('a, 'e) validation =
  | Failure of 'e list
  | Success of 'a list

In English, this polymorphic type is either a list of errors or a list of successes.

This type is a Monoid with the following implementation:

let empty = Success []

let append val1 val2 =
  match val1, val2 with
  | Failure errors1, Failure errors2 -> Failure (errors1 @ errors2)
  | Failure errors, Success _ -> Failure errors
  | Success _, Failure errors -> Failure errors
  | Success a, Success b -> Success (a @ b)

In English, when we combine two validations:

  1. If both are failures, we just combine all errors.
  2. If exactly one is a failure, we keep its errors and discard the success.
  3. If both are successes, we combine all successes.

And our parse will change its type to:

val parse : string -> (key_val, parse_error) validation

Because validation is a monoid, our parse remains a monoid homomorphism with the semantics of either getting all errors from all sources or appending all successful results if no errors happened.
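To see those semantics in action, here is a small run over four made-up per-file results. Plain strings stand in for both entries and errors, and the file names in the messages are invented for illustration.

```ocaml
type ('a, 'e) validation = Failure of 'e list | Success of 'a list

let empty = Success []

let append val1 val2 =
  match (val1, val2) with
  | Failure errors1, Failure errors2 -> Failure (errors1 @ errors2)
  | Failure errors, Success _ -> Failure errors
  | Success _, Failure errors -> Failure errors
  | Success a, Success b -> Success (a @ b)

(* Hypothetical results of parsing four files separately. *)
let results =
  [ Success [ "port = 8080" ];
    Failure [ "b.ccl: bad indentation" ];
    Success [ "mode = in-memory" ];
    Failure [ "d.ccl: unreadable file" ] ]

(* One failure is enough to fail the whole thing, but every error from
   every source is kept, in order. *)
let combined = List.fold_left append empty results
```

Here combined is a Failure carrying both error messages, so the user sees all broken files at once instead of fixing them one run at a time.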

Pedantic Alert #2: File concatenation

So, again, I simplified things a little. Because CCL is indentation sensitive, appending two files like this:


example = value start


  oops = starts indented

and then parsing them is not the same as parsing first and appending later because after appending files we’ll get only one key.

An annoyance but easily fixable: we can’t simply append files, we need to remove leading spaces. An implementation in OCaml:

let cat ccl1 ccl2 = ccl1 ^ "\n" ^ String.trim ccl2

Notice how we benefited from math abstractions: by following them precisely, we discovered an annoying bug and fixed it early.

Bonus: Isomorphism

Btw, a pretty-printing function like this:

val pretty : key_val list -> string

Is a monoid homomorphism too. Together with parse, it forms a monoid isomorphism: you can convert both ways while preserving the structure.

This property is incredibly useful for testing when you want to parse, then pretty-print back and make sure you don’t lose any information in the process.

Fixed Point

So far, I haven’t explored one important topic: key overrides.

In configurations, it’s common to have the same key mapped to different values. Especially, if we start combining configs that have overlapping properties.

Let’s say we have two configs.


trailing-whitespaces = yes


trailing-whitespaces = no

What should be the value of the trailing-whitespaces property once we combine them?

trailing-whitespaces = yes
trailing-whitespaces = no

The quick answer: DIY. When you parse the final config (or parse first and then append lists, remember monoid homomorphisms), you’ll get the following list of key-value pairs:

let overrides_example = [
    { key = "trailing-whitespaces"; value = "yes" };
    { key = "trailing-whitespaces"; value = "no" };
]

Overrides are not a problem because you keep both values. And you can decide what to do with them: keep only the first, keep only the last or use some smart logic to combine both of them. You’re the boss.
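Each of those policies is a couple of lines over the list. A sketch with hypothetical helper names:

```ocaml
type key_val = { key : string; value : string }

(* All values bound to [key], in file order. *)
let values_of (key : string) (entries : key_val list) : string list =
  List.filter_map
    (fun e -> if e.key = key then Some e.value else None)
    entries

(* First occurrence wins. *)
let first_value (key : string) (entries : key_val list) : string option =
  match values_of key entries with [] -> None | v :: _ -> Some v

(* Last occurrence wins: the usual choice when later layers override
   earlier ones. *)
let last_value (key : string) (entries : key_val list) : string option =
  match List.rev (values_of key entries) with [] -> None | v :: _ -> Some v
```

Any smarter policy, such as rejecting conflicting values, can start from values_of in the same way.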

Unfortunately, this business becomes nasty once you start having nested records. Manually parsing and resolving all the nested key overrides is annoying.

CCL solved this too.

The key idea here is to treat values as.. keys (pun intended).

Remember how in the “Nested fields” section I mentioned that you can parse values using the same CCL config parser? If you do it recursively until the end, you’ll parse all values.

What? You ask when do you stop parsing? That’s the neat part, you don’t.

Or, more precisely, you stop when you can’t parse any more. By applying parsing recursively until parsing doesn’t change anything, you reach a so-called fixed point.

Because we now have this nested structure, we no longer treat CCL as a list of key-value pairs. What is it though?

A configuration in CCL is actually a fixed point for a dictionary from strings to.. itself!

In OCaml, it’s pure elegance:

type t = Fix of t Map.Make(String).t

It’s a map that maps strings (i.e. keys) to itself. The only way to stop the recursion is to bind a key to an empty map. And therefore, the final level is values mapped to empty maps.

We can have a pure function to turn a list of key-value pairs into this representation:

val fix : key_val list -> t

When we had lists, we could just append lists (a pretty understandable operation). How can we append such fixed points? That’s pretty easy too.

The operation to merge two fixed points is straightforward, using just the OCaml standard library:

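(* StringMap is Map.Make (String), the same map used in type t. *)
module StringMap = Map.Make (String)

let rec merge (Fix map1) (Fix map2) =
  Fix
    (StringMap.merge
       (fun _key t1 t2 ->
         match (t1, t2) with
         | None, t2 -> t2
         | t1, None -> t1
         | Some t1, Some t2 -> Some (merge t1 t2))
       map1 map2)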

It looks like this function does nothing (it just recursively calls itself). Yet, it does everything.

Now, things are getting juicy. When you have a config like this one:

ports = 8080
ports = 8081

In CCL, it’s actually just syntax sugar for:

ports =
  8080 =
ports =
  8081 =

You got it correctly. It’s not two keys of the same name mapped to two different string values. It’s two keys mapped to two different nested singleton objects where keys are values.

Parsing and applying fix merges the keys, and we get the following:

ports =
  8080 =
  8081 =

So now, we can easily combine multiple similar keys and join their values on all levels of nesting with (checks notes) just 10 lines of OCaml.
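The ports example can be replayed by hand. In this sketch the two one-key configs are built directly instead of being parsed, and leaf, entry and children are hypothetical helpers added only to construct and inspect the maps.

```ocaml
module StringMap = Map.Make (String)

type t = Fix of t StringMap.t

let rec merge (Fix map1) (Fix map2) =
  Fix
    (StringMap.merge
       (fun _key t1 t2 ->
         match (t1, t2) with
         | None, t2 -> t2
         | t1, None -> t1
         | Some t1, Some t2 -> Some (merge t1 t2))
       map1 map2)

let empty = Fix StringMap.empty

(* A value is a key bound to the empty map. *)
let leaf (value : string) : t = Fix (StringMap.singleton value empty)

(* [entry key child] is the one-key config "key = child". *)
let entry (key : string) (child : t) : t = Fix (StringMap.singleton key child)

(* "ports = 8080" and "ports = 8081" as nested singleton maps. *)
let ports1 = entry "ports" (leaf "8080")
let ports2 = entry "ports" (leaf "8081")
let merged = merge ports1 ports2

(* Keys one level below [key], in the map's sorted order. *)
let children (key : string) (Fix m : t) : string list =
  match StringMap.find_opt key m with
  | None -> []
  | Some (Fix sub) -> List.map fst (StringMap.bindings sub)
```

After the merge, children "ports" merged lists both 8080 and 8081 under a single ports key. One consequence of using a map: the children come back sorted by key, not in file order.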

Wanna hear the cool part? Our fixed point with merge is a monoid too. This means fix is actually a monoid homomorphism from a list of key-value pairs to a map. And the composition of parse and fix is a composition of monoid homomorphisms, meaning it’s automatically also a monoid homomorphism.

And we get all the discussed benefits for free again.

That’s why we needed only 10 lines of OCaml. We’re standing on the shoulders of giants who have been creating math for thousands of years.

What’s next?

I have a CCL PoC built in OCaml on GitHub.

It works. It passes the tests. It’s not production-ready.

Meaning that the documentation might lack important details, performance wasn’t optimised, the code is ugly in some places, the quality of error messages is poor, utility functions are lacking, and breaking changes are expected. You know, just like a typical SaaS product, except CCL is free and was done in 10 evenings.

However, I’ve implemented an exhaustive test suite with dozens of unit tests covering diverse edge-cases as well as property-based tests to verify algebraic laws.

Example of passing tests (they pass, it’s true)

If you try it and discover bugs, please report! I’ll try to fix them.

Currently, I’m focusing on building a GitHub TUI in OCaml. If at some point the need for a config arises, I’ll return to CCL.

For example, to make CCL production-ready, one essential API piece is missing: decoding CCL values into actual programming values.

I envision an API in OCaml like this one:

type user = {
  login: string;
  name: string;
  last_active: int;  (* timestamp in UNIX epoch *)
}

let codec =
    let+ login = Codec.string "login" (fun {login; _} -> login)
    and+ name = Codec.string "name" (fun {name; _} -> name)
    and+ last_active =
        (* Codec.int is assumed here, by analogy with Codec.string *)
        Codec.int "last_active"
        (fun {last_active; _} -> last_active)
    in
    { login; name; last_active }

I used to work on something similar in the past, using Category Theory abstractions such as Category, Isomorphisms, Profunctors and Applicative Functors. But I’ll save this for another article.

The implementation of CCL is so simple, you can even try implementing it in your favourite language, and this could be a nice hobby project!

My goal is not to make everyone use this new language. What I want is to inspire you.

I hope you can see how much you can achieve with so little.

I hope you’ll try to pursue simplicity as well.

I hope we can make the software better together.

