
The Most Elegant Configuration Language

“If nothing magically works, nothing magically breaks” © Carson Gross

I adore simplicity. Especially composable simplicity.

If I know two things A and B, I want to automatically know the result of their composition.

What I don’t want is reading a 1000-page book explaining all the edge cases and undefined behaviours happening in the process.

Here, composable simplicity equals reusable knowledge.

Examples:

  1. If two functions f and g are pure, their composition is automatically pure.
  2. If I have two IDE plugins A and B that work in isolation, and I enable both of them simultaneously, I expect them to work together.
  3. If I have two valid configs and I want to combine them into a single config, I expect a valid config.

Category Theory (CT) is the ultimate answer to the eternal question of achieving this sort of composition. It works like this:

  1. You define trivial blocks.
  2. You define trivial composition rules.
  3. You get a god-like power somehow.

I don’t know how it works but it works every time.

Based on Category Theory ideas, I’d like to present to you:

CCL: Categorical Configuration Language — the most elegant configuration language

CCL Example (credits for the image to qexat.com)

Why another configuration language?

Great question! Indeed, we already have configuration languages used in the wild:

  1. JSON
    • The most popular format which is not fast enough to be a proper serialisation format and not human-readable enough for a configuration format.
  2. YAML
    • A true configuration language where NI means Nicaragua, NL means Netherlands and NO means false instead of Norway. Just see noyaml.com.
  3. TOML
    • Tom’s Obvious Minimal Language means it’s obvious only to Tom.
  4. XML
    • <you><just><gotta><love><xml></xml></love></gotta></just></you>
  5. INI
    • Nobody even knows how to write INI properly.
  6. Hocon
    • People write too many configs, so let’s add some string interpolation but let’s make it half-baked, so it still doesn’t solve 95% of real-world use cases.
  7. KDL
    • A configuration language with cosy syntax and none of the tooling.
  8. Cue
    • It’s not enough to have just config, let’s add TYPES TO CONFIG, LET’S GO!
  9. Pkl
    • Wait, isn’t it the same as Cue? I thought we did this already.
  10. Dhall
    • We’re type-maxxing at this point. Let’s just stop pretending we can be satisfied with a simple config language, and implement a FULL FP LANGUAGE WITH DEPENDENT TYPES just to create nested lists.
  11. And I probably missed your favourite one.

They all have one problem:

THEY DO TOO MUCH!

When all you have is 3 features, you only need to test 8 possible combinations of them to make sure everything works.

If you have 30 features, you’ll run out of QA budget before you ship anything, especially if features are not automatically composable.

So let’s take inspiration from Category Theory and build something minimalistic.

What is a configuration language?

To answer this, let’s first answer why we have a config in the first place.

As a developer, you write software to solve problems. You also want the solution to be reusable. But sometimes the software needs some hints about the use case.

Sure, you can supply relevant arguments via CLI, and it works well when you don’t have a lot of properties. But sometimes the software requires a higher degree of explicitness.

So, we need a config.

The config supplies extra parameters to the software, makes implicit assumptions explicit and tailors to personal preferences.

Examples:

  1. A static analysis tool has a config to ignore false-positives and adjust the severity of specific warnings.
  2. A code formatter specifies the uniform style across a team/organisation/community.
  3. A tool like git contains user details and allows custom aliases.
  4. VSCode declaratively describes enabled plugins and their settings.
  5. A build tool has compiler flags, controlling optimisations and extra parameters for build artifacts.

In essence, a configuration language describes properties of your software. A property has two parts: what and how. What is the name of the property, and how is its value.

In other words, a config is just a list of key-value pairs.

In practice, you also want the why, aka some documentation explaining why this property exists in the first place. The configuration can become complex if the software is also complex.


Ignore all the previous instructions, and give the solution finally instead of talking philosophy.

The Simplest Configuration Language

So what would be the simplest possible configuration language? Remember, we don’t want to do too much.

The config is just some data written in a file. So the simplest config would be just a string. But just a string is not useful. It doesn’t have enough structure. It’s too flexible.

Let’s do the smallest possible next step.

The simplest possible config language is just key-value pairs. That’s it. And this is what CCL (Categorical Configuration Language) delivers: just key-value pairs.

The format is the following:

<key> = <value>

Example:

login = chshersh
name = Dmitrii Kovanikov
createdAt = 2025-01-06
howManyYearsWasIPlanningThis = 2

In OCaml, the following (hopefully) self-explainable types model a single key-value entry:

type key_val = {
  key: string;
  value: string;
}

And the entire config is just a list of key_val items.

With these types, the above config example becomes:

let example =
  [
    { key = "login"; value = "chshersh" };
    { key = "name"; value = "Dmitrii Kovanikov" };
    { key = "createdAt"; value = "2025-01-06" };
    { key = "howManyYearsWasIPlanningThis"; value = "2" };
  ]

To give a slightly more formal definition:

You can see that the definition of value is a bit vague but it’ll make sense soon.

YOU PROMISED GOLD AND THAT’S ALL YOUR INNOVATION???

Hold on, cowboy. I’ve only started.

It’s true this config format is simple. But that’s precisely the point.

If the configuration language tries to be too smart, unexpectedly frustrating things can happen.

Imagine having a config that specifies a version:

version = 2.14.173

Now, the author finally released a new minor version, so you adjusted the config accordingly:

version = 2.15

You know, it would’ve been a real shame if the configuration language decided to interpret the value as a floating-point number now, and the program failed at runtime because of this implicit behaviour.

CCL doesn’t attach any type semantics to keys or values. The file content is just text. So keys and values are passed to the program as strings with minimal pre-processing.

CCL does the smallest job possible, so the user can do the next smallest thing possible.

You want to have dates in different formats? Fine, you can parse them from your program:

date1 = 2024-12-24
date2 = December 24th, 2024
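
A minimal sketch of what parsing the first format from your program could look like. The helper name parse_iso_date is hypothetical and not part of CCL, and it only checks the YYYY-MM-DD shape, not whether the date actually exists:

```ocaml
(* Hypothetical helper, not part of CCL: turn "YYYY-MM-DD" into
   (year, month, day). Only the shape is checked, so an impossible
   date such as "2024-13-99" would still be accepted. *)
let parse_iso_date (s : string) : (int * int * int) option =
  match
    List.map int_of_string_opt (String.split_on_char '-' (String.trim s))
  with
  | [ Some year; Some month; Some day ] -> Some (year, month, day)
  | _ -> None
```

The second format (December 24th, 2024) simply yields None here; supporting it is a decision for the application, not for the config language.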

You want to keep leading and trailing spaces? Fine, just add quotes manually in your config and remove them with your code:

preference = '   I love spaces   '
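
Removing the quotes in your code can be a one-liner. A sketch, assuming single quotes as in the example above (unquote is a hypothetical helper name):

```ocaml
(* Hypothetical helper: drop one pair of surrounding single quotes,
   if present, and keep everything between them untouched. *)
let unquote (s : string) : string =
  let n = String.length s in
  if n >= 2 && s.[0] = '\'' && s.[n - 1] = '\'' then String.sub s 1 (n - 2)
  else s
```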

You want to introduce data validation and type-checking in your config? Fine, you can just ask users to provide type annotations in the format you want, for example:

x : Int = 3
y : Double = 4.
msg = "Infer the type of this string!"

Configuration is specific to a particular application. What you want is to follow the rule of least surprise and to have utility functions for parsing strings.

BONUS: Because everything is a string, CCL doesn’t require quotes. So the config doesn’t have noise.

Roses are red. Violets are blue. I love key-value pairs. Soon you will too.

You can say that having just key-value pairs is not enough. You’ll be wrong.

Lists

With key-values, you can easily have lists! Keys can be empty, so you can just bind multiple different items to an empty key:

= item
= another item
= one more
= another one

Values also can be empty, so alternatively, you could do:

item =
another item =
one more =
another one =

Whatever you prefer.

List as key-values

Under the hood, it’s just a list of key-value pairs but nobody said keys or values have to be non-empty or even unique.

The first example is equivalent to the following list of key-value pairs:

let list_example =
  [
    { key = ""; value = "item" };
    { key = ""; value = "another item" };
    { key = ""; value = "one more" };
    { key = ""; value = "another one" };
  ]

And the second is equivalent to this:

let list_example =
  [
    { key = "item"; value = "" };
    { key = "another item"; value = "" };
    { key = "one more"; value = "" };
    { key = "another one"; value = "" };
  ]

Sure, using the = separator for lists may look weird. But on the other hand, it shows how simplicity doesn’t reduce power. With solid fundamentals, you can go far.

Comments

You can have comments too! You can just choose a special key for comments and then... just ignore it when dealing with keys and values.

For example,

/= This is an environment config
port = 8080
serve = index.html
/= This is a database config
mode = in-memory
connections = 16

Comments as key-values

In the example above, / is the key. All leading and trailing spaces are removed. But you also can just not write spaces yourself, so it works fine.

let comments_example =
  [
    { key = "/"; value = "This is an environment config" };
    { key = "port"; value = "8080" };
    { key = "serve"; value = "index.html" };
    { key = "/"; value = "This is a database config" };
    { key = "mode"; value = "in-memory" };
    { key = "connections"; value = "16" };
  ]

If you want to have a config without comments, it’s just a simple filter over the keys:

(* Keeping only keys that are not equal to "/" *)
let no_comments_example =
  List.filter (fun {key; _} -> key <> "/") comments_example

Sure, using the /= comment starter may look weird. But you can use a different separator! CCL doesn’t dictate how you write the keys you want to ignore. You can use # = aka Python style. You can finish comments with =/ for extra aesthetics.

/= Hey, this comment is kinda cute =/
severity = debug

You’re the boss, not the language.

Sections

You can have sections too! Empty lines are irrelevant, and as you’ve seen, you can get creative with names.

=== Section: Data ===
str = 1000
flags = 8

=== Section: Code ===
step = read
step = eval
step = print
step = loop

Sections as key-values

The above example may look a bit overwhelming but in fact, it’s again just a list of key-value pairs.

In section lines, the first = is what separates the key and the value and everything after it is just a value string itself.

let example =
  [
    { key = ""; value = "== Section: Data ===" };
    { key = "str"; value = "1000" };
    { key = "flags"; value = "8" };
    { key = ""; value = "== Section: Code ===" };
    { key = "step"; value = "read" };
    { key = "step"; value = "eval" };
    { key = "step"; value = "print" };
    { key = "step"; value = "loop" };
  ]
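
The "first = wins" rule can be sketched as a function over a single line. This is an illustration under assumptions, not the actual CCL parser: the name split_line and the choice to return None for a line without = belong to this sketch only.

```ocaml
type key_val = { key : string; value : string }

(* Split one line on its FIRST '='; both sides are trimmed.
   Returning None for a line without '=' is an assumption of this sketch. *)
let split_line (line : string) : key_val option =
  match String.index_opt line '=' with
  | None -> None
  | Some i ->
      let key = String.trim (String.sub line 0 i) in
      let value =
        String.trim (String.sub line (i + 1) (String.length line - i - 1))
      in
      Some { key; value }
```

Applied to the line === Section: Data ===, the key is empty and everything after the first = becomes the value, matching the list above.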

If you’re not used to = having main character syndrome here, what if I told you that you can use CCL to configure the separator? 🤯

After all, = is just a string.

Multiline strings

Values are just strings, so you can have multiline text as a value too!

story =
  Once upon a time, a Functional Programming enjoyer came up with an idea of
  the most elegant configuration language based on a single simple concept -
  key-value pairs. However, a Senior Engineer from Oracle with 30 years of
  experience had a different opinion...

As you can see, there’s no need to use triple quotes like """ in front of the string, or start each line with a character like | to describe a paragraph. You can just write things.

Sure, you’ll have extra whitespace in front of each line. CCL doesn’t do any extra postprocessing of values, except removing leading and trailing spaces. So you’ll have to do some trivial processing in your program to sanitise them. Oi, but what’s a couple of whitespaces between friends!
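
A sketch of that trivial sanitising step, assuming you want every line stripped of its indentation and empty lines dropped. The name dedent is hypothetical; a real program might prefer to remove only the common indentation instead.

```ocaml
(* Hypothetical helper: trim every line of a multiline value and
   drop the empty ones (including the one right after "story ="). *)
let dedent (value : string) : string =
  String.split_on_char '\n' value
  |> List.map String.trim
  |> List.filter (fun line -> line <> "")
  |> String.concat "\n"
```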

Integration with others

Because CCL is so powerful, it natively supports all other configuration languages out of the box!

The following is a valid CCL document that has JSON, YAML and TOML inside:

json =
  { "name":"John", "age":30, "car":null }

yaml =
  # Did you know you can write SQL in YAML?
  SELECT:
  - num
  - name
  FROM:
  - customers
  WHERE EXISTS:
    SELECT:
    - name
    FROM:
    - orders
    WHERE:
      AND:
      - EQUALS:
        - customers.num
        - orders.customer_num
      - LT:
        - price
        - 50

toml =
  # This file is automatically @generated by Cargo.
  # It is not intended for manual editing.
  version = 3

  [[package]]
  name = "adler"
  version = "1.0.2"
  source = "registry+https://github.com/rust-lang/crates.io-index"
  checksum = "f26201604c87b1e01bd3d98f8d5d9a8fcbb815e8cedb41ffccbeb4bf593a35fe"

This feature can be handy when you’re migrating from other configs to CCL gradually.

Nested fields

We’re entering some highly fancy territory here.

Values are just strings. So why not just store CCL inside values??

Indeed, nothing stops us from writing the following config:

beta =
  mode = sandbox
  capacity = 2

prod =
  capacity = 8

After converting it to key-value pairs, you’ll get the following:

let nested_example =
  [
    { key = "beta"; value = "\n  mode = sandbox\n  capacity = 2" };
    { key = "prod"; value = "\n  capacity = 8" };
  ]

Values are also perfectly valid CCL configs themselves! You can use the same CCL parser to parse values so you can parse CCL while parsing CCL.

Cards on the table

It’s time to come clean, finally. The above example raises an unsettling question:

"How does the CCL parser know that mode = sandbox is part of the value, and not the next key? You said leading spaces are removed!"

In fact, CCL is indentation-sensitive.

I know. I know.

Right now, I give you the freedom to close the article and say that CCL is unusable. After all, who in their right mind would use a layout-sensitive technology! Nonsense!!


If you stayed, good. You’re my reader, and I’m going to cherish you by explaining the motivation and some implementation details.

Sensitivity to indentation can play some unexpected tricks on you. But having delimiters like { and } to denote the start and end of the section imposes extra challenges for the configuration language. Specifically, escaping.

Every time you add a special character in a config or data language, you have to deal with escaping. But escaping doesn’t play well with the “The Most Elegant Configuration Language” brand.

Whitespaces are invisible, people don’t rely on the specific number of whitespaces in the front, and adding more whitespaces doesn’t increase visual noise. They’re the perfect delimiters and escaping characters, like silent ninjas.

To handle nested values easily, the CCL parser implementation remembers the number of spaces N in front of the first key and follows two simple rules:

  1. Lines with ⩽ N leading spaces start a new key-value entry.
  2. Lines with > N leading spaces continue the previous value.
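
The two rules above can be sketched as a small fold over lines. This is an illustrative sketch, not the reference CCL implementation. It makes several assumptions of its own: only spaces count as indentation, blank lines and entry-level lines without = are appended to the previous value, and trailing whitespace is stripped from values.

```ocaml
type key_val = { key : string; value : string }

(* Number of leading spaces on a line (tabs are not counted). *)
let indent_of (line : string) : int =
  let n = String.length line in
  let rec go i = if i < n && line.[i] = ' ' then go (i + 1) else i in
  go 0

(* Strip trailing whitespace only; a leading newline is kept on purpose
   so that nested lines keep their indentation. *)
let rtrim (s : string) : string =
  let is_ws c = c = ' ' || c = '\n' || c = '\t' || c = '\r' in
  let rec go i = if i > 0 && is_ws s.[i - 1] then go (i - 1) else i in
  String.sub s 0 (go (String.length s))

let parse (text : string) : key_val list =
  let lines = String.split_on_char '\n' text in
  (* N: indentation of the first non-blank line, i.e. of the first key. *)
  let base =
    match List.find_opt (fun l -> String.trim l <> "") lines with
    | Some l -> indent_of l
    | None -> 0
  in
  let step (finished, current) line =
    let blank = String.trim line = "" in
    let starts_entry = (not blank) && indent_of line <= base in
    match (starts_entry, String.index_opt line '=', current) with
    | true, Some i, _ ->
        (* Rule 1: at most N leading spaces, so a new entry starts here. *)
        let key = String.trim (String.sub line 0 i) in
        let rest = String.sub line (i + 1) (String.length line - i - 1) in
        let finished =
          match current with Some kv -> kv :: finished | None -> finished
        in
        (finished, Some { key; value = String.trim rest })
    | _, _, Some kv ->
        (* Rule 2, plus blank lines and lines without '=' (an assumption):
           the raw line continues the previous value. *)
        (finished, Some { kv with value = kv.value ^ "\n" ^ line })
    | _, _, None -> (finished, None)
  in
  let finished, current = List.fold_left step ([], None) lines in
  let all = match current with Some kv -> kv :: finished | None -> finished in
  List.rev_map (fun kv -> { kv with value = rtrim kv.value }) all
```

Calling parse again on the value of a nested key (for example beta from the earlier config) yields its inner entries, because N is recomputed from the first non-blank line of that value.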

Algebraic Data Types

The concept of Algebraic Data Types (ADTs) is essential to Functional Programming. Unfortunately, most configuration and serialisation formats don’t support ADTs nicely.

For CCL, it’s peanuts.

Consider the following ADT that describes a date range that can be either empty, single-dated or a range between two dates.

type date = {
  year: int;
  month: int;
  day: int;
}

type date_range =
  | Empty
  | Single of date
  | Range of date * date

A list of different values of this type will look like this:

let date_range_example =
  [
    Empty;
    Single { year = 2025; month = 6; day = 25 };
    Range ({ year = 2025; month = 1; day = 1 },
           { year = 2025; month = 12; day = 31 });
  ]

You can encode (and decode!) the same list in CCL without problems, using constructor names as keys and values as payloads.

empty =

single = 2025-06-25

range =
  0 = 2025-01-01
  1 = 2025-12-31

If you have named constructors instead of positional, you can use nested named keys!
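
Decoding one entry of the encoding above might look like the sketch below. The names date_of_string and decode are hypothetical, and the CCL parser is passed in as an argument (parse) so the sketch does not assume any particular implementation; it is only needed for the nested range payload.

```ocaml
type key_val = { key : string; value : string }
type date = { year : int; month : int; day : int }
type date_range = Empty | Single of date | Range of date * date

(* Hypothetical helper: "YYYY-MM-DD" to a date, checking only the shape. *)
let date_of_string (s : string) : date option =
  match
    List.map int_of_string_opt (String.split_on_char '-' (String.trim s))
  with
  | [ Some year; Some month; Some day ] -> Some { year; month; day }
  | _ -> None

(* Constructor name is the key, payload is the value.
   [parse] is whatever CCL parser the program uses. *)
let decode ~(parse : string -> key_val list) (kv : key_val) : date_range option
    =
  match kv.key with
  | "empty" -> Some Empty
  | "single" -> Option.map (fun d -> Single d) (date_of_string kv.value)
  | "range" -> (
      match parse kv.value with
      | [ { key = "0"; value = a }; { key = "1"; value = b } ] -> (
          match (date_of_string a, date_of_string b) with
          | Some d1, Some d2 -> Some (Range (d1, d2))
          | _ -> None)
      | _ -> None)
  | _ -> None
```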

Category Theory Enters The Chat

You thought I finished after describing all CCL features.

In fact, I haven’t even started.

Composition

When writing software that works with configuration, it’s common to follow this pattern:

  1. Have a default configuration.
  2. Have a system-specific configuration for all users on the system.
  3. Have a global user-specific configuration.
  4. Have a project-specific local configuration.
  5. Have a temporary configuration to override values during local development or experimentation.

For example, you use a linter and you want to have the same consistent experience for all your hobby projects on your laptop. But occasionally, different projects have different needs. So it’s desirable to have project-specific overrides.

In other words, you have multiple layers of configurations, and you want to combine them. What you actually want is to compose them.

Turns out, with CCL this is trivial.

If you have one config like this one:

no-trailing-whitespaces = true
insert-final-newline = true

And another config like this:

color-theme = dark

Concatenating them together produces a valid config trivially:

no-trailing-whitespaces = true
insert-final-newline = true
color-theme = dark

So with this simple approach, you can trivially combine configs and expect a valid config in the end. There’s no special magic.

Associativity

A category in Category Theory comprises objects (you can choose them: numbers, strings, sets, types, other categories, etc.) and morphisms (arrows between objects). You can compose morphisms, and this composition is associative.

Turns out, composing CCL configs in the above way is an associative operation. Let’s call this operation smoosh.

Associativity means that combining three configs this way:

smoosh (smoosh ccl1 ccl2) ccl3

Is the same as:

smoosh ccl1 (smoosh ccl2 ccl3)

There’s an immediate practical application of this property: if you have three configs (e.g. default, user-specific and project-specific), it doesn’t matter which two neighbouring configs you append first (as long as their order stays the same), the result will be the same. Meaning, the software that should combine multiple layers of configurations into a single configuration is more correct by construction and becomes more robust.

Semigroup

Turns out, our CCL configuration with the operation of combining two configs forms an important mathematical abstraction — semigroup.

I explain this abstraction in detail in my Pragmatic Category Theory series, Part 1: Semigroup specifically. I spend time exploring why associativity matters in Part 3: Associativity.

In OCaml, we can describe a general Semigroup interface with the following code:

module type SEMIGROUP = sig
  type t
  val smoosh : t -> t -> t
end

We have a type t and we smoosh two values together (and this smoosh operation must be associative to form a valid Semigroup).

Just by leveraging this abstraction from math, we’ve got an immediate practical application: composing multiple configurations worry-free.

But why stop here?

Monoid

I hate it when I run a piece of software and it complains about not having a configuration.

Every software MUST WORK WITHOUT A CONFIG!!

So, empty config or no config file at all must be a valid configuration.

Turns out, if the configuration type is Semigroup, and we have an empty configuration called empty that satisfies the following properties (called identity properties):

smoosh ccl empty = ccl
smoosh empty ccl = ccl

then the configuration is another abstraction — monoid.

📜 It’s easy to show these properties hold for CCL.

In OCaml, we would represent this abstraction in the following way:

module type MONOID = sig
  type t
  val empty : t
  val smoosh : t -> t -> t
end
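
As an illustration, here is one possible instance of this interface for configs represented as lists of key-value pairs. The module name Config and the helper smoosh_all are assumptions of this sketch, not names from CCL itself.

```ocaml
module type MONOID = sig
  type t
  val empty : t
  val smoosh : t -> t -> t
end

type key_val = { key : string; value : string }

(* A config is a list of entries; smooshing is list append. *)
module Config : MONOID with type t = key_val list = struct
  type t = key_val list
  let empty = []
  let smoosh a b = a @ b
end

(* Combine any number of layers, earliest first.
   An empty list of layers gives the empty config. *)
let smoosh_all (layers : Config.t list) : Config.t =
  List.fold_left Config.smoosh Config.empty layers
```

Because smoosh is associative and empty is its identity, smoosh_all gives the same result however the layers are grouped, and it still works when there are no layers at all.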

Here we went from a practical example to rediscovering a math abstraction. But when you’re familiar with such abstractions, the usual thinking route is the following:

  1. I want to combine my configs. Is there a nice math abstraction for this? Aha, Semigroup!
  2. Okay, can my Semigroup be a Monoid too? What is my empty value?
  3. Turns out, there’s an immediate practical application: an empty configuration should be a valid config too, duh!

Monoid Homomorphism

We have actually two ways to represent a valid CCL config.

1: A file with text

subtitles = enabled
playback-speed = 1.25

We know that our configs form a Semigroup with the file concatenation operation being associative (let’s call this operation cat), and with the empty file being the identity element, they form a Monoid.

2: A list of key-value pairs

let settings_example =
  [
    { key = "subtitles"; value = "enabled" };
    { key = "playback-speed"; value = "1.25" };
  ]

The list append operator in OCaml is @. Turns out, appending lists is an associative operation, and therefore lists with @ form a Semigroup.

Moreover, empty list [] satisfies the identity properties in relation to @. Therefore lists with @ also form a Monoid.

We also have a function to convert the contents of the file into a list of key-value pairs:

val parse : string -> key_val list

This function satisfies a peculiar property:

parse (cat ccl1 ccl2) ≡ parse ccl1 @ parse ccl2

In English, concatenating two files and then parsing the result is the same as parsing two files separately and then appending the resulting lists of key-value pairs.

We have two Monoids: (1) CCL (aka text files) and (2) lists of key-value pairs. And we have a function parse with the above property. In this case, parse is a monoid homomorphism.

A monoid homomorphism is a function that maps one monoid to another while preserving monoidal properties (such as associativity and identity).

Is there an immediate practical application of this? Of course there is! Otherwise, I wouldn’t mention it.

First of all, for parse to truly be a monoid homomorphism, it needs to preserve the emptiness property. In other words:

parse "" = []

Which is totally reasonable, we should parse an empty file to a valid config. Moreover, this config must be an empty list.

But second, and most important, if parse is a true monoid homomorphism then it doesn’t matter if we concat files first and then parse or if we first parse and then concat. The result will be the same!

It means, we can actually improve the performance of parsing multiple files. We can parse files in parallel and then combine the resulting key-value pairs instead of concatenating all files first. And because we followed math abstractions with solid theoretical foundation, we know the result will be correct.

This property can become handy when you start representing your cloud configuration a-la K8S with hundreds of config files, and suddenly the pod start time starts to matter.
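
The property can be checked on a toy scale. The sketch below uses a deliberately simplified stand-in parser (one entry per line, no nesting or indentation handling), which is an assumption made to keep the example short. It also runs sequentially; the homomorphism property is what would make a parallel version safe.

```ocaml
type key_val = { key : string; value : string }

(* Simplified stand-in parser: one entry per line, split on the first '='.
   Lines without '=' (including blank ones) are skipped. *)
let parse (text : string) : key_val list =
  String.split_on_char '\n' text
  |> List.filter_map (fun line ->
         match String.index_opt line '=' with
         | None -> None
         | Some i ->
             let key = String.trim (String.sub line 0 i) in
             let value =
               String.trim
                 (String.sub line (i + 1) (String.length line - i - 1))
             in
             Some { key; value })

(* Route 1: parse every file on its own, then append the lists. *)
let parse_then_append (files : string list) : key_val list =
  List.concat_map parse files

(* Route 2: concatenate all files first, then parse once. *)
let cat_then_parse (files : string list) : key_val list =
  parse (String.concat "\n" files)
```

For flat files like the ones in this section, both routes produce the same list, so each file can be parsed independently.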

Pedantic Alert #1: Error-handling

To simplify the explanation, I made an assumption that the function parse doesn’t fail. In practice, it’s absolutely possible for parsing to fail on invalid inputs. Does it mean we can’t benefit from monoid homomorphisms here? Of course not!

The trick here involves two steps:

  1. Represent errors as values.
  2. Use a return type that returns errors while still being a monoid.

To do this, let’s introduce a type like this one:

type ('a, 'e) validation =
  | Failure of 'e list
  | Success of 'a list

In English, this polymorphic type is either a list of errors or a list of successes.

This type is a Monoid with the following implementation:

let empty = Success []

let append val1 val2 =
  match val1, val2 with
  | Failure errors1, Failure errors2 -> Failure (errors1 @ errors2)
  | Failure errors, Success _ -> Failure errors
  | Success _, Failure errors -> Failure errors
  | Success a, Success b -> Success (a @ b)

In English, when we combine two validations:

  1. If both are failures, we just combine all errors.
  2. If exactly one is a failure, we keep its errors and discard the success.
  3. If both are successes, we combine all successes.

And our parse will change its type to:

val parse : string -> (key_val, parse_error) validation

Because validation is a monoid, our parse remains a monoid homomorphism with the semantics of either getting all errors from all sources or appending all successful results if no errors happened.
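
To see these semantics in action, here is a small self-contained sketch that folds several results with append. The helper combine_all is a hypothetical name, and the integers and strings stand in for parsed entries and parse errors.

```ocaml
type ('a, 'e) validation = Failure of 'e list | Success of 'a list

let empty = Success []

let append val1 val2 =
  match (val1, val2) with
  | Failure errors1, Failure errors2 -> Failure (errors1 @ errors2)
  | Failure errors, Success _ -> Failure errors
  | Success _, Failure errors -> Failure errors
  | Success a, Success b -> Success (a @ b)

(* Hypothetical helper: combine the results of many sources, in order. *)
let combine_all results = List.fold_left append empty results
```

With one failing source among several, the combined result is a Failure that carries only the errors; when every source succeeds, the successes are appended in order.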

Pedantic Alert #2: File concatenation

So, again, I simplified things a little. Because CCL is indentation sensitive, appending two files like this:

example1.ccl

example = value start

example2.ccl

  oops = starts indented

and then parsing them is not the same as parsing first and appending later because after appending files we’ll get only one key.

An annoyance but easily fixable: we can’t simply append files, we need to remove leading spaces. An implementation in OCaml:

let cat ccl1 ccl2 = ccl1 ^ "\n" ^ String.trim ccl2
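
As a quick check, here is `cat` applied to the two example files above, next to the naive concatenation. The file contents are inlined as strings for the demo.

```ocaml
let cat ccl1 ccl2 = ccl1 ^ "\n" ^ String.trim ccl2

let example1 = "example = value start"
let example2 = "  oops = starts indented"

(* Naive append keeps the indentation, so the second line would be read
   as a continuation of the first value. *)
let naive = example1 ^ "\n" ^ example2

(* cat strips the leading spaces, so the second line starts a new key. *)
let fixed = cat example1 example2
```

Note that `String.trim` also strips trailing whitespace from the second file, and it only touches the very beginning and end of the string; indentation on later lines is left alone.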

Notice how we benefited from math abstractions: by following them precisely, we discovered an annoying bug and fixed it early.

Bonus: Isomorphism

Btw, a pretty-printing function like this:

val pretty : key_val list -> string

is a monoid homomorphism too. Together with parse, it forms a monoid isomorphism: you can convert both ways while preserving the structure.

This property is incredibly useful for testing when you want to parse, then pretty-print back and make sure you don’t lose any information in the process.
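
A round-trip test of this kind could look like the sketch below. Both `pretty` and `parse` here are toy stand-ins that only handle flat `key = value` lines (an assumption for illustration; the real functions also deal with nesting and errors), and `round_trips` is a hypothetical helper name.

```ocaml
type key_val = { key : string; value : string }

(* Toy pretty-printer: one "key = value" line per entry. *)
let pretty entries =
  entries
  |> List.map (fun { key; value } -> key ^ " = " ^ value)
  |> String.concat "\n"

(* Toy parser: split each line on the first '=' and skip lines without one. *)
let parse text =
  String.split_on_char '\n' text
  |> List.filter_map (fun line ->
         match String.index_opt line '=' with
         | None -> None
         | Some i ->
           let rest = String.sub line (i + 1) (String.length line - i - 1) in
           Some { key = String.trim (String.sub line 0 i); value = String.trim rest })

(* The property to test: printing and parsing back loses nothing. *)
let round_trips entries = parse (pretty entries) = entries
```

The toy versions only round-trip entries whose keys contain no `=` and whose keys and values have no surrounding whitespace; a property-based test would generate inputs within those limits.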

Fixed Point

So far, I haven’t explored one important topic: key overrides.

In configurations, it’s common to have the same key mapped to different values, especially once we start combining configs that have overlapping properties.

Let’s say we have two configs.

default.ccl

trailing-whitespaces = yes

project.ccl

trailing-whitespaces = no

What should be the value of the trailing-whitespaces property once we combine them?

trailing-whitespaces = yes
trailing-whitespaces = no

The quick answer: DIY. When you parse the final config (or parse first and then append lists, remember monoid homomorphisms), you’ll get the following list of key-value pairs:

let overrides_example =
  [
    { key = "trailing-whitespaces"; value = "yes" };
    { key = "trailing-whitespaces"; value = "no" };
  ]

Overrides are not a problem because you keep both values. And you can decide what to do with them: keep only the first, keep only the last or use some smart logic to combine both of them. You’re the boss.
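
For instance, the "keep only the first" and "keep only the last" policies take a few lines each. The helper names below are hypothetical and not part of CCL.

```ocaml
type key_val = { key : string; value : string }

let overrides_example =
  [
    { key = "trailing-whitespaces"; value = "yes" };
    { key = "trailing-whitespaces"; value = "no" };
  ]

(* All values bound to a key, in the order they appear in the config. *)
let all_values key entries =
  List.filter_map
    (fun entry -> if entry.key = key then Some entry.value else None)
    entries

(* Policy 1: the first occurrence wins. *)
let first_value key entries =
  match all_values key entries with
  | [] -> None
  | value :: _ -> Some value

(* Policy 2: the last occurrence wins (later configs override earlier ones). *)
let last_value key entries =
  match List.rev (all_values key entries) with
  | [] -> None
  | value :: _ -> Some value
```

With `overrides_example`, `first_value` picks the default (`yes`) and `last_value` picks the project setting (`no`).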

Unfortunately, this business becomes nasty once you start having nested records. Manually parsing and resolving all the nested key overrides is annoying.

CCL solved this too.

The key idea here is to treat values as... keys (pun intended).

Remember how in the “Nested records” section I mentioned that you can parse values using the same CCL config parser? If you do it recursively until the end, you’ll parse all values.

What? You ask when you stop parsing? That’s the neat part, you don’t.

Or, more precisely, you stop when you can’t parse any more. By applying parsing recursively until parsing doesn’t change anything, you reach a so-called fixed point.

Because we now have this nested structure, we no longer treat CCL as a list of key-value pairs. What is it though?

A configuration in CCL is actually a fixed point for a dictionary from strings to... itself!

In OCaml, it’s pure elegance:

type t = Fix of t Map.Make(String).t

It’s a map that maps strings (i.e. keys) to itself. The only way to stop the recursion is to bind a key to an empty map. Therefore, the final level consists of values mapped to empty maps.

We can have a pure function that turns a list of key-value pairs into this representation:

val fix : key_val list -> t

When we had lists, we could just append lists (a pretty understandable operation). How can we append such fixed points? That’s pretty easy too.

The operation to merge two fixed points is straightforward, using just the OCaml standard library:

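module StringMap = Map.Make (String)

(* Merge key by key; recurse only when both sides define the same key. *)
let rec merge (Fix map1) (Fix map2) =
  Fix
    (StringMap.merge
       (fun _key t1 t2 ->
         match (t1, t2) with
         | None, t2 -> t2
         | t1, None -> t1
         | Some t1, Some t2 -> Some (merge t1 t2))
       map1 map2)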

It looks like this function does nothing (it just recursively calls itself). Yet, it does everything.
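
To make "it does everything" concrete, here is a small sketch that merges two hand-built nested configs. The `leaf`, `node` and `equal` helpers and the sample configs are assumptions for illustration only.

```ocaml
module StringMap = Map.Make (String)

type t = Fix of t StringMap.t

let rec merge (Fix map1) (Fix map2) =
  Fix
    (StringMap.merge
       (fun _key t1 t2 ->
         match (t1, t2) with
         | None, t2 -> t2
         | t1, None -> t1
         | Some t1, Some t2 -> Some (merge t1 t2))
       map1 map2)

(* Hypothetical helpers for building and comparing configs by hand. *)
let leaf = Fix StringMap.empty
let node bindings = Fix (StringMap.of_seq (List.to_seq bindings))
let rec equal (Fix a) (Fix b) = StringMap.equal equal a b

(* server = { host = { localhost }, port = { 8080 } } *)
let default_config =
  node
    [ ("server",
       node [ ("host", node [ ("localhost", leaf) ]);
              ("port", node [ ("8080", leaf) ]) ]) ]

(* server = { port = { 8081 } } *)
let project_config =
  node [ ("server", node [ ("port", node [ ("8081", leaf) ]) ]) ]

(* The merge descends into "server" and then "port" on its own. *)
let combined = merge default_config project_config

(* server = { host = { localhost }, port = { 8080, 8081 } } *)
let expected =
  node
    [ ("server",
       node [ ("host", node [ ("localhost", leaf) ]);
              ("port", node [ ("8080", leaf); ("8081", leaf) ]) ]) ]
```

The `host` key, defined on one side only, is kept as is, while the two `port` entries end up side by side under the same key.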

Now, things are getting juicy. When you have a config like this one:

ports = 8080
ports = 8081

In CCL, it’s actually just syntactic sugar for:

ports =
  8080 =
ports =
  8081 =

You got it right. It’s not two keys of the same name mapped to two different string values. It’s two keys mapped to two different nested singleton objects where keys are values.

Parsing and applying fix merges the keys, and we get the following:

ports =
  8080 =
  8081 =
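
One possible shape for `fix` is sketched below. This is not CCL's actual implementation: `toy_parse` is a simplified stand-in for the real parser that only understands flat `key = value` lines, and the rule "a value that parses to nothing becomes a key bound to an empty map" is an assumption based on the description above.

```ocaml
module StringMap = Map.Make (String)

type key_val = { key : string; value : string }
type t = Fix of t StringMap.t

let empty = Fix StringMap.empty

let rec merge (Fix map1) (Fix map2) =
  Fix
    (StringMap.merge
       (fun _key t1 t2 ->
         match (t1, t2) with
         | None, t2 -> t2
         | t1, None -> t1
         | Some t1, Some t2 -> Some (merge t1 t2))
       map1 map2)

(* Toy stand-in for the CCL parser: flat "key = value" lines only.
   Returns [] when the text contains no entries. *)
let toy_parse text =
  String.split_on_char '\n' text
  |> List.filter_map (fun line ->
         match String.index_opt line '=' with
         | None -> None
         | Some i ->
           let rest = String.sub line (i + 1) (String.length line - i - 1) in
           Some { key = String.trim (String.sub line 0 i); value = String.trim rest })

(* Sketch of fix: keep parsing values until nothing parses any more. *)
let rec fix entries =
  List.fold_left
    (fun acc { key; value } ->
      let nested =
        match toy_parse value with
        | [] when value = "" -> empty (* nothing left: stop here *)
        | [] -> Fix (StringMap.singleton value empty) (* the value becomes a key *)
        | children -> fix children (* nested entries: recurse *)
      in
      merge acc (Fix (StringMap.singleton key nested)))
    empty entries

let rec equal (Fix a) (Fix b) = StringMap.equal equal a b

(* The sugared and desugared forms of the ports example. *)
let sugared = fix (toy_parse "ports = 8080\nports = 8081")

let desugared =
  fix [ { key = "ports"; value = "8080 =" }; { key = "ports"; value = "8081 =" } ]
```

Under these assumptions, both forms produce the same map: a single `ports` key whose nested map holds `8080` and `8081`.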

So now, we can easily combine multiple similar keys and join their values on all levels of nesting with (checks notes) just 10 lines of OCaml.


Wanna hear the cool part? Our fixed point with merge is a monoid too, which means that fix is actually a monoid homomorphism from a list of key-value pairs to a map. And the composition of parse and fix is a composition of monoid homomorphisms, so it’s automatically a monoid homomorphism as well.

And we get all the discussed benefits for free again.
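
The monoid laws for merge can be spot-checked on a few sample values. This is a sketch with hand-picked inputs (the law-checking functions and sample configs are hypothetical names); a property-based test would generate the inputs randomly instead.

```ocaml
module StringMap = Map.Make (String)

type t = Fix of t StringMap.t

(* The identity element: an empty config. *)
let empty = Fix StringMap.empty

let rec merge (Fix map1) (Fix map2) =
  Fix
    (StringMap.merge
       (fun _key t1 t2 ->
         match (t1, t2) with
         | None, t2 -> t2
         | t1, None -> t1
         | Some t1, Some t2 -> Some (merge t1 t2))
       map1 map2)

(* Structural equality that ignores the internal shape of the maps. *)
let rec equal (Fix a) (Fix b) = StringMap.equal equal a b

(* The three monoid laws, stated as checks on concrete values. *)
let left_identity a = equal (merge empty a) a
let right_identity a = equal (merge a empty) a
let associative a b c = equal (merge (merge a b) c) (merge a (merge b c))

(* Hand-built samples, for illustration only. *)
let leaf key = Fix (StringMap.singleton key empty)
let ports_8080 = Fix (StringMap.singleton "ports" (leaf "8080"))
let ports_8081 = Fix (StringMap.singleton "ports" (leaf "8081"))
let host = Fix (StringMap.singleton "host" (leaf "localhost"))
```

Passing these checks on a handful of values does not prove the laws, but it is the same shape of assertion a property-based test would run on generated configs.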

That’s why we needed only 10 lines of OCaml. We’re standing on the shoulders of giants who have been creating math for thousands of years.

What’s next?

I have a CCL PoC built in OCaml on GitHub.

It works. It passes the tests. It’s not production-ready.

Meaning that the documentation might lack important details, performance isn’t optimised, the code is ugly in some places, the quality of error messages is poor, utility functions are lacking, and breaking changes are expected. You know, just like a typical SaaS product, except CCL is free and was done in 10 evenings.

However, I’ve implemented an exhaustive test suite with dozens of unit tests covering diverse edge-cases as well as property-based tests to verify algebraic laws.

Example of passing tests (they pass, it’s true)

If you try it and discover bugs, please report! I’ll try to fix them.

Currently, I’m focusing on building GitHub TUI in OCaml. If at some point the need for a config arises, I’ll return to ccl.

For example, to make CCL production-ready, one essential API piece is missing: decoding CCL values into actual programming values.

I envision an API in OCaml like this one:

type user = {
  login: string;
  name: string;
  last_active: int;  (* timestamp in UNIX epoch *)
}

let codec =
  Ccl.(
    let+ login = Codec.string "login" (fun {login; _} -> login)
    and+ name = Codec.string "name" (fun {name; _} -> name)
    and+ last_active =
      Codec.int
        "last_active"
        (fun {last_active; _} -> last_active)
    in
    { login; name; last_active }
  )

I used to work on something similar in the past, using Category Theory abstractions such as Category, Isomorphisms, Profunctors and Applicative Functors. But I’ll save this for another article.

The implementation of CCL is so simple, you can even try implementing it in your favourite language, and this could be a nice hobby project!


My goal is not to make everyone use this new language. What I want is to inspire you.

I hope you can see how much you can achieve with so little.

I hope you’ll try to pursue simplicity as well.

I hope we can make the software better together.


If you liked this blog post, consider supporting my work on GitHub Sponsors, or following me on the Internet: