A Universal Algorithm for Sequential Data Compression 用于顺序数据压缩的通用算法
JACOB ZIV, FELLOW, IEEE, AND ABRAHAM LEMPEL, MEMBER, IEEE 雅各布·齐夫, IEEE 院士, 亚伯拉罕·伦普尔, IEEE 会员
Abstract 摘要
A universal algorithm for sequential data compression is presented. Its performance is investigated with respect to a nonprobabilistic model of constrained sources. The compression ratio achieved by the proposed universal code uniformly approaches the lower bounds on the compression ratios attainable by block-to-variable codes and variable-to-block codes designed to match a completely specified source. 呈现了一种用于序列数据压缩的通用算法。其性能是根据受约束源的非概率模型进行的调查。所提出的通用代码所获得的压缩率均匀地接近了块到可变长编码和可变长到块编码所能达到的压缩率下限,这些编码旨在匹配完全指定的源。
I. INTRODUCTION 一、 引言
IN MANY situations arising in digital communications and data processing, the encountered strings of data display various structural regularities or are otherwise subject to certain constraints, thereby allowing for storage and time-saving techniques of data compression. Given a discrete data source, the problem of data compression is first to identify the limitations of the source, and second to devise a coding scheme which, subject to certain performance criteria, will best compress the given source. 在数字通信和数据处理中的许多情况下,遇到的数据串显示各种结构性规则或受到某些约束,从而允许使用数据压缩的存储和节时技术。给定一个离散数据源,数据压缩的问题首先是识别该源的局限性,其次是设计一个在某些性能标准下最佳压缩给定源的编码方案。
Once the relevant source parameters have been identified, the problem reduces to one of minimum-redundancy coding. This phase of the problem has received extensive treatment in the literature [1]-[7]. 一旦确定了相关的源参数,问题就简化为最小冗余编码的问题。这个问题的这一阶段在文献[1]-[7]中已经得到了广泛的研究。
When no a priori knowledge of the source characteristics is available, and if statistical tests are either impossible or unreliable, the problem of data compression becomes considerably more complicated. In order to overcome these difficulties one must resort to universal coding schemes whereby the coding process is interlaced with a learning process for the varying source characteristics [8], [9]. Such coding schemes inevitably require a larger working memory space and generally employ performance criteria that are appropriate for a wide variety of sources. 当没有先验源特征知识可用,且统计检验要么不可能要么不可靠时,数据压缩问题将变得更加复杂。为克服这些困难,必须依赖通用编码方案,其中编码过程与对变化源特征的学习过程交织在一起。这种编码方案必然需要更大的工作内存空间,并通常采用适合多种源的性能准则。
In this paper, we describe a universal coding scheme which can be applied to any discrete source and whose performance is comparable to certain optimal fixed code book schemes designed for completely specified sources. For lack of adequate criteria, we do not attempt to rank the proposed scheme with respect to other possible universal coding schemes. Instead, for the broad class of sources defined in Section III, we derive upper bounds on the compression efficiency attainable with full a priori knowledge of the source by fixed code book schemes, and 在本文中,我们描述了一种通用编码方案,可应用于任何离散源,其性能可与为特定源设计的某些最优定长码本方案相媲美。由于缺乏适当的标准,我们没有尝试将所提出的方案与其他可能的通用编码方案进行排名比较。相反,对于第三节中定义的广泛源类,我们推导了利用定长码本方案在完全事先知道源的情况下可以达到的最大压缩效率的上界。
then show that the efficiency of our universal code with no a priori knowledge of the source approaches those bounds. 然后证明我们的无先验知识的通用编码的效率接近于这些界限。
The proposed compression algorithm is an adaptation of a simple copying procedure discussed recently [10] in a study on the complexity of finite sequences. Basically, we employ the concept of encoding future segments of the source-output via maximum-length copying from a buffer containing the recent past output. The transmitted codeword consists of the buffer address and the length of the copied segment. With a predetermined initial load of the buffer and the information contained in the codewords, the source data can readily be reconstructed at the decoding end of the process. 该提议的压缩算法是对最近[10]在关于有限序列复杂性的研究中讨论的简单复制过程的改编。基本上,我们采用通过从包含最近过去输出的缓冲区进行最大长度复制来对源输出的未来段进行编码的概念。传输的代码字由缓冲区地址和复制段的长度组成。通过缓冲区的预定初始负载和代码字中包含的信息,可以在解码过程的末端轻松地重构源数据。
The main drawback of the proposed algorithm is its susceptibility to error propagation in the event of a channel error. 所提出算法的主要缺点是易受信道错误传播的影响。
II. The Compression Algorithm 压缩算法
The proposed compression algorithm consists of a rule for parsing strings of symbols from a finite alphabet AA into substrings, or words, whose lengths do not exceed a prescribed integer L_(s)L_{s}, and a coding scheme which maps these substrings sequentially into uniquely decipherable codewords of fixed length L_(c)L_{c} over the same alphabet AA. 提议的压缩算法由以下两部分组成:一个将来自有限字母表 AA 的符号串解析为长度不超过预定整数 L_(s)L_{s} 的子串(或单词)的规则,以及一个将这些子串按顺序映射到同一字母表 AA 上的固定长度 L_(c)L_{c} 的唯一可解密编码字的编码方案。
The word-length bounds L_(s)L_{s} and L_(c)L_{c} allow for boundeddelay encoding and decoding, and they are related by 词长界限 L_(s)L_{s} 和 L_(c)L_{c} 允许有界延迟编码和解码,它们之间有如下关系
where |~x~|\lceil x\rceil is the least integer not smaller than xx, the logarithm base is the cardinality alpha\alpha of the alphabet AA, and nn is the length of a buffer, employed at the encoding end of the process, which stores the latest nn symbols emitted by the source. The exact relationship between nn and L_(s)L_{s} is discussed in Section III. Typically, n≃L_(s)alpha^(hL_(s))n \simeq L_{s} \alpha^{h L_{s}}, where 0 < h < 1<h<1. For on-line decoding, a buffer of similar length has to be employed at the decoding end also. 其中 |~x~|\lceil x\rceil 是不小于 xx 的最小整数,对数底数是字母表 AA 的基数 alpha\alpha , nn 是编码端使用的缓冲区长度,存储源最近输出的 nn 个符号。 nn 和 L_(s)L_{s} 之间的确切关系在第三节中讨论。通常, n≃L_(s)alpha^(hL_(s))n \simeq L_{s} \alpha^{h L_{s}} ,其中 0 < h < 1<h<1 。对于在线解码,解码端也需要使用类似长度的缓冲区。
To describe the exact mechanics of the parsing and coding procedures, we need some preparation by way of notation and definitions. 为了描述解析和编码过程的确切机制,我们需要通过符号和定义进行一些准备工作。
Consider a finite alphabet AA of alpha\alpha symbols, say A=A={0,1,cdots,alpha-1}\{0,1, \cdots, \alpha-1\}. A string, or word, SS of length ℓ(S)=k\ell(S)=k over AA is an ordered kk-tuple S=s_(1)s_(2)cdotss_(k)S=s_{1} s_{2} \cdots s_{k} of symbols from AA. To indicate a substring of SS which starts at position ii and ends at position jj, we write S(i,j)S(i, j). When i <= j,S(i,j)=i \leq j, S(i, j)=s_(i)s_(i+1)cdotss_(j)s_{i} s_{i+1} \cdots s_{j}, but when i > ji>j, we take S(i,j)=LambdaS(i, j)=\Lambda, the null string of length zero. 考虑一个有限字母表 AA ,其中有 alpha\alpha 个符号,如 A=A={0,1,cdots,alpha-1}\{0,1, \cdots, \alpha-1\} 。长度为 ℓ(S)=k\ell(S)=k 的字符串或单词 SS 是从 AA 中选择的符号的有序 kk 元组 S=s_(1)s_(2)cdotss_(k)S=s_{1} s_{2} \cdots s_{k} 。为了表示 SS 中从位置 ii 开始到位置 jj 结束的子字符串,我们写为 S(i,j)S(i, j) 。当 i <= j,S(i,j)=i \leq j, S(i, j)=s_(i)s_(i+1)cdotss_(j)s_{i} s_{i+1} \cdots s_{j} 时,但是当 i > ji>j 时,我们取 S(i,j)=LambdaS(i, j)=\Lambda ,即长度为零的空字符串。
The concatenation of strings QQ and RR forms a new string S=QRS=Q R; if ℓ(Q)=k\ell(Q)=k and ℓ(R)=m\ell(R)=m, then ℓ(S)=k+m,Q\ell(S)=k+m, Q=S(1,k)=S(1, k), and R=S(k+1,k+m)R=S(k+1, k+m). For each j,0 <= j <=j, 0 \leq j \leq 字符串 QQ 和 RR 的连接形成一个新的字符串 S=QRS=Q R ;如果 ℓ(Q)=k\ell(Q)=k 和 ℓ(R)=m\ell(R)=m ,则 ℓ(S)=k+m,Q\ell(S)=k+m, Q=S(1,k)=S(1, k) , R=S(k+1,k+m)R=S(k+1, k+m) 。对于每个 j,0 <= j <=j, 0 \leq j \leq ℓ(S),S(1,j)\ell(S), S(1, j) is called a prefix of S;S(1,j)S ; S(1, j) is a proper prefix of SS if j < ℓ(S)j<\ell(S). ℓ(S),S(1,j)\ell(S), S(1, j) 是 S;S(1,j)S ; S(1, j) 的前缀,如果 j < ℓ(S)j<\ell(S) ,则 ℓ(S),S(1,j)\ell(S), S(1, j) 是 S;S(1,j)S ; S(1, j) 的真前缀。
Given a proper prefix S(1,j)S(1, j) of a string SS and a positive integer ii such that i <= ji \leq j, let L(i)L(i) denote the largest nonnegative integer ℓ <= ℓ(S)-j\ell \leqslant \ell(S)-j such that 给定一个字符串 SS 的合适前缀 S(1,j)S(1, j) 和一个正整数 ii ,使得 i <= ji \leq j ,令 L(i)L(i) 表示最大的非负整数 ℓ <= ℓ(S)-j\ell \leqslant \ell(S)-j ,使得
and let pp be a position of S(1,j)S(1, j) for which 并且让 pp 成为 S(1,j)S(1, j) 的一个位置
L(p)=max_(1 <= i <= j){L(i)}L(p)=\max _{1 \leq i \leq j}\{L(i)\}
The substring S(j+1,j+L(p))S(j+1, j+L(p)) of SS is called the reproducible extension of S(1,j)S(1, j) into SS, and the integer pp is called the pointer of the reproduction. For example, if SS=00101011=00101011 and j=3j=3, then L(1)=1L(1)=1 since S(j+1,j+1)=S(j+1, j+1)=S(1,1)S(1,1) but S(j+1,j+2)!=S(1,2)S(j+1, j+2) \neq S(1,2). Similarly, L(2)=4L(2)=4 and L(3)=0L(3)=0. Hence, S(3+1,3+4)=0101S(3+1,3+4)=0101 is the reproducible extension of S(1,3)=001S(1,3)=001 into SS with pointer p=2p=2. S(j+1,j+L(p))S(j+1, j+L(p)) 是 SS 的可重复扩展到 SS ,整数 pp 是该重复的指针。例如,如果 SS=00101011=00101011 和 j=3j=3 ,那么 L(1)=1L(1)=1 ,因为 S(j+1,j+1)=S(j+1, j+1)=S(1,1)S(1,1) 但 S(j+1,j+2)!=S(1,2)S(j+1, j+2) \neq S(1,2) 。同样地, L(2)=4L(2)=4 和 L(3)=0L(3)=0 。因此, S(3+1,3+4)=0101S(3+1,3+4)=0101 是 S(1,3)=001S(1,3)=001 到 SS 的可重复扩展,指针为 p=2p=2 。
Now, to describe the encoding process, let S=s_(1)s_(2)dotsS=s_{1} s_{2} \ldots denote the string of symbols emitted by the source. The sequential encoding of SS entails parsing SS into successive source words, S=S_(1)S_(2)cdotsS=S_{1} S_{2} \cdots, and assigning a codeword C_(i)C_{i} for each S_(i)S_{i}. For bounded-delay encoding, the length ℓ_(i)\ell_{i} of each S_(i)S_{i} is at most equal to a predetermined parameter L_(s)L_{s}, while each C_(i)C_{i} is of fixed length L_(c)L_{c} as given by (1). 现在,为了描述编码过程,让 S=s_(1)s_(2)dotsS=s_{1} s_{2} \ldots 表示源发出的符号串。 SS 的顺序编码涉及将 SS 解析为连续的源字, S=S_(1)S_(2)cdotsS=S_{1} S_{2} \cdots ,并为每个 S_(i)S_{i} 分配一个代码字 C_(i)C_{i} 。对于有界延迟编码,每个 S_(i)S_{i} 的长度 ℓ_(i)\ell_{i} 最多等于预先确定的参数 L_(s)L_{s} ,而每个 C_(i)C_{i} 的长度 L_(c)L_{c} 是固定的,由(1)式给出。
To initiate the encoding process, we assume that the output SS of the source was preceded by a string ZZ of n-n-L_(s)L_{s} zeros, and we store the string B_(1)=ZS(1,L_(s))B_{1}=Z S\left(1, L_{s}\right) in the buffer. If S(1,j)S(1, j) is the reproducible extension of ZZ into ZS(1,L_(s)-1)Z S\left(1, L_{s}-1\right), then S_(1)=S(1,j+1)S_{1}=S(1, j+1) and ℓ_(1)=j+1\ell_{1}=j+1. To determine the next source word, we shift out the first ℓ_(1)\ell_{1} symbols from the buffer and feed into it the next ℓ_(1)\ell_{1} symbols of SS to obtain the string B_(2)=B_(1)(ℓ_(1)+1,n)S(L_(s)+1:}B_{2}=B_{1}\left(\ell_{1}+1, n\right) S\left(L_{s}+1\right., L_(s)+ℓ_(1)L_{s}+\ell_{1} ). Now we look for the reproducible extension EE of B_(2)(1,n-L_(s))B_{2}\left(1, n-L_{s}\right) into B_(2)(1,n-1)B_{2}(1, n-1), and set S_(2)=EsS_{2}=E s, where ss is the symbol next to EE in B_(2)B_{2}. In general, if B_(i)B_{i} denotes the string of nn source symbols stored in the buffer when we are ready to determine the ii th source word S_(i)S_{i}, the successive encoding steps can be formally described as follows. 为了启动编码过程,我们假设源输出 SS 是由长度为 n-n- 的 L_(s)L_{s} 个 0 组成的字符串 ZZ 前缀的,并将字符串 B_(1)=ZS(1,L_(s))B_{1}=Z S\left(1, L_{s}\right) 存储在缓冲区中。如果 S(1,j)S(1, j) 是 ZZ 延伸到 ZS(1,L_(s)-1)Z S\left(1, L_{s}-1\right) 的可复制扩展,那么 S_(1)=S(1,j+1)S_{1}=S(1, j+1) 和 ℓ_(1)=j+1\ell_{1}=j+1 。为了确定下一个源词,我们从缓冲区中移出前 ℓ_(1)\ell_{1} 个符号,并将 SS 的下一个 ℓ_(1)\ell_{1} 个符号输入到其中,以获得字符串 B_(2)=B_(1)(ℓ_(1)+1,n)S(L_(s)+1:}B_{2}=B_{1}\left(\ell_{1}+1, n\right) S\left(L_{s}+1\right. , L_(s)+ℓ_(1)L_{s}+\ell_{1} )。现在我们寻找 B_(2)(1,n-L_(s))B_{2}\left(1, n-L_{s}\right) 延伸到 B_(2)(1,n-1)B_{2}(1, n-1) 的可重复扩展 EE ,并设置 S_(2)=EsS_{2}=E s ,其中 ss 是 B_(2)B_{2} 中紧挨着 EE 的符号。一般而言,如果 B_(i)B_{i} 表示在我们准备确定第 ii 个源词 S_(i)S_{i} 时存储在缓冲区中的 nn 个源符号的字符串,则可以将连续的编码步骤形式化地描述如下。
Initially, set B_(1)=0^(n-L_(s))S(1,L_(s))B_{1}=0^{n-L_{s}} S\left(1, L_{s}\right), i.e., the all-zero string of length n-L_(s)n-L_{s} followed by the first L_(s)L_{s} symbols of SS. 最初,设置 B_(1)=0^(n-L_(s))S(1,L_(s))B_{1}=0^{n-L_{s}} S\left(1, L_{s}\right) ,即长度为 n-L_(s)n-L_{s} 的全零字符串,后跟 SS 的前 L_(s)L_{s} 个符号。
Having determined B_(i),i >= 1B_{i}, i \geq 1, set 确定 B_(i),i >= 1B_{i}, i \geq 1 之后,设置
where the prefix of length ℓ_(i)-1\ell_{i}-1 of S_(i)S_{i} is the reproducible extension of B_(i)(1,n-L_(s))B_{i}\left(1, n-L_{s}\right) into B_(i)(1,n-1)B_{i}(1, n-1). 其中长度为 ℓ_(i)-1\ell_{i}-1 的 S_(i)S_{i} 前缀是将 B_(i)(1,n-L_(s))B_{i}\left(1, n-L_{s}\right) 扩展到 B_(i)(1,n-1)B_{i}(1, n-1) 的可重复扩展。
3) If p_(i)p_{i} is the reproduction pointer used to determine S_(i)S_{i}, then the codeword C_(i)C_{i} for S_(i)S_{i} is given by 如果 p_(i)p_{i} 是用于确定 S_(i)S_{i} 的复制指针,那么 S_(i)S_{i} 的代码字 C_(i)C_{i} 为
where C_(i1)C_{i 1} is the radix- alpha\alpha representation of p_(i)-1p_{i}-1 with ℓ(C_(i1))=|~log(n-L_(s))~|,C_(i2)\ell\left(C_{i 1}\right)=\left\lceil\log \left(n-L_{s}\right)\right\rceil, C_{i 2} is the radix- alpha\alpha representation of ℓ_(i)-1\ell_{i}-1 with ℓ(C_(i2))=|~log L_(s)~|\ell\left(C_{i 2}\right)=\left\lceil\log L_{s}\right\rceil, and C_(i3)C_{i 3} is the last symbol of S_(i)S_{i}, i.e., the symbol occupying position n-L_(s)+ℓ_(i)n-L_{s}+\ell_{i} of B_(i)B_{i}. The total length of C_(i)C_{i} is given by 其中 C_(i1)C_{i 1} 是 p_(i)-1p_{i}-1 的基数- alpha\alpha 表示形式, ℓ(C_(i1))=|~log(n-L_(s))~|,C_(i2)\ell\left(C_{i 1}\right)=\left\lceil\log \left(n-L_{s}\right)\right\rceil, C_{i 2} 是 ℓ_(i)-1\ell_{i}-1 的基数- alpha\alpha 表示形式, ℓ(C_(i2))=|~log L_(s)~|\ell\left(C_{i 2}\right)=\left\lceil\log L_{s}\right\rceil , C_(i3)C_{i 3} 是 S_(i)S_{i} 的最后一个符号,即 B_(i)B_{i} 第 n-L_(s)+ℓ_(i)n-L_{s}+\ell_{i} 个位置的符号。 C_(i)C_{i} 的总长度由
in accordance with (1). 根据(1)。
4) To update the contents of the buffer, shift out the symbols occupying the first ℓ_(i)\ell_{i} positions of the buffer while feeding in the next ℓ_(i)\ell_{i} symbols from the source to obtain 更新缓冲区内容时,先移出缓冲区前 ℓ_(i)\ell_{i} 个位置占用的符号,然后从源中输入下一个 ℓ_(i)\ell_{i} 个符号
where h_(i)h_{i} is the position of SS occupied by the last symbol of B_(i)B_{i}. 其中 h_(i)h_{i} 是 SS 占用 B_(i)B_{i} 最后一个符号的位置。
This completes the description of the encoding process. It is easy to verify that the parsing rule defined by (2) guarantees a bounded, positive source word length in each iteration; in fact, 1 <= ℓ_(i) <= L_(s)1 \leq \ell_{i} \leq L_{s} for each ii thus allowing for a radix- alpha\alpha representation of ℓ_(i)-1\ell_{i}-1 with |~log L_(s)~|\left\lceil\log L_{s}\right\rceil symbols from AA. Also, since 1 <= p_(i) <= n-L_(s)1 \leq p_{i} \leq n-L_{s} for each ii, it is possible to represent p_(i)-1p_{i}-1 with |~log(n-L_(s))~|\left\lceil\log \left(n-L_{s}\right)\right\rceil symbols from AA. 这完成了编码过程的描述。很容易验证由(2)定义的解析规则保证了每次迭代中源词长度的有界性和正性; 事实上, 1 <= ℓ_(i) <= L_(s)1 \leq \ell_{i} \leq L_{s} 对于每一个 ii 从而允许用 alpha\alpha 进制表示 ℓ_(i)-1\ell_{i}-1 中 |~log L_(s)~|\left\lceil\log L_{s}\right\rceil 个符号来自 AA 。此外,由于 1 <= p_(i) <= n-L_(s)1 \leq p_{i} \leq n-L_{s} 对于每一个 ii ,因此可用 |~log(n-L_(s))~|\left\lceil\log \left(n-L_{s}\right)\right\rceil 个来自 AA 的符号表示 p_(i)-1p_{i}-1 。
Decoding can be performed simply by reversing the encoding process. Here we employ a buffer of length n-n-L_(s)L_{s} to store the latest decoded source symbols. Initially, the buffer is loaded with n-L_(s)n-L_{s} zeros. If D_(i)=d_(1)d_(2)cdotsd_(n-L_(s))D_{i}=d_{1} d_{2} \cdots d_{n-L_{s}} denotes the contents of the buffer after C_(i-1)C_{i-1} has been decoded into S_(i-1)S_{i-1}, then 解码只需简单地反转编码过程。这里我们使用长度为 n-n- 的缓冲区 L_(s)L_{s} 来存储最近解码的源符号。初始时,缓冲区被加载为 n-L_(s)n-L_{s} 个零。如果 D_(i)=d_(1)d_(2)cdotsd_(n-L_(s))D_{i}=d_{1} d_{2} \cdots d_{n-L_{s}} 表示在 C_(i-1)C_{i-1} 被解码为 S_(i-1)S_{i-1} 之后缓冲区的内容,那么
where ℓ_(i-1)=ℓ(S_(i-1))\ell_{i-1}=\ell\left(S_{i-1}\right), and where D_(i+1)D_{i+1} can be obtained from D_(i)D_{i} and C_(i)C_{i} as follows. 当 ℓ_(i-1)=ℓ(S_(i-1))\ell_{i-1}=\ell\left(S_{i-1}\right) ,且可以从 D_(i)D_{i} 和 C_(i)C_{i} 获得 D_(i+1)D_{i+1} ,如下所示。
Determine p_(i)-1p_{i}-1 and ℓ_(i)-1\ell_{i}-1 from the first |~log(n-L_(s))~|\left\lceil\log \left(n-L_{s}\right)\right\rceil and the next |~log L_(s)~|\left\lceil\log L_{s}\right\rceil symbols of C_(i)C_{i}. Then, apply ℓ_(i)-1\ell_{i}-1 shifts while feeding the contents of stage p_(i)p_{i} into stage n-n-L_(s)L_{s}. The first of these shifts will change the buffer contents from D_(i)D_{i} to 从 C_(i)C_{i} 的前 |~log(n-L_(s))~|\left\lceil\log \left(n-L_{s}\right)\right\rceil 个和后 |~log L_(s)~|\left\lceil\log L_{s}\right\rceil 个符号确定 p_(i)-1p_{i}-1 和 ℓ_(i)-1\ell_{i}-1 。然后在将 p_(i)p_{i} 阶段的内容输送到 n-n- 阶段时应用 ℓ_(i)-1\ell_{i}-1 个位移操作。第一次位移将把缓冲区内容从 D_(i)D_{i} 改变为
Similarly, if j <= ℓ_(i)-1j \leq \ell_{i}-1, the jj th shift will transform D_(i)^((j-1))D_{i}^{(j-1)}=d_(1)^((j-1))d_(2)^((j-1))cdotsd_(n-L_(s))^((j-1))=d_{1}^{(j-1)} d_{2}^{(j-1)} \cdots d_{n-L_{s}}^{(j-1)} into D_(i)^((j))=d_(2)^((j-1))d_(3)^((j-1))cdotsD_{i}^{(j)}=d_{2}^{(j-1)} d_{3}^{(j-1)} \cdotsd_(n-L_(s))^((j-1))d_(p_(i))^((j-1))=d_(1)^((j))d_(2)^((j))cdotsd_(n-L_(s))^((j))d_{n-L_{s}}^{(j-1)} d_{p_{i}}^{(j-1)}=d_{1}^{(j)} d_{2}^{(j)} \cdots d_{n-L_{s}}^{(j)}. After these ℓ_(i)-1\ell_{i}-1 shifts are completed, shift once more, while feeding the last symbol of C_(i)C_{i} into stage n-L_(s)n-L_{s} of the buffer. It is easy to verify that the resulting load of the buffer contains S_(i)S_{i} in its last ℓ_(i)=ℓ(S_(i))\ell_{i}=\ell\left(S_{i}\right) positions. 类似地,如果 j <= ℓ_(i)-1j \leq \ell_{i}-1 ,第 jj 次移位将 D_(i)^((j-1))D_{i}^{(j-1)}=d_(1)^((j-1))d_(2)^((j-1))cdotsd_(n-L_(s))^((j-1))=d_{1}^{(j-1)} d_{2}^{(j-1)} \cdots d_{n-L_{s}}^{(j-1)} 转换为 D_(i)^((j))=d_(2)^((j-1))d_(3)^((j-1))cdotsD_{i}^{(j)}=d_{2}^{(j-1)} d_{3}^{(j-1)} \cdotsd_(n-L_(s))^((j-1))d_(p_(i))^((j-1))=d_(1)^((j))d_(2)^((j))cdotsd_(n-L_(s))^((j))d_{n-L_{s}}^{(j-1)} d_{p_{i}}^{(j-1)}=d_{1}^{(j)} d_{2}^{(j)} \cdots d_{n-L_{s}}^{(j)} 。在这些 ℓ_(i)-1\ell_{i}-1 次移位完成后,再移位一次,同时将 C_(i)C_{i} 的最后一个符号输入缓冲区的 n-L_(s)n-L_{s} 级。很容易验证,缓冲区的最终负载在其最后 ℓ_(i)=ℓ(S_(i))\ell_{i}=\ell\left(S_{i}\right) 个位置包含 S_(i)S_{i} 。
The following example will serve to illustrate the mechanics of the algorithm. Consider the ternary ( alpha=3\alpha=3 ) input string 下面的示例将说明该算法的机制。考虑三进制( alpha=3\alpha=3 )输入字符串
and an encoder with parameters L_(s)=9L_{s}=9 and n=18n=18. (These parameters were chosen to simplify the illustration; they do not reflect the design considerations to be discussed in Section III.) According to (1), the corresponding codeword length is given by 以及一个参数为 L_(s)=9L_{s}=9 和 n=18n=18 的编码器。(这些参数是为了简化说明而选择的;它们并不反映第三节中要讨论的设计考虑因素。)根据(1),相应的码字长度由
Initially, the buffer is loaded with n-L_(s)=9n-L_{s}=9 zeros, followed by the first L_(s)=9L_{s}=9 digits of SS, namely, 最初,缓冲区被加载了 n-L_(s)=9n-L_{s}=9 个零,然后是 SS 的前 L_(s)=9L_{s}=9 个数字,即
To determine the first source word S_(1)S_{1}, we have to find the longest prefix B_(1)(10,9+ℓ_(1)-1)B_{1}\left(10,9+\ell_{1}-1\right) of 要确定第一个源词语 S_(1)S_{1} ,我们必须找到最长的前缀 B_(1)(10,9+ℓ_(1)-1)B_{1}\left(10,9+\ell_{1}-1\right)
B_(1)(10,17)=00101021B_{1}(10,17)=00101021
which matches a substring of B_(1)B_{1} that starts in position p_(1)p_{1}<= 9\leq 9 and then set S_(1)=B_(1)(10,9+ℓ_(1))S_{1}=B_{1}\left(10,9+\ell_{1}\right). It is easily seen that the longest match in this case is B_(1)(10,11)=00B_{1}(10,11)=00, and hence S_(1)=001S_{1}=001 and ℓ_(1)=3\ell_{1}=3. The pointer p_(1)p_{1} for this step can be any integer between one and nine; we choose to set p_(1)=9p_{1}=9. The two-digit radix-3 representation of p_(1)-1p_{1}-1 is C_(11)=22C_{11}=22, and that of ℓ_(1)-1\ell_{1}-1 is C_(12)=02C_{12}=02. Since C_(i3)C_{i 3} is always equal to the last symbol of S_(i)S_{i}, the codeword for S_(1)S_{1} is given by C_(1)=C_{1}= 22021. 该匹配 B_(1)B_{1} 的子字符串,从位置 p_(1)p_{1}<= 9\leq 9 开始,然后设置 S_(1)=B_(1)(10,9+ℓ_(1))S_{1}=B_{1}\left(10,9+\ell_{1}\right) 。很容易看出,在这种情况下最长的匹配是 B_(1)(10,11)=00B_{1}(10,11)=00 ,因此 S_(1)=001S_{1}=001 和 ℓ_(1)=3\ell_{1}=3 。这一步中指针 p_(1)p_{1} 可以是 1 到 9 之间的任意整数;我们选择设置 p_(1)=9p_{1}=9 。 p_(1)-1p_{1}-1 的二进制三进位表示是 C_(11)=22C_{11}=22 , ℓ_(1)-1\ell_{1}-1 的二进制三进位表示是 C_(12)=02C_{12}=02 。由于 C_(i3)C_{i 3} 始终等于 S_(i)S_{i} 的最后一个符号, S_(1)S_{1} 的码字是由 C_(1)=C_{1}= 22021 给出的。
To obtain the buffer load B_(2)B_{2} for the second step, we shift out the first ℓ_(1)=3\ell_{1}=3 digits of B_(1)B_{1} and feed in the next 3 digits S(10,12)=210S(10,12)=210 of the input string SS. The details of steps 2,3 , and 4 are tabulated below, where pointer positions are indicated by arrows and where the source words S_(i)S_{i} are indicated by the italic substring of the corresponding buffer load B_(i)B_{i} 要获得第二步的缓冲区负载 B_(2)B_{2} ,我们将 B_(1)B_{1} 的前 ℓ_(1)=3\ell_{1}=3 位移出并输入输入字符串 SS 的下 3 位 S(10,12)=210S(10,12)=210 。下表列出了步骤 2、3 和 4 的详细信息,其中指针位置由箭头指示,源词 S_(i)S_{i} 由相应缓冲区负荷 B_(i)B_{i} 的斜体子字符串表示。
In this section, we investigate the performance of the proposed compression algorithm with respect to a nonprobabilistic model of constrained information sources. After defining the source model, we derive lower bounds on the compression ratios attainable by block-to-variable and variable-to-block codes under full knowledge of the source, and then show that the compression ratio achieved by our universal code approaches these bounds. 在本节中,我们研究了所提出的压缩算法在受约束信息源的非概率模型下的性能。在定义源模型后,我们推导出在完全了解源的情况下,块变长码和变长块码所能达到的压缩率下限,然后证明我们的通用码所实现的压缩率接近这些下限。
A. Definition of the Source Model 源模型的定义
Let A={0,1,cdots,alpha-1}A=\{0,1, \cdots, \alpha-1\} be the given alpha\alpha-symbol alphabet, and let A^(**)A^{*} denote the set of all finite strings over AA. Given a string S inA^(**)S \in A^{*} and a positive integer m <= ℓ(S)m \leq \ell(S), let S{m}S\{m\} denote the set of all substrings of length mm contained in SS, and let S(m)S(m) denote the cardinality of S{m}S\{m\}. That is, 令 A={0,1,cdots,alpha-1}A=\{0,1, \cdots, \alpha-1\} 为给定的 alpha\alpha 字符集, 令 A^(**)A^{*} 表示 AA 上的所有有限字符串的集合. 给定一个字符串 S inA^(**)S \in A^{*} 和一个正整数 m <= ℓ(S)m \leq \ell(S) , 令 S{m}S\{m\} 表示 SS 中长度为 mm 的所有子串的集合, 令 S(m)S(m) 表示 S{m}S\{m\} 的基数.
Given a subset sigma\sigma of A^(**)A^{*}, let 给定 sigma\sigma 是 A^(**)A^{*} 的子集, 令
sigma{m}={S in sigma∣ℓ(S)=m}\sigma\{m\}=\{S \in \sigma \mid \ell(S)=m\}
and let sigma(m)\sigma(m) denote the cardinality of sigma{m}\sigma\{m\}. 并且让 sigma(m)\sigma(m) 表示 sigma{m}\sigma\{m\} 的基数。
A subset sigma\sigma of A^(**)A^{*} is called a source if the following three properties hold: 子集 sigma\sigma 称为源,如果满足以下三个属性:
A sub sigmaA \subset \sigma (i.e., sigma\sigma contains all the unit length strings), A sub sigmaA \subset \sigma (即, sigma\sigma 包含所有单位长度的字符串),
S in sigmaS \in \sigma implies SS in sigmaS S \in \sigma, S in sigmaS \in \sigma 意味着 SS in sigmaS S \in \sigma
S in sigmaS \in \sigma implies S{m}sub sigma{m}S\{m\} \subset \sigma\{m\}. S in sigmaS \in \sigma 意味着 S{m}sub sigma{m}S\{m\} \subset \sigma\{m\} 。
Typically, such a source sigma\sigma is defined by specifying a finite set of strings over AA which are forbidden to appear as. substrings of elements belonging to sigma\sigma, and therefore sigma(m)\sigma(m)< alpha^(m)<\alpha^{m} for all mm exceeding some m_(0)m_{0}. 通常,这种来源 sigma\sigma 是通过指定一组有限的字符串,它们不能作为 sigma\sigma 中元素的子串而被定义的,因此 sigma(m)\sigma(m)< alpha^(m)<\alpha^{m} 所有 mm 超过某个 m_(0)m_{0} 。
With every source sigma\sigma, we associate a sequence h(1)h(1), h(2),cdotsh(2), \cdots of parameters, called the hh-parameters of sigma\sigma, where ^(1){ }^{1} 对于每个源 sigma\sigma ,我们都会关联一个参数序列 h(1)h(1) , h(2),cdotsh(2), \cdots ,称为 hh 参数,其中 ^(1){ }^{1}
It is clear that 0 <= h(m) <= 10 \leq h(m) \leq 1 for all mm and, by 2 ) it is also clear that mh(m)m h(m) is a nondecreasing function of mm. The sequence of hh-parameters, however, is usually nonincreasing in mm. To avoid any possible confusion in the sequel, we postulate this property as an additional defining property of a source. Namely, we require 很明显, 0 <= h(m) <= 10 \leq h(m) \leq 1 对于所有 mm ,并且,根据 2)也很明显, mh(m)m h(m) 是 mm 的非递减函数。但是, hh 参数的序列通常在 mm 中是非递增的。为了避免后续可能产生的任何困惑,我们将此特性作为源的额外定义属性来要求。
4) h(m)=1//m log sigma(m)h(m)=1 / m \log \sigma(m) is a nonincreasing function of mm. 4) h(m)=1//m log sigma(m)h(m)=1 / m \log \sigma(m) 是 mm 的非递增函数。
B. Some Lower Bounds on the Compression Ratio 压缩率的一些下限
Consider a compression coding scheme for a source sigma\sigma which employs a block-to-variable ( BVB V ) code book of MM pairs (X_(i),Y_(i))\left(X_{i}, Y_{i}\right) of words over AA, with ℓ(X_(i))=L\ell\left(X_{i}\right)=L for i=i=1,2,cdots,M1,2, \cdots, M. The encoding of an infinitely long string S inS \insigma\sigma by such a code is carried out by first parsing SS into blocks of length LL, and then replacing each block X_(i)X_{i} by the corresponding codeword Y_(i)Y_{i}. It is assumed, of course, that the code book is exhaustive with respect to sigma\sigma and uniquely decipherable [2]. Hence, we must have 考虑一个源码 sigma\sigma 的压缩编码方案,它使用了一个块到变长( BVB V )代码本,其中包括 MM 对 (X_(i),Y_(i))\left(X_{i}, Y_{i}\right) 单词,这些单词来自 AA ,且 ℓ(X_(i))=L\ell\left(X_{i}\right)=L 为 i=i=1,2,cdots,M1,2, \cdots, M 。通过这种编码方式,无限长的字符串 S inS \insigma\sigma 会被编码成由长度为 LL 的块 X_(i)X_{i} 组成,每个块被替换为相应的代码字 Y_(i)Y_{i} 。当然,假设代码本针对 sigma\sigma 是充分的,并且是唯一可解的[2]。因此,我们必须有:
The BVB V compression ratio, rho_(BV)(sigma,M)\rho_{B V}(\sigma, M), of the source sigma\sigma. is defined as the minimax value of rho_(i)\rho_{i}, where the maximization is over all word-pairs of a given code, and the minimization is over the set C_(BV)(sigma,M)C_{B V}(\sigma, M) of all BVB V code books consisting of MM word-pairs. Thus, 源码 sigma\sigma 的 BVB V 压缩比 rho_(BV)(sigma,M)\rho_{B V}(\sigma, M) 定义为 rho_(i)\rho_{i} 的极小极大值,其中最大化是针对给定编码的所有词对,而最小化是针对所有由 MM 个词对组成的 BVB V 编码本集合 C_(BV)(sigma,M)C_{B V}(\sigma, M) 。
{:[{:[rho_(BV)(sigma","M)=min_(C_(BV)(sigma,M))max_(1 <= i <= M){(ℓ(Y_(i)))/(L)}],[ >= (log M)/(L)=(Lh(L))/(L)=h(L)]:}],[^(1)" Throughout this paper, log "x" means the base- "alpha" logarithm of "x.]:}\begin{aligned}
& \begin{aligned}
\rho_{B V}(\sigma, M)= & \min _{C_{B V}(\sigma, M)} \max _{1 \leq i \leq M}\left\{\frac{\ell\left(Y_{i}\right)}{L}\right\} \\
& \geq \frac{\log M}{L}=\frac{L h(L)}{L}=h(L)
\end{aligned} \\
& { }^{1} \text { Throughout this paper, log } x \text { means the base- } \alpha \text { logarithm of } x .
\end{aligned}
For later reference, we record this result in the following lemma. 为了后续参考,我们在下面的引理中记录了这个结果。
Lemma 1: 引理 1:
rho_(BV)(sigma,M) >= h(L)\rho_{B V}(\sigma, M) \geq h(L)
where 哪里
Lh(L)=log ML h(L)=\log M
Now, consider a compression scheme which employs a variable-to-block ( VBV B ) code book of MM word-pairs ( X_(i),Y_(i)X_{i}, Y_{i} ), with ℓ(Y_(i))=L\ell\left(Y_{i}\right)=L for all i=1,2,cdots,Mi=1,2, \cdots, M. In this case, the compression ratio associated with the ii th word-pair is given by 现在,考虑一个使用可变到块(v-b)代码簿的压缩方案,其中包含 n 个单词对,其中所有单词的长度均为 m。在这种情况下,与第 k 个单词对相关的压缩比给出为
and similarly, the VBV B compression ratio rho_(VB)(sigma,M)\rho_{V B}(\sigma, M) of sigma\sigma is defined as the minimax value of rho_(i)\rho_{i} over all word-pairs and over the set C_(VB)(sigma,M)C_{V B}(\sigma, M) of VBV B code books with MM wordpairs. 同样地, VBV B 压缩比 rho_(VB)(sigma,M)\rho_{V B}(\sigma, M) 被定义为 sigma\sigma 上的极值,其中涉及所有单词对以及 C_(VB)(sigma,M)C_{V B}(\sigma, M) 中的 VBV B 个代码本,每个代码本包含 MM 个单词对。
Lemma 2: 引理 2:
rho_(VB)(sigma,M) >= h(L_(M))\rho_{V B}(\sigma, M) \geq h\left(L_{M}\right)
where 哪里
L_(M)=max{ℓ∣M >= sigma(ℓ)}L_{M}=\max \{\ell \mid M \geq \sigma(\ell)\}
Proof: We may assume, without loss of generality, that in every code under consideration 我们可以设无损概括地说,在每个考虑的代码中
Since CC is exhaustive with respect to sigma\sigma, we have 由于 CC 对 sigma\sigma 是广泛的,我们有
M >= sigma(ℓ_(1))M \geq \sigma\left(\ell_{1}\right)
where ℓ_(1)=ℓ(X_(1))\ell_{1}=\ell\left(X_{1}\right); and since CC is uniquely decipherable, we have 在 ℓ_(1)=ℓ(X_(1))\ell_{1}=\ell\left(X_{1}\right) ;由于 CC 是唯一可解码的,我们有
L(C) >= log ML(C) \geq \log M
From the definition of L_(M)L_{M}, inequality (6), and the nondecreasing property of sigma(ℓ)\sigma(\ell), we obtain 从 L_(M)L_{M} 的定义、不等式(6)和 sigma(ℓ)\sigma(\ell) 的非递减性质可以得出
ℓ_(1) <= L_(M)\ell_{1} \leq L_{M}
From (5), (7), and (8), we have 从(5)、(7)和(8)式中,我们有
Since the value of LL in the context of Lemma 1 satisfies the definition of L_(M)L_{M} as given in Lemma 2, it follows that the bounds of both lemmas are essentially the same, despite the basic difference between the respective coding schemes. 鉴于在引理 1 的上下文中 LL 的值满足引理 2 给出的 L_(M)L_{M} 的定义,因此两个引理的界限本质上是相同的,尽管它们各自的编码方案存在基本差异。
By 2), the second defining property of a source, the per-word bounds derived above apply to indefinitely long messages as well, since the whole message may consist of repeated appearances of the same worst case word. 根据第二个定义的源特性,上述推导的逐词界限也适用于无限长的消息,因为整个消息可能由同一个最坏情况词的重复出现组成。
By 4), the nonincreasing property of the hh-parameters, the form of the derived bounds confirms the intuitive expectation that an increase in the size MM of the employed code book causes a decrease in the lower bound on the attainable compression ratio. 根据 4)中 hh 参数的非增性质,衍生界限的形式确认了直观预期,即所采用的代码本尺寸 MM 增加会导致可实现压缩比的下界降低。
C. Performance of the Proposed Algorithm 提议算法的性能
We proceed now to derive an upper bound on the compression ratio attainable by the algorithm of Section II. To this end, we consider the worst case source message of length n-L_(s)n-L_{s}, where nn is the prospective buffer length and L_(s)L_{s} is the maximal word-length. The bound obtained for this case will obviously apply to all messages of length nn - L_(s)L_{s} or greater. 我们现在要推导第二节算法所能达到的最大压缩比。为此,我们考虑长度为 n-L_(s)n-L_{s}
First, we assume that only the hh-parameters of the source under consideration are known to the designer of the encoder. Later, when we discuss the universal performance of the proposed algorithm, we will show that even this restricted aa priori knowledge of the source is actually unessential. 首先,我们假设只有 hh 参数对于编码器的设计者是已知的。当我们讨论所提出算法的整体性能时,我们将证明即使这种受限的 aa 先验知识也是不必要的。
We begin by choosing the buffer length nn to be an integer of the form 我们首先选择缓冲区长度 nn 为以下形式的整数
n=sum_(m=1)^(lambda)malpha^(m)+sum_(m=lambda+1)^(ℓ)m sigma(ℓ)+(ℓ+1)(N_(ℓ+1)+1)n=\sum_{m=1}^{\lambda} m \alpha^{m}+\sum_{m=\lambda+1}^{\ell} m \sigma(\ell)+(\ell+1)\left(N_{\ell+1}+1\right)
lambda=|__ℓh(ℓ)__|\lambda=\lfloor\ell h(\ell)\rfloor, the integer part of log sigma(ℓ)=ℓh(ℓ)\log \sigma(\ell)=\ell h(\ell), and lambda=|__ℓh(ℓ)__|\lambda=\lfloor\ell h(\ell)\rfloor , log sigma(ℓ)=ℓh(ℓ)\log \sigma(\ell)=\ell h(\ell) 的整数部分,和
ℓ=L_(s)-1\ell=L_{s}-1
The specific valuc of the parameter L_(s)L_{s} is left for later determination (see the first remark following the proof of Theorem 1). The reasoning that motivates the given form of nn will become clear from subsequent derivations. 参数 L_(s)L_{s} 的具体值将留待以后确定(见定理 1 证明后的第一条备注)。给定 nn 形式的推导过程将在后续推导中阐明。
Consider a string S in sigma{n-L_(s)}S \in \sigma\left\{n-L_{s}\right\}, and let N(S)N(S) denote the number of words into which SS is parsed by the algorithm during the encoding process. Recalling that each of these words is mapped into a codeword of fixed length L_(c)L_{c} (see (1)), it follows that the compression ratio rho(S)\rho(S) associated with the string SS is given by 考虑一个字符串 S in sigma{n-L_(s)}S \in \sigma\left\{n-L_{s}\right\} ,令 N(S)N(S) 表示在编码过程中算法将其解析为的字词数。回顾每个字词都被映射为固定长度 L_(c)L_{c} 的码字(见(1)),则与字符串 SS 相关的压缩比 rho(S)\rho(S) 为
Hence, the compression ratio rho\rho attainable by the algorithm for a given source sigma\sigma is 因此,算法对于给定的源代码 sigma\sigma 所能实现的压缩率为 rho\rho
rho=(L_(c))/(n-L_(s))N\rho=\frac{L_{c}}{n-L_{s}} N
where 哪里
N=max_(S in sigma{n-L_(s)})N(S)N=\max _{S \in \sigma\left\{n-L_{s}\right\}} N(S)
Let Q in sigma{n-L_(s)}Q \in \sigma\left\{n-L_{s}\right\} be such that N(Q)=NN(Q)=N, and suppose that the algorithm parses QQ into Q=Q_(1)Q_(2)cdotsQ_(N)Q=Q_{1} Q_{2} \cdots Q_{N}. From step 2) of the encoding cycle it follows that if ℓ(Q_(i))=ℓ(Q_(j))\ell\left(Q_{i}\right)=\ell\left(Q_{j}\right)< L_(s)<L_{s}, for some i < j < Ni<j<N, then Q_(i)!=Q_(j)Q_{i} \neq Q_{j}. (Note that when Q_(j)Q_{j} is being determined at the jj th cycle of the encoding process, all of Q_(i)Q_{i} is still stored in the buffer, and since ℓ(Q_(j))\ell\left(Q_{j}\right)< L_(s)<L_{s}, the longest substring in the buffer that precedes Q_(j)Q_{j} and is a prefix of Q_(j)Q_{j} musi be of length ℓ(Q_(j))-1\ell\left(Q_{j}\right)-1.) 让 Q in sigma{n-L_(s)}Q \in \sigma\left\{n-L_{s}\right\} 满足 N(Q)=NN(Q)=N ,假设算法将 QQ 解析为 Q=Q_(1)Q_(2)cdotsQ_(N)Q=Q_{1} Q_{2} \cdots Q_{N} 。根据编码循环的第 2 步可知,如果 ℓ(Q_(i))=ℓ(Q_(j))\ell\left(Q_{i}\right)=\ell\left(Q_{j}\right)< L_(s)<L_{s} ,对于某些 i < j < Ni<j<N ,则 Q_(i)!=Q_(j)Q_{i} \neq Q_{j} 。(注意,当在编码过程的第 jj 个循环中确定 Q_(j)Q_{j} 时,所有 Q_(i)Q_{i} 仍存储在缓冲区中,并且由于 ℓ(Q_(j))\ell\left(Q_{j}\right)< L_(s)<L_{s} ,缓冲区中最长的前导 Q_(j)Q_{j} 的前缀子串长度必须为 ℓ(Q_(j))-1\ell\left(Q_{j}\right)-1 。)
Denoting by K_(m)K_{m} the number of Q_(i),1 <= i <= N-1Q_{i}, 1 \leq i \leq N-1, of length m,1 <= m <= L_(s)m, 1 \leq m \leq L_{s}, we have 设 K_(m)K_{m} 表示 Q_(i),1 <= i <= N-1Q_{i}, 1 \leq i \leq N-1 的数量,长度为 m,1 <= m <= L_(s)m, 1 \leq m \leq L_{s} ,那么
By the above argument, and by property 3) of the source, we have 通过上述论证以及根据源的属性 3),我们有
K_(m) <= sigma(m),quad" for "1 <= m <= ℓ=L_(s)-1K_{m} \leq \sigma(m), \quad \text { for } 1 \leq m \leq \ell=L_{s}-1
Since 自从
n-L_(s)=ℓ(Q_(N))+sum_(m=1)^(ℓ+1)mK_(m)n-L_{s}=\ell\left(Q_{N}\right)+\sum_{m=1}^{\ell+1} m K_{m}
and nn and L_(s)L_{s} are both fixed, it is clear that by overestimating the values of K_(m)K_{m} for 1 <= m <= ℓ1 \leq m \leq \ell at the expense of K_(ℓ+1)K_{\ell+1}, we can only overestimate the value of NN. Therefore, since sigma(m) <= sigma(m+1)\sigma(m) \leq \sigma(m+1) and sigma(m)=alpha^(mh(m)) <= alpha^(m)\sigma(m)=\alpha^{m h(m)} \leq \alpha^{m}, we obtain 和 nn 和 L_(s)L_{s} 都是固定的,很明显,通过高估 1 <= m <= ℓ1 \leq m \leq \ell 的 K_(m)K_{m} 值来牺牲 K_(ℓ+1)K_{\ell+1} ,我们只能高估 NN 的值。因此,既然 sigma(m) <= sigma(m+1)\sigma(m) \leq \sigma(m+1) 和 sigma(m)=alpha^(mh(m)) <= alpha^(m)\sigma(m)=\alpha^{m h(m)} \leq \alpha^{m} ,我们得到
N <= K_(ℓ+1)^(')+sum_(m=1)^(ℓ)K_(m)^(')=N^(')N \leq K_{\ell+1}^{\prime}+\sum_{m=1}^{\ell} K_{m}^{\prime}=N^{\prime}
where 哪里
K_(m)^(')={[alpha^(m)","," for "1 <= m <= lambda=|__ℓh(ℓ)__|],[sigma(ℓ)","," for "lambda < m <= ℓ]:}K_{m}^{\prime}= \begin{cases}\alpha^{m}, & \text { for } 1 \leq m \leq \lambda=\lfloor\ell h(\ell)\rfloor \\ \sigma(\ell), & \text { for } \lambda<m \leq \ell\end{cases}
and 和
K_(ℓ+1)^(')=|~(1)/(ℓ+1)(n-L_(s)-sum_(m=1)^(ℓ)mK_(m)^('))~|.K_{\ell+1}^{\prime}=\left\lceil\frac{1}{\ell+1}\left(n-L_{s}-\sum_{m=1}^{\ell} m K_{m}^{\prime}\right)\right\rceil .
From (14), (15), and (9), we obtain K_(ℓ+1)^(')=N_(ℓ+1)K_{\ell+1}^{\prime}=N_{\ell+1}, and 从(14)、(15)和(9)式中得到 K_(ℓ+1)^(')=N_(ℓ+1)K_{\ell+1}^{\prime}=N_{\ell+1}
Note that despite the rater rudimentary overestimation of NN by N^(')N^{\prime}, the upper bound of (17) is quite tight, since the fact that no source word is longer than L_(s)L_{s} immediately implies rho >= L_(c)//L_(s)\rho \geq L_{c} / L_{s}. 尽管 N^(')N^{\prime} 对 NN 有相当粗糙的过度估计,但式(17)的上界是相当紧的,因为没有源词长于 L_(s)L_{s} 这一事实意味着 rho >= L_(c)//L_(s)\rho \geq L_{c} / L_{s} 。
We can state now the following result. 我们现在可以陈述以下结果。
Theorem 1: If the buffer length nn for a source with known hh-parameters is chosen according to (9), then 定理 1:如果对于已知参数的源,缓冲区长度 nn 按照公式 (9) 选择,那么
and since alpha^(m) <= sigma(ℓ)\alpha^{m} \leq \sigma(\ell), for 1 <= m <= lambda1 \leq m \leq \lambda, we have 并且自从 alpha^(m) <= sigma(ℓ)\alpha^{m} \leq \sigma(\ell) ,为了 1 <= m <= lambda1 \leq m \leq \lambda ,我们有
Substituting this result into (17), we obtain the bound of Theorem 1. 将这个结果代入(17)式,我们就得到定理 1 的界限。
Q.E.D. 证明完毕。
Remarks 评论
The value of epsilon(L_(s))\epsilon\left(L_{s}\right) decreases with L_(s)L_{s} and, consequently, the compression ratio rho\rho approaches the value of h(L_(s)-1)h\left(L_{s}-1\right), the hh-parameter associated with the second largest word-length processed by the encoder. Given any delta > 0\delta>0, one can always find the least integer ℓ_(s)\ell_{s} such that rho-h(ℓ_(s)-1) <= delta\rho-h\left(\ell_{s}-1\right) \leq \delta. The magnitude of the acceptable de- 值 epsilon(L_(s))\epsilon\left(L_{s}\right) 随 L_(s)L_{s} 而降低,因此压缩率 rho\rho 接近 h(L_(s)-1)h\left(L_{s}-1\right) 的值,这是编码器处理的第二大单词长度相关的 hh 参数。对于任何给定的 delta > 0\delta>0 ,总可以找到最小整数 ℓ_(s)\ell_{s} 使得 rho-h(ℓ_(s)-1) <= delta\rho-h\left(\ell_{s}-1\right) \leq \delta 成立。可接受的解压缩幅度的大小
viation delta\delta determines the operational range of L_(s)L_{s}, namely, L_(s) >= ℓ_(s)L_{s} \geq \ell_{s}. 航空 delta\delta 决定了 L_(s)L_{s} 的操作范围,即 L_(s) >= ℓ_(s)L_{s} \geq \ell_{s} 。
Since our code maps source words of variable length at most L_(s)L_{s} into codewords of fixed length L_(c)L_{c}, we adopt as a reference for comparison the best VBV B code C_(VB)C_{V B} discussed in Subsection III-B. The counterpart of our block-length L_(c)L_{c} is L(C)L(C) of (5), and by (7) we can write 由于我们的代码将长度最长为 L_(s)L_{s} 的源词映射到长度固定为 L_(c)L_{c} 的码字,我们采用第 III-B 小节中讨论的最佳 VBV B 码 C_(VB)C_{V B} 作为比较参考。我们的块长 L_(c)L_{c} 的对应项是(5)中的 L(C)L(C) ,根据(7)我们可以写为
L_(c)~~log M~~L_(M)h(L_(M))L_{c} \approx \log M \approx L_{M} h\left(L_{M}\right)
where MM is the code book size and h(L_(M))h\left(L_{M}\right) is the lower bound (see Lemma 2) on the compression ratio rho_(VB)\rho_{V B} attainable by C_(VB)C_{V B}. Since for sufficiently large L_(s)L_{s}, we have also 其中 MM 是码书大小, h(L_(M))h\left(L_{M}\right) 是可实现的压缩比下限 (见引理 2), rho_(VB)\rho_{V B} 是可实现的压缩比. 当 L_(s)L_{s} 足够大时, 还有
it follows that rho~~rho_(VB)\rho \approx \rho_{V B}. 故此, rho~~rho_(VB)\rho \approx \rho_{V B} 。
We turn now to investigate the universal performance of the proposed algorithm, i.e., the performance when no a priori knowledge of the source to be compressed is available. 我们现在转向研究所提出算法的普遍性能,即在没有预先了解要压缩的源信息的情况下的性能。
Given delta_(1)\delta_{1} and h_(1)h_{1} such that 0 < delta_(1) < h_(1) < 10<\delta_{1}<h_{1}<1, let ℓ_(1)\ell_{1} be the least positive integer satisfying 给定 delta_(1)\delta_{1} 和 h_(1)h_{1} 满足 0 < delta_(1) < h_(1) < 10<\delta_{1}<h_{1}<1 条件,求 ℓ_(1)\ell_{1} 的最小正整数值
and let K=alpha^(ℓ_(1)h_(1))K=\alpha^{\ell_{1} h_{1}} and lambda_(1)=|__ℓ_(1)h_(1)__|\lambda_{1}=\left\lfloor\ell_{1} h_{1}\right\rfloor. Consider the encoder E_(1)E_{1} which employs a buffer of length n_(1)=n(h_(1),ℓ_(1))n_{1}=n\left(h_{1}, \ell_{1}\right), obtained from (9) and (10) by setting ℓ=ℓ_(1),lambda=lambda_(1)\ell=\ell_{1}, \lambda=\lambda_{1}, and sigma(ℓ)\sigma(\ell)=K=K, and whose maximal word-length L_(s)L_{s} is equal to ℓ_(1)+\ell_{1}+ 1. 并让 K=alpha^(ℓ_(1)h_(1))K=\alpha^{\ell_{1} h_{1}} 和 lambda_(1)=|__ℓ_(1)h_(1)__|\lambda_{1}=\left\lfloor\ell_{1} h_{1}\right\rfloor 。考虑采用长度为 n_(1)=n(h_(1),ℓ_(1))n_{1}=n\left(h_{1}, \ell_{1}\right) 的缓冲区的编码器 E_(1)E_{1} ,该缓冲区由(9)和(10)设置 ℓ=ℓ_(1),lambda=lambda_(1)\ell=\ell_{1}, \lambda=\lambda_{1} 和 sigma(ℓ)\sigma(\ell)=K=K 获得,其最大字长 L_(s)L_{s} 等于 ℓ_(1)+\ell_{1}+ 1。
It follows from Theorem 1 that if this encoder is applied to a source sigma_(1)\sigma_{1} such that h_(sigma_(1))(ℓ_(1))=h_(1)h_{\sigma_{1}}\left(\ell_{1}\right)=h_{1}, then the resulting compression ratio rho_(1)(sigma_(1))\rho_{1}\left(\sigma_{1}\right) satisfies 根据定理 1,如果对源 sigma_(1)\sigma_{1} 应用此编码器,使得 h_(sigma_(1))(ℓ_(1))=h_(1)h_{\sigma_{1}}\left(\ell_{1}\right)=h_{1} ,则所得压缩比 rho_(1)(sigma_(1))\rho_{1}\left(\sigma_{1}\right) 满足
Suppose now that h_(1)h_{1} and delta_(1)\delta_{1} were chosen to fit the prescribed upper value rho_(1)=h_(1)+delta_(1)\rho_{1}=h_{1}+\delta_{1} of a prospective compression ratio range ( rho_(0),rho_(1)\rho_{0}, \rho_{1} ) with 假设现在 h_(1)h_{1} 和 delta_(1)\delta_{1} 被选择来适应预期压缩比范围 ( rho_(0),rho_(1)\rho_{0}, \rho_{1} ) 的规定上限 rho_(1)=h_(1)+delta_(1)\rho_{1}=h_{1}+\delta_{1}
where RR is an adjustment parameter to be determined later. As shown above, the encoder E_(1)E_{1} is then matched exactly to the upper value rho_(1)\rho_{1} of the prospective compression range. In order to accommodate the whole given range, we propose to employ a slightly larger encoder E_(0)E_{0} whose buffer is of length n_(0)=n_(1)-ℓ_(1)+ℓ_(0)n_{0}=n_{1}-\ell_{1}+\ell_{0}, where n_(1)n_{1} and ℓ_(1)\ell_{1} are the parameters of E_(1)E_{1}, and where ℓ_(0)\ell_{0} is an integer, greater than ℓ_(1)\ell_{1}, for which the solution h_(0)h_{0} of the equation 其中 RR 是稍后确定的调整参数。如上所示,编码器 E_(1)E_{1} 精确地匹配于预期压缩范围的上限值 rho_(1)\rho_{1} 。为了容纳整个给定范围,我们建议使用略大的编码器 E_(0)E_{0} ,其缓冲区长度为 n_(0)=n_(1)-ℓ_(1)+ℓ_(0)n_{0}=n_{1}-\ell_{1}+\ell_{0} ,其中 n_(1)n_{1} 和 ℓ_(1)\ell_{1} 是 E_(1)E_{1} 的参数,而 ℓ_(0)\ell_{0} 是一个大于 ℓ_(1)\ell_{1} 的整数,其方程的解为 h_(0)h_{0} 。
Noting that n_(0)-ℓ_(0)=n_(1)-ℓ_(1)n_{0}-\ell_{0}=n_{1}-\ell_{1} is fixed, it is clear from (9) and (10) that as ℓ_(0)\ell_{0} increases h_(0)h_{0} decreases; also, (21) and the fact that ℓ_(0) > ℓ_(1)\ell_{0}>\ell_{1} imply that rho_(0)-epsilon(ℓ_(0)+1) > rho_(0)-epsilon(ℓ_(1)+1)\rho_{0}-\epsilon\left(\ell_{0}+1\right)>\rho_{0}-\epsilon\left(\ell_{1}+1\right)>= rho_(0)-delta_(1) > 0\geq \rho_{0}-\delta_{1}>0. Hence, there exist h_(0) > 0h_{0}>0 and ℓ_(0) > ℓ_(1)\ell_{0}>\ell_{1} that satisfy both (22) and (23). 注意到 n_(0)-ℓ_(0)=n_(1)-ℓ_(1)n_{0}-\ell_{0}=n_{1}-\ell_{1} 是固定的,从(9)和(10)可以清楚地看出,随着 ℓ_(0)\ell_{0} 的增加, h_(0)h_{0} 会减少;同时,(21)和 ℓ_(0) > ℓ_(1)\ell_{0}>\ell_{1} 的事实也意味着 rho_(0)-epsilon(ℓ_(0)+1) > rho_(0)-epsilon(ℓ_(1)+1)\rho_{0}-\epsilon\left(\ell_{0}+1\right)>\rho_{0}-\epsilon\left(\ell_{1}+1\right)>= rho_(0)-delta_(1) > 0\geq \rho_{0}-\delta_{1}>0 。因此,存在 h_(0) > 0h_{0}>0 和 ℓ_(0) > ℓ_(1)\ell_{0}>\ell_{1} 同时满足(22)和(23)。
In analogy with (20), it is also clear that if E_(0)E_{0} is applied to a source sigma_(0)\sigma_{0} such that h_(sigma_(0))(ℓ_(0))=h_(0)h_{\sigma_{0}}\left(\ell_{0}\right)=h_{0}, then the resulting compression ratio rho_(0)(sigma_(0))\rho_{0}\left(\sigma_{0}\right) satisfies 根据(20)的类比,如果对源 sigma_(0)\sigma_{0} 应用 E_(0)E_{0} ,使得 h_(sigma_(0))(ℓ_(0))=h_(0)h_{\sigma_{0}}\left(\ell_{0}\right)=h_{0} ,则所得的压缩比 rho_(0)(sigma_(0))\rho_{0}\left(\sigma_{0}\right) 满足
where delta_(0)=epsilon(ℓ_(0)+1)\delta_{0}=\epsilon\left(\ell_{0}+1\right). 在哪里 delta_(0)=epsilon(ℓ_(0)+1)\delta_{0}=\epsilon\left(\ell_{0}+1\right) 。
From (23) and (24), we have rho_(0)(sigma_(0)) <= rho_(0)-epsilon(ℓ_(0)+1)+delta_(0)\rho_{0}\left(\sigma_{0}\right) \leq \rho_{0}-\epsilon\left(\ell_{0}+1\right)+\delta_{0}=rho_(0)=\rho_{0}, and 从(23)和(24)中,我们有 rho_(0)(sigma_(0)) <= rho_(0)-epsilon(ℓ_(0)+1)+delta_(0)\rho_{0}\left(\sigma_{0}\right) \leq \rho_{0}-\epsilon\left(\ell_{0}+1\right)+\delta_{0}=rho_(0)=\rho_{0}
Hence, h_(0)h_{0} can be made arbitrarily close to rho_(0)\rho_{0}, and consequently, the encoder E_(0)E_{0} matches as closely as desired the lower end rho_(0)\rho_{0} of the given compression range. 因此, h_(0)h_{0} 可以任意接近 rho_(0)\rho_{0} ,因此,编码器 E_(0)E_{0} 可以匹配给定压缩范围的下端 rho_(0)\rho_{0} 。
Theorem 2: Let sigma\sigma be a source for which a matched encoder EE achieves a compression ratio rho(sigma)\rho(\sigma) within the range (rho_(0),rho_(1))\left(\rho_{0}, \rho_{1}\right). Then the compression ratio rho_(0)(sigma)\rho_{0}(\sigma), achieved for sigma\sigma by E_(0)E_{0}, satisfies 定理 2:设 sigma\sigma 为一个源,其匹配编码器 EE 可在范围 (rho_(0),rho_(1))\left(\rho_{0}, \rho_{1}\right) 内实现压缩比 rho(sigma)\rho(\sigma) 。那么由 E_(0)E_{0} 为 sigma\sigma 实现的压缩比 rho_(0)(sigma)\rho_{0}(\sigma) 满足
(Typically, (h_(1)//h_(0)) > (1//1-h_(0))\left(h_{1} / h_{0}\right)>\left(1 / 1-h_{0}\right) and d=(h_(1)//h_(0))d=\left(h_{1} / h_{0}\right).) (通常情况下, (h_(1)//h_(0)) > (1//1-h_(0))\left(h_{1} / h_{0}\right)>\left(1 / 1-h_{0}\right) 和 d=(h_(1)//h_(0))d=\left(h_{1} / h_{0}\right) 。)
Proof: To prove the theorem we shall consider the obviously worst case of applying E_(0)E_{0} to the source sigma_(1)\sigma_{1} whose matched encoder E_(1)E_{1} realizes rho_(1)\rho_{1}. 证明:为了证明该定理,我们将考虑应用 E_(0)E_{0} 到其编码器 E_(1)E_{1} 实现 rho_(1)\rho_{1} 的源 sigma_(1)\sigma_{1} 的显而易见的最坏情况。
Let rho_(0)(sigma_(1))\rho_{0}\left(\sigma_{1}\right) denote the compression ratio achievable by E_(0)E_{0} when applied to sigma_(1)\sigma_{1}. According to (12), we have 令 rho_(0)(sigma_(1))\rho_{0}\left(\sigma_{1}\right) 表示在应用于 sigma_(1)\sigma_{1} 时 E_(0)E_{0} 可达到的压缩率。根据(12)式,我们有
L_(ci)=1+|~log(ℓ_(i)+1)~|+|~log(n_(i)-ℓ_(i)-1)~|,quad i in{0,1}L_{c i}=1+\left\lceil\log \left(\ell_{i}+1\right)\right\rceil+\left\lceil\log \left(n_{i}-\ell_{i}-1\right)\right\rceil, \quad i \in\{0,1\}
and N_(i)(sigma_(j)),i,j in{0,1}N_{i}\left(\sigma_{j}\right), i, j \in\{0,1\}, is the maximum number of words into which a string S insigma_(j){n_(i)-ℓ_(i)-1}S \in \sigma_{j}\left\{n_{i}-\ell_{i}-1\right\} is parsed by E_(i)E_{i}. 和 N_(i)(sigma_(j)),i,j in{0,1}N_{i}\left(\sigma_{j}\right), i, j \in\{0,1\} ,是将字符串 S insigma_(j){n_(i)-ℓ_(i)-1}S \in \sigma_{j}\left\{n_{i}-\ell_{i}-1\right\} 通过 E_(i)E_{i} 解析的最大单词数。
Since n_(0)-ℓ_(0)=n_(1)-ℓ_(1)n_{0}-\ell_{0}=n_{1}-\ell_{1} and ℓ_(0) > ℓ_(1)\ell_{0}>\ell_{1}, it is easy to verify that ^(2)N_(0)(sigma_(1)) <= N_(1)(sigma_(1)){ }^{2} N_{0}\left(\sigma_{1}\right) \leq N_{1}\left(\sigma_{1}\right). Also by (16), 自从 n_(0)-ℓ_(0)=n_(1)-ℓ_(1)n_{0}-\ell_{0}=n_{1}-\ell_{1} 和 ℓ_(0) > ℓ_(1)\ell_{0}>\ell_{1} ,很容易验证 ^(2)N_(0)(sigma_(1)) <= N_(1)(sigma_(1)){ }^{2} N_{0}\left(\sigma_{1}\right) \leq N_{1}\left(\sigma_{1}\right) 。同时由式(16)可得,
Now, either k(1-h_(0)) < 1k\left(1-h_{0}\right)<1, or else the exponent on the right side of (26) must be nonnegative and thus k <= (h_(1)//:}k \leq\left(h_{1} /\right.h_(0)h_{0} ). In either case, 现在,要么 k(1-h_(0)) < 1k\left(1-h_{0}\right)<1 ,否则(26)右侧的指数必须是非负的,因此 k <= (h_(1)//:}k \leq\left(h_{1} /\right.h_(0)h_{0} ) 。在任何一种情况下,
k <= max{(h_(1))/(h_(0)),(1)/(1-h_(0))}=dk \leq \max \left\{\frac{h_{1}}{h_{0}}, \frac{1}{1-h_{0}}\right\}=d
Q.E.D. 证明完毕。
Theorems 1 and 2 demonstrate the efficiency and universality of the proposed algorithm. They show that an encoder designed to operate over a prescribed compression range performs, practically, as well as one designed to match a specific compression ratio within the given range. 定理 1 和 2 证明了所提出的算法的效率和广泛适用性。它们表明,设计用于指定压缩范围的编码器在实际应用中的表现与专门针对给定范围内的特定压缩比设计的编码器相当。
Moreover, for given complexity i.e., a given codeword length, the compression efficiency is comparable to that of an optimal variable-to-block code book designed to match a given source. 此外,对于给定的复杂性,即给定的码字长度,压缩效率与设计用于匹配给定源的最佳可变到块码本相当。
REFERENCES 参考文献
[1] D. A. Huffman, “A method for the construction of minimum-redundancy codes,” Proc. IRE, vol. 40, pp. 1098-1101, 1952. [1] 赫夫曼, "最小冗余码构建方法",《IRE 杂志》,第 40 卷,第 1098-1101 页,1952 年。
[2] R. M. Karp, “Minimum-redundancy coding for the discrete noiseless channel,” IRE Trans. Inform. Theory, vol. IT-17, pp. 27-38, Jan. 1961. [2] R. M. Karp, "最小冗余编码离散无噪信道," IRE 信息理论会议论文集, 第 IT-17 卷, 第 27-38 页, 1961 年 1 月.
[3] B. F. Varn, “Optimal variable length codes,” Inform. Contr., vol. 19, pp. 289-301, 1971. [3] B. F. Varn, "可变长度码的最优编码,"《信息控制》,第 19 卷,289-301 页,1971 年。
[4] Y. Perl, M. R. Gary, and S. Even, “Efficient generation of optimal prefix code: Equiprobable words using unequal cost letters,” J. ACM, vol. 22, pp. 202-214, April 1975. [4] 玉, M. R. Gary, 和 S. Even, "高效生成最佳前缀码: 使用不等成本字母的等概率单词," J. ACM, 第 22 卷, 第 202-214 页, 1975 年 4 月。
[5] A. Lempel, S. Even, and M. Cohn, “An algorithm for optimal prefix parsing of a noiseless and memoryless channel,” IEEE Trans. Inform. Theory, vol. IT-19, pp. 208-214, March 1973. [5] A. 勒姆泊尔, S. 伊文, 以及 M. 科恩, "一种无噪音和无记忆信道最优前缀解析算法", IEEE 信息理论学报, 第 19 卷, 第 208-214 页, 1973 年 3 月。
[6] F. Jelinek and K. S. Schneider, “On variable length to block coding,” IEEE Trans. Inform. Theory, vol. IT-18, pp. 765-774, Nov. 1972. [6] F. Jelinek 和 K. S. Schneider, "关于变长编码到定长编码的转换," IEEE 信息理论学报, 卷 IT-18, 页 765-774, 1972 年 11 月.
[7] R. G. Gallager, Information Theory and Reliable Communication. New York: Wiley, 1968. [7] R. G. 高勒,信息论与可靠通信。纽约:威利,1968。
[8] J. Ziv, “Coding of sources with unknown statistics—Part I; Probability of encoding error,” IEEE Trans. Inform. Theory, vol. IT-18, pp. 384-394, May 1972. [8] J. Ziv, "未知统计源编码—第一部分;编码误差概率," IEEE 信息理论交易, vol. IT-18, 第 384-394 页, 1972 年 5 月.
[9] L. D. Davisson, “Universal noiseless coding,” IEEE Trans. Inform. Theory, vol. IT-19, pp. 783-795, Nov. 1973. [9] L. D. Davisson, "通用无噪编码," IEEE 信息理论汇刊, vol. IT-19, pp. 783-795, 1973 年 11 月.
[10] A. Lempel and J. Ziv, “On the complexity of finite sequences,” IEEE Trans. Inform. Theory, vol. IT-22, pp. 75-81, Jan. 1976. [10] A. 伦佩尔和 J. 齐夫,"论有限序列的复杂性,"IEEE 信息论交易杂志,第 IT-22 卷,第 75-81 页,1976 年 1 月。
[11] B. M. Fitingof, “Optimal coding in the case of unknown and changing message statistics,” Prob. Inform. Transm., vol. 2, pp. 3-11, 1966. [11] B. M. 费廷诺夫, "在未知和变化的消息统计情况下的最优编码,"《信息传输问题》, 第 2 卷, 第 3-11 页, 1966 年。
Manuscript received June 23, 1975; revised July 6, 1976. Paper previously presented at the IEEE International Symposium on Information Theory, Ronneby, Sweden, June 21-24, 1976. 手稿于 1975 年 6 月 23 日收到;1976 年 7 月 6 日修订。该论文曾在 1976 年 6 月 21 日至 24 日于瑞典隆内比举行的 IEEE 国际信息理论研讨会上进行过演讲。
J. Ziv was with the Department of Electrical Engineering, TechnionIsrael Institute of Technology, Haifa, Israel. He is now with the Bell Telephone Laboratories, Murray Hill, NJ 07974. 齐郑六治已与以色列海法技术学院电气工程系工作,现任职于美国新泽西州默里山贝尔电话实验室。
A. Lempel was with the Department of Electrical Engineering, Tech-nion-Israel Institute of Technology, Haifa, Israel. He is now with the Sperry Research Center, Sudbury, MA 01776. 列佩尔先生曾在以色列海法技术学院电气工程系工作,目前他在马萨诸塞州苏德伯里斯皮里研究中心任职。
2 The assertion here is analogous to that of [10[10, theorem 1]. 这里的断言类似于[ [10[10 ,定理 1]。