Scalable On-Chip Optoelectronic Ising Machine Utilizing Thin-Film Lithium Niobate Photonics
ADDITION/CORRECTION. This article has been corrected. View the notice.
Article


  • Zhenhua Li
    State Key Laboratory of Optoelectronic Materials and Technologies, School of Electronics and Information Technology, Sun Yat-Sen University, Guangzhou 510006, China
  • Ranfeng Gan
    State Key Laboratory of Optoelectronic Materials and Technologies, School of Electronics and Information Technology, Sun Yat-Sen University, Guangzhou 510006, China
  • Zihao Chen
    State Key Laboratory of Optoelectronic Materials and Technologies, School of Electronics and Information Technology, Sun Yat-Sen University, Guangzhou 510006, China
  • Zhaoang Deng
    State Key Laboratory of Optoelectronic Materials and Technologies, School of Electronics and Information Technology, Sun Yat-Sen University, Guangzhou 510006, China
  • Ran Gao
    School of Information and Electronics, Beijing Institute of Technology, Beijing 100081, China
  • Kaixuan Chen
    Guangdong Provincial Key Laboratory of Optical Information Materials and Technology, South China Academy of Advanced Optoelectronics, South China Normal University, Guangzhou 510006, China
  • Changjian Guo*
    Guangdong Provincial Key Laboratory of Optical Information Materials and Technology, South China Academy of Advanced Optoelectronics, South China Normal University, Guangzhou 510006, China
    *Email: changjian.guo@coer-scnu.org
  • Yanfeng Zhang
    State Key Laboratory of Optoelectronic Materials and Technologies, School of Electronics and Information Technology, Sun Yat-Sen University, Guangzhou 510006, China
    Hefei National Laboratory, Hefei 230088, China
  • Liu Liu
    State Key Laboratory of Extreme Photonics and Instrumentation, College of Optical Science and Engineering, International Research Center for Advanced Photonics, Zhejiang University, Hangzhou 310058, China
    Jiaxing Key Laboratory of Photonic Sensing & Intelligent Imaging, Intelligent Optics & Photonics Research Center, Jiaxing Research Institute, Zhejiang University, Jiaxing 314000, China
  • Siyuan Yu
    State Key Laboratory of Optoelectronic Materials and Technologies, School of Electronics and Information Technology, Sun Yat-Sen University, Guangzhou 510006, China
  • Jie Liu*
    State Key Laboratory of Optoelectronic Materials and Technologies, School of Electronics and Information Technology, Sun Yat-Sen University, Guangzhou 510006, China
    *Email: liujie47@mail.sysu.edu.cn

ACS Photonics

Cite this: ACS Photonics 2024, 11, 4, 1703–1714
https://doi.org/10.1021/acsphotonics.4c00003
Published March 14, 2024
Copyright © 2024 American Chemical Society

Abstract


The Ising machine (IM) has emerged as a promising tool for tackling nondeterministic polynomial-time hard combinatorial optimization problems in real-world applications. Among various types of IMs, optoelectronic IMs based on electro-optical (EO) modulators stand out as an impressive platform for Ising computations. They offer a simple and stable architecture, with the EO modulator providing a natural inline nonlinear transfer function for the Ising model. However, integrated optoelectronic IMs have not been demonstrated until now, and exploring large-scale computations within the constraints of digital hardware resources remains an open challenge for these systems. In this paper, an integrated optoelectronic IM based on a thin-film lithium niobate (TFLN) photonic chip is presented, in conjunction with a sparse matrix–vector multiplication algorithm embedded in a field-programmable gate array that optimizes hardware resource utilization and minimizes computational latency. This setup allows us to solve multiple types of MAX-CUT problems with up to 2048 spins and achieve a remarkably low iteration latency of 1.78 μs. To further address the constraints posed by digital devices when tackling larger-scale Ising problems, we extend the application of the TFLN chip to yet another new scheme in which the single, compact on-chip modulator concurrently performs operations of linear multiplication and nonlinear transformation. This scheme demonstrates the capability to address large-scale MAX-CUT problems involving up to 16,384 spins, which, to the best of our knowledge, are the largest-scale problems solved on an on-chip IM, highlighting its potential to overcome digital limitations. The TFLN-based optoelectronic IMs provide a compact solution with high scalability for potentially practical applications in addressing complex combinatorial optimization problems.


Introduction


Combinatorial optimization problems (1,2) have a wide range of practical applications, including circuit design, (3) drug discovery, (4) machine learning, (5) and communications. (6) Unfortunately, these problems often belong to the class of nondeterministic polynomial-time hard (NP-hard) problems, which are computationally challenging to solve at large scales using traditional algorithms on digital computers. (7) Therefore, there is a pressing need for novel computing architectures that address NP-hard problems efficiently.
Recently, the Ising machine (IM) has emerged as a highly effective solver for NP-hard problems. (8) This development stems from the concept that combinatorial optimization problems can be mapped onto ground-state search problems of the Ising model, which can be implemented using artificial spin networks in the form of both classical and quantum physical systems. (9) Over the past few years, there has been extensive exploration and research on IMs, and various IM architectures have been developed, including those based on time-multiplexed optical parametric oscillators (OPO)/optoelectronic parametric oscillators (OEPO), (8,10−13) spatial light modulators (SLMs), (14) hybrid optoelectronics, (15) chip-scale photonic integrated circuits, (16,17) and superconducting quantum annealers. (18,19)
Table 1 summarizes several recently proposed IMs and their key performance metrics, indicating that IMs offer distinct advantages depending on their underlying implementation schemes. IMs based on SLMs offer the possibility of achieving massive all-optical spin couplings, enabling large-scale computation involving up to 75,000 Ising spins. (14) The D-Wave system, utilizing quantum annealing (QA) approaches, excels in time-to-solution for some kinds of hard problems owing to its quantum tunneling nature, (20) particularly when compared to classical IM schemes. (9) In addition, IMs employing OPO/OEPO have shown significant potential in solving MAX-CUT problems with a considerable number of spins while maintaining low computational latency. (12,13) In contrast to these IM architectures, which rely on discrete bulk modules and devices, integrated schemes such as the Mach–Zehnder interferometer (MZI) mesh (16) have been used to demonstrate linear spin coupling of the Ising model, showing advantages in energy efficiency and computing density.
Table 1. Key Parameters for Diverse Ising Machines (a)

scheme | graph scale | iteration latency (s/iteration) | energy efficiency | computing density (MAC/s/mm^2)
SLM (14) | 75,076 | ∼1 × 10^-2 | | DD
OPO (12) | 100,512 | 2.47 × 10^-5 | 17 pJ/MAC | DD
OEPO (13) | 25,600 | 2.44 × 10^-7 | | DD
OE (15) | 100 | 1 | | DD
QA (21) | >5000 | depended on annealing time | chip 150 pW + cooler 25 kW |
MZI (16) | 64 | 3 × 10^-9 | 0.28 pJ/MAC | 3.4 × 10^10
OE in this work (predicted) | 16,384 | 1.78 × 10^-6 | 42.6 pJ/MAC (2.4 pJ/MAC) | 2 × 10^9 (1.4 × 10^10)

(a) SLM: spatial light modulator, QA: quantum annealing, O(OE)PO: optical (optoelectronic) parametric oscillator, MR: micro ring, MZI: Mach–Zehnder interferometer, OE: optoelectronic, MAC: multiply accumulate (including 1 multiplication and 1 accumulation), DD: discrete devices.
Despite this significant progress, these IM schemes have disadvantages for practical applications: high computational latency (around tens or hundreds of milliseconds per iteration for the SLM-based IMs (14)), low energy efficiency (high cooling power in the D-Wave systems (20,21)), low computing density for IMs based on discrete bulk devices, (12−15) relatively low spin scalability given the complexity and imperfections associated with fabricating large-scale spatially multiplexed devices, (16,18) and low robustness induced by harsh operating conditions. (12,20,21) It is therefore imperative to propose solutions that perform well in all of these aspects to accelerate the practical application of IMs.
With its simple and stable architecture, the hybrid optoelectronic IM is a promising solution for realizing high computing density through device integration, although reported demonstrations (5,15,22,23) still rely on discrete bulk devices, including electro-optic (EO) modulators, photodetectors, and field-programmable gate arrays (FPGAs) in conjunction with a digital-to-analog converter (DAC) and an analog-to-digital converter (ADC). In particular, the high-speed EO modulator offers a natural inline nonlinear transfer function for the Ising model, which allows optoelectronic IMs to bypass the additional simulated nonlinearities in the digital domain (16) and eliminates the need for the high-power pump light source required when optical nonlinearities are used. (24−26) Meanwhile, the linear matrix–vector multiplication (MVM) required for the Ising calculation is performed in digital hardware (e.g., an FPGA) in currently proposed optoelectronic IMs. This strategy helps reduce the computational latency in time-division multiplexed (TDM) feedback schemes of the IMs. (15,27) Nevertheless, practical implementations are constrained by limited digital hardware resources, driven by factors such as power consumption and cost. These constraints, in turn, limit the scalability of MVMs and thus of Ising computations. Therefore, it is desirable to explore optoelectronic IMs for large-scale computations within constrained digital hardware resources, especially using integrated on-chip devices.
In this paper, an optoelectronic IM is experimentally demonstrated using a thin-film lithium niobate (TFLN) modulator hybrid-integrated with an InGaAs/InP photodetector. To achieve efficient hardware resource utilization and low computational latency, a sparse matrix–vector multiplication (SpMV) algorithm embedded in an FPGA is employed to realize feedback signal calculation and data scheduling for the photonic integrated devices. The optoelectronic IM successfully tackles MAX-CUT tasks for sparse graphs with 2048 spins, and each iteration takes no more than 1.78 μs for these sparse graphs. Furthermore, by coupling an optical signal that carries the coupling-matrix information into the TFLN devices in place of continuous-wave light, the single, compact on-chip TFLN modulator not only provides a natural nonlinear transfer function but also performs the multiplication required for the linear MVM. This supports demonstrations of large-scale MAX-CUT problems with 16,384 spins, which, to the best of our knowledge, is the largest problem scale solved on an on-chip Ising machine, and indicates that the computing scale of the optoelectronic IM can be further expanded even with limited digital hardware resources. Moreover, the computing latency and power consumption of the chip-scale optoelectronic IM could be further reduced by incorporating higher-speed and lower-driving-voltage optical modulation devices. The demonstrated and predicted parameters of the optoelectronic IM are listed in Table 1 for reference.
In addition to the high scalability, low computational latency, and potentially high energy efficiency, optoelectronic IM utilizing high-speed electro-optic modulation devices offers a range of distinct advantages that are well-suited to the specific characteristics of Ising computations. Different from conventional artificial intelligence (AI) computations, in which the training process and the inference process are typically separated, Ising computations align the evolution of spins with the pursuit of the ground state of the Ising Hamiltonian. (28) In the former AI scenario, the training process can be conducted offline, without stringent demands for the speed of parameter updates, such as weight adjustments. (29) Conversely, the latter Ising scenario typically requires real-time updates and feedback of spin amplitudes to minimize computational latency, for which high-speed EO modulation devices readily present obvious advantages. Moreover, the coupling edges among spins within the Ising model exhibit variations depending on the specific combinatorial optimization problem types. (9) Therefore, the coupling matrix formed by the spin coupling coefficients displays diversity, including instances where certain matrices are nonunitary or sparse. This increases mapping complexities from the coupling matrix to certain on-chip architectures, such as the MZI mesh. (30) By contrast, the time-division multiplexing (TDM) scheme based on high-speed on-chip electro-optic modulators offers enhanced flexibility in mapping these coupling matrices. This flexibility, in turn, facilitates compatibility with the solution of a wide range of Ising problems. Therefore, optoelectronic IMs based on high-speed on-chip modulators provide a promising solution for scalable, compatible, and low-latency Ising computations.
The paper is organized as follows. Section Principles of Optoelectronic IM provides a brief model of the optoelectronic IM and introduces the two proposed schemes. Section Experiments and Results describes the experimental implementation in four parts: the characterization of the TFLN chip, the demonstrations of Scheme I and Scheme II, and a comparison of the key metrics of the two schemes. Finally, our conclusions, together with further discussion and perspectives on integrated optoelectronic IMs, are presented in Section Conclusions and Discussion.

Principles of Optoelectronic IM

As described in previous work, (15) by leveraging the nonlinear transfer function of the Mach–Zehnder modulator (MZM) incorporated in the optoelectronic IM scheme (refer to Figure 1), the time evolution of the $i$th time-discrete spin $x_i(k)$ in the Ising spin network during iteration step $k$ can be described as
$$x_i(k) = \cos^2\!\left(f_i(k) - \frac{\pi}{4} + \eta_i(k)\right) - \frac{1}{2} = \frac{1}{2}\sin\!\left(2f_i(k) + 2\eta_i(k)\right) \tag{1}$$
where $\eta_i(k)$ represents Gaussian white noise and $\cos^2(\cdot)$ follows the EO transfer function of the MZM. (31) The feedback term $f_i$ can be expressed as
$$f_i(k+1) = \alpha x_i(k) + \beta \sum_{j \neq i} J_{ij}\, x_j(k) \tag{2}$$
where $\alpha$ and $\beta$ denote the strength of feedback and coupling during each iteration, respectively. $J_{ij}$ is the coupling coefficient between the spins $x_i$ and $x_j$ and is related to the Ising Hamiltonian $H = -\sum_{i \neq j} J_{ij}\sigma_i\sigma_j$, in which $\sigma_i = \mathrm{sign}(x_i) \in \{-1, 1\}$ is the sign of the corresponding spin's amplitude. Driven by the dynamics of the Hamiltonian landscape, after a sufficient number of iterations of eqs 1 and 2, the coupled spin configuration converges toward states with lower Hamiltonian until it reaches the ground state with the lowest Hamiltonian. (28) For MAX-CUT problems, the cut value $C$ related to the Hamiltonian can be written as $C = -\frac{1}{2}\left(\sum_{i \neq j} J_{ij} + H\right)$. (7,32) The calculation of $f_i(k+1)$ can be mapped into an MVM
$$\mathbf{f}(k+1) = W\mathbf{x}(k) = (\alpha I + \beta J)\,\mathbf{x}(k) \tag{3}$$
where $\mathbf{f}(k+1)$ and $\mathbf{x}(k)$ are the $N \times 1$ vectors whose $i$th elements are $f_i(k+1)$ and $x_i(k)$ in eq 2, respectively. $W = \alpha I + \beta J = [\omega_{ij}]$ is the gain and coupling matrix, where $I$ is the identity matrix and $J$ is the $N \times N$ matrix whose element in the $i$th row and $j$th column is $J_{ij}$.
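The iteration defined by eqs 1 to 3 can be sketched in a few lines of plain Python. This is only an illustrative toy, not the authors' implementation: the function names, the 8-spin ring graph, the noise level, and the values of α and β are all assumptions chosen for the example, and the edges are given the antiferromagnetic coupling $J_{ij} = -1$, a common MAX-CUT convention that this section does not state explicitly.

```python
import math
import random


def build_ring_couplings(n):
    """Toy graph (assumption): a ring with J[i][j] = -1 between neighbours."""
    J = [[0.0] * n for _ in range(n)]
    for i in range(n):
        j = (i + 1) % n
        J[i][j] = J[j][i] = -1.0
    return J


def iterate(J, alpha, beta, steps, noise_std=0.01, seed=0):
    """Alternate eq 1 (nonlinear transfer) and eqs 2-3 (linear feedback)."""
    rng = random.Random(seed)
    n = len(J)
    f = [0.0] * n  # feedback starts at zero; the noise seeds the dynamics
    x = [0.0] * n
    for _ in range(steps):
        # Eq 1: x_i = cos^2(f_i - pi/4 + eta_i) - 1/2
        x = [math.cos(f[i] - math.pi / 4 + rng.gauss(0.0, noise_std)) ** 2 - 0.5
             for i in range(n)]
        # Eqs 2-3: f = (alpha*I + beta*J) x, written as a dense MVM
        f = [alpha * x[i] + beta * sum(J[i][j] * x[j] for j in range(n))
             for i in range(n)]
    return x


def cut_value(J, x):
    """Count edges (i < j, J[i][j] != 0) whose spins have opposite signs."""
    n = len(J)
    sigma = [1 if v >= 0 else -1 for v in x]
    return sum(1 for i in range(n) for j in range(i + 1, n)
               if J[i][j] != 0 and sigma[i] != sigma[j])


J = build_ring_couplings(8)
spins = iterate(J, alpha=0.2, beta=0.5, steps=200)
print("cut value:", cut_value(J, spins))
```

For this toy ring, α = 0.2 and β = 0.5 were picked so that only the alternating spin pattern has a linear gain above 1 (α + 2β = 1.2), which is the pattern that cuts every edge. The cut is counted directly from the edge list rather than from the Hamiltonian expression.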


Figure 1. Schematic architecture of the optoelectronic Ising machine with two potential operational schemes.

Based on the above derivation, the evolution of the Ising spins involves both a linear MVM and a nonlinear transfer function. The nonlinear transfer function can be implemented through the EO transfer function of the MZM; it is equally crucial to map the linear MVMs in eq 3 efficiently onto the optoelectronic IM architecture, especially when digital hardware resources are limited. Two schemes, illustrated in Figure 1, are implemented in this study: Scheme I accelerates massive MVMs with a purpose-designed algorithm on the FPGA, and Scheme II improves energy efficiency and scalability under constrained digital hardware resources by migrating the multiplications to the optoelectronic modulation. Details of the two schemes can be found in Sections Demonstration of Scheme I and Demonstration of Scheme II.

Experiments and Results


Characterization of the TFLN Chip

As illustrated in Figure 1, the optoelectronic IM primarily comprises two sections: a commercial FPGA module in conjunction with an ADC and a DAC and a laboratory-fabricated TFLN-folded MZM (FMZM) with an InGaAs/InP photodiode (PD). Before presenting experiment results based on optoelectronic IM, it is necessary to introduce and characterize the on-chip devices first.
The chip includes two grating couplers (GCs) as the fiber-to-chip interface, an FMZM, and an InGaAs/InP photodiode, as depicted in Figure 2a–c. The FMZM is configured in a single-drive push–pull arrangement and includes a waveguide crossing, two 3 dB 1 × 2 multimode interferometer couplers, and a U-turn coplanar-line traveling-wave electrode (TWE) with a 50 Ω terminator bonded at its terminal end (detailed information on the TWE simulation can be found in Supporting Information, Section S1). The waveguide crossing is placed in the U-turn section to maintain the same phase variation in each U-turn MZI arm. Both the RF excitation ports and the optical input GCs of the FMZM are aligned on the same edge of the chip, which simplifies the optical and electrical interfaces during testing and packaging. A commercial high-speed InGaAs/InP PD with a responsivity of 0.8 A/W and a 3 dB bandwidth of up to 18 GHz is flip-chip-bonded on the output GC. Detailed information regarding the fabrication of the FMZM can be found in Supporting Information, Section S2. The FMZM has a total optical loss of about 10 dB (with on-chip insertion losses of ∼3 dB and a GC coupling loss of ∼3.5 dB/facet) and an extinction ratio of ∼35 dB. For the whole chip, consisting of an FMZM and a PD, a half-wave voltage Vπ of 3 V and a 3 dB bandwidth of ∼12.5 GHz are measured, as shown in Figure 2d,e.
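As a quick consistency check of the quoted loss figures, the loss budget can be written out as below. It assumes the total is simply the on-chip insertion loss plus one coupling loss at each of the two grating couplers; the symbols $L_{\mathrm{total}}$, $L_{\mathrm{chip}}$, and $L_{\mathrm{GC}}$ are introduced here only for illustration.

```latex
\[
L_{\mathrm{total}} \approx L_{\mathrm{chip}} + 2\,L_{\mathrm{GC}}
  = 3\ \mathrm{dB} + 2 \times 3.5\ \mathrm{dB} = 10\ \mathrm{dB},
\qquad
10^{-10/10} = 0.1 .
\]
```

Under this assumption, roughly one tenth of the input optical power reaches the photodiode.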


Figure 2. Microscopy images of (a) a partially enlarged detail of the waveguide crossing, (b) the TFLN chip, and (c) a partially enlarged detail of the bonded PD and terminator. (d) Measured Vπ: the linearly scanned 200 kHz sawtooth input waveform (red dashed line) and the PD output (transmission, blue solid line). (e) Measured bandwidth (S21 parameter) of the whole device.

Demonstration of Scheme I

In this section, Scheme I of the optoelectronic IM depicted in Figure 1 is demonstrated. This scheme shares a similar idea to that from a previous work, (15) where the nonlinear transformations and linear MVMs in the Ising evolution are performed by the optical chip and electrical FPGA, respectively. To address the particular challenge of performing large-scale calculations with low latency, given the constraints of limited digital hardware resources (specifically, a Xilinx KU115 FPGA, paired with a DAC and an ADC operating at a 2.6 GHz sampling rate, is utilized in our demonstration), an efficient SpMV algorithm embedded in the FPGA is introduced.

Methods of SpMV Based on FPGA

Because the spins in the Ising network are often not densely coupled, the real matrix W (eq 3) is sparse in many combinatorial optimization problems. The SpMV algorithm is therefore introduced to fully utilize the hardware resources of the FPGA. In this approach, the sparse matrix W is transformed into the Compressed Sparse Row (CSR) format. As in the example shown in Figure 3a, three vectors, val, col, and ptr, are constructed in the CSR format. The sparsely distributed nonzero elements of W are stored densely in the vector val in row-major order. The column index of every nonzero element of W is recorded in the vector col, and the position in val of the first nonzero element of every row is stored in ptr.
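The CSR construction described above can be illustrated with a short Python sketch. The function name `to_csr` and the 4 × 4 example matrix are assumptions made for illustration and are not taken from Figure 3a; the sketch also follows the common convention of appending one trailing entry to ptr (the total number of nonzeros) so that every row's range can be read as ptr[i] to ptr[i+1].

```python
def to_csr(dense):
    """Convert a dense row-major matrix (list of lists) into CSR vectors."""
    val, col, ptr = [], [], [0]
    for row in dense:
        for j, w in enumerate(row):
            if w != 0:
                val.append(w)   # nonzero values, stored densely row by row
                col.append(j)   # column index of each stored value
        ptr.append(len(val))    # start of the next row (end of this one)
    return val, col, ptr


# Hypothetical W = alpha*I + beta*J with alpha = 1.2 and a few -1 couplings
W = [
    [1.2, 0.0, -1.0, 0.0],
    [0.0, 1.2, 0.0, 0.0],
    [-1.0, 0.0, 1.2, -1.0],
    [0.0, 0.0, -1.0, 1.2],
]
val, col, ptr = to_csr(W)
print(val)  # [1.2, -1.0, 1.2, -1.0, 1.2, -1.0, -1.0, 1.2]
print(col)  # [0, 2, 1, 0, 2, 3, 2, 3]
print(ptr)  # [0, 2, 3, 6, 8]
```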


Figure 3. (a) Example for the CSR format of a sparse matrix. (b) Inner architecture of an AMU. (c) Architecture of a SpMV MACC.

Next, the prepared CSR data are loaded into the computation architecture based on the multiplication-accumulation core (MACC). The elements of val and col that belong to the same row are fed into a MACC consisting of a series of access-and-multiplication units (AMUs) followed by a summarization tree (ST), as shown in Figure 3c. Each AMU, illustrated in Figure 3b, multiplies a nonzero element val of W by the associated element xcol of the vector x (eq 3) and outputs the product. Specifically, the element xcol is read from the Block RAM (BRAM) at the index given by col and is then multiplied by the corresponding matrix element val in a multiplier. The outputs of all AMUs in a MACC are then accumulated by the ST to obtain one element of the vector f in eq 3. Multiple MACCs can be deployed simultaneously to obtain multiple elements of f in parallel, which enhances the parallel computing performance and reduces the latency.
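A software analogue of the MACC data flow described above is sketched below. It is a sequential emulation under stated assumptions, not the FPGA design: the function names `amu`, `summation_tree`, and `spmv_csr` are hypothetical, the CSR vectors are a made-up 4 × 4 example, and each loop iteration stands in for one MACC that would run in parallel in hardware.

```python
def amu(val_k, col_k, x):
    """Access-and-multiplication unit: fetch x[col] and multiply by val."""
    return val_k * x[col_k]


def summation_tree(products):
    """Pairwise (tree-shaped) reduction of the AMU outputs."""
    level = list(products)
    if not level:
        return 0.0
    while len(level) > 1:
        nxt = [level[i] + level[i + 1] for i in range(0, len(level) - 1, 2)]
        if len(level) % 2:
            nxt.append(level[-1])  # odd element passes through unchanged
        level = nxt
    return level[0]


def spmv_csr(val, col, ptr, x):
    """Compute f = W x from CSR vectors, one row reduction per output element."""
    f = []
    for i in range(len(ptr) - 1):
        products = [amu(val[k], col[k], x) for k in range(ptr[i], ptr[i + 1])]
        f.append(summation_tree(products))
    return f


# Hypothetical CSR data for a 4 x 4 matrix with 0.5 on the diagonal
val = [0.5, -1.0, 0.5, -1.0, 0.5, -1.0, -1.0, 0.5]
col = [0, 2, 1, 0, 2, 3, 2, 3]
ptr = [0, 2, 3, 6, 8]
x = [1.0, -1.0, 1.0, -1.0]
print(spmv_csr(val, col, ptr, x))  # [-0.5, -0.5, 0.5, -1.5]
```

The tree reduction mirrors the role of the ST: with n products it needs about log2(n) addition levels instead of n − 1 sequential additions.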
It is noted that the aforementioned calculation process is only feasible under the assumption that the maximum number of nonzero elements in each row of W is not greater than the number of AMUs in a single MACC. If this assumption is broken, the configurable parallel accumulator and the bubble layers should be additionally involved but only slightly affect the latency and energy consumption of the scheme, the details of which can be found in Supporting Information, Section S3. A trade-off should be made among logic resources, spin number, and computational latency to facilitate the solution of a large-scale Ising problem by using limited hardware resources.
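The single-pass condition stated above (no row of W has more nonzero elements than one MACC has AMUs) can be checked directly from the CSR row pointer. The sketch below is a minimal illustration; the function names and the parameter `amus_per_macc` are hypothetical, and it does not model the parallel accumulator or bubble layers described in the Supporting Information.

```python
def max_row_nnz(ptr):
    """Largest number of nonzero elements in any row, read from the CSR pointer."""
    return max(ptr[i + 1] - ptr[i] for i in range(len(ptr) - 1))


def fits_single_pass(ptr, amus_per_macc):
    """True if every row can be handled by one MACC in a single pass."""
    return max_row_nnz(ptr) <= amus_per_macc


ptr = [0, 2, 3, 6, 8]  # hypothetical row pointer of a 4 x 4 sparse matrix
print(max_row_nnz(ptr))          # 3
print(fits_single_pass(ptr, 4))  # True
print(fits_single_pass(ptr, 2))  # False
```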

Results of Scheme I

Two types of MAX-CUT tasks are demonstrated for Scheme I. The first task involves a 32 × 64 checkerboard graph (comprising a total of 2048 spins) with coupling interactions between all adjacent spin pairs, as depicted in Figure 4a. We note that the checkerboard graph problem does not fall into the category of NP-hard problems; (33) this task is instead intended to demonstrate the efficient implementation of both the linear MVM and the nonlinear transformation on the hardware of Scheme I. The demonstration includes 20 independent trials, each consisting of 2000 iterations. Each iteration requires approximately 1.78 μs, corresponding to an iteration rate of approximately 0.56 MHz. As illustrated in Figure 4b, the ground state (GS) energy, with a maximum cut value of 4000, can be achieved with a 100% success rate within 214 iterations in a trial under parameters α = 1.2 and β = 1.0 (as defined in eq 3). Figure 4c shows a histogram of the number of trials achieving various cut values (expressed as percentages of the GS value) for the 20 independent trials. All trials reach at least 96.5% of the GS cut value, and better performance could potentially be obtained by fine-tuning the parameters α and β. This indicates the robust performance of the optoelectronic IM in efficiently tackling MAX-CUT problems associated with large-scale lattice graphs.
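The figures quoted for the checkerboard task can be cross-checked with a few lines of arithmetic. The sketch assumes a nearest-neighbour lattice with open (non-periodic) boundaries, which is an inference from the quoted maximum cut of 4000 rather than something stated in the text; the function name is hypothetical.

```python
def grid_edge_count(rows, cols):
    """Nearest-neighbour edges of a rows x cols lattice with open boundaries."""
    return rows * (cols - 1) + (rows - 1) * cols


rows, cols = 32, 64
spins = rows * cols
edges = grid_edge_count(rows, cols)
iteration_time_s = 1.78e-6

print(spins)                    # 2048
print(edges)                    # 4000
print(1.0 / iteration_time_s)   # iteration rate in Hz
print(2000 * iteration_time_s)  # duration of one 2000-iteration trial in s
```

Because such a lattice is bipartite, every edge can be cut at once, so the edge count of 4000 coincides with the maximum cut value; the iteration rate comes out near 0.56 MHz and one trial lasts about 3.56 ms.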


Figure 4. (a) Checkerboard graph. (b) Evolution of spin amplitudes and cut value of the checkerboard graph. (c) Success rate of the checkerboard graph. (d) G22 graph. (e) Evolution of spin amplitudes and cut value of the G22 graph. (f) Success rate of the G22 graph.

To further demonstrate the capability of Scheme I in addressing NP-hard challenges, an additional MAX-CUT task on the G22 graph, an acknowledged NP-hard benchmark problem, (34) is explored. Figure 4d depicts the G22 graph, which consists of 2000 spins interconnected by 19,990 randomly and sparsely distributed edges (41,980 nonzero elements in matrix W). As with the checkerboard graph, 20 independent trials are performed, each comprising 2000 iterations with a computation time of approximately 1.78 μs per iteration. Here, the parameters are set to α = 0.4 and β = 0.8. Following the time evolution shown in Figure 4e, the spin states eventually converge to a local minimum of the Hamiltonian landscape, yielding a best cut value of 13,052. This value is approximately 97.7% of the best-known (BK) value of 13,359. (34) Figure 4f shows that all 20 trials achieve cut values surpassing 96.6% of the BK value. It is noteworthy that, as indicated in Figure 4e, the spin amplitude inhomogeneity persists even as the cut value closely approaches the maximum cut value after 900 iterations. This inhomogeneity tends to trap the spin states in local optima, (35) which could potentially be mitigated by dynamically controlling the parameters (22) or by adding auxiliary error variables. (36)

Demonstration of Scheme II

In Scheme I of the optoelectronic IM, the linear matrix multiplication in eq 3 is exclusively executed within the FPGA. While the utilization of the SpMV algorithm effectively exploits FPGA resources and reduces computational latency, it must be acknowledged that further enhancing the scalability of the IM may encounter bottlenecks in the available FPGA hardware resources, especially when dealing with ultralarge-scale Ising spin coupling networks (e.g., spin number exceeding 10,000). Although application-specific integrated circuits (ASICs) or advanced digital modules might enhance performance to an extent, they remain limited by advancements in on-chip element density and the rate of progress dictated by Moore’s Law. Therefore, it becomes imperative to explore alternative solutions to further improve the scalability of the optoelectronic IM, even within the constraints of limited electrical hardware resources. Scheme II, implemented on the optoelectronic IM setup as depicted in Figures 1 and 5, presents a novel architecture designed to overcome these limitations.

Figure 5. Schematic diagram for IM with proposed Scheme II.

Methods of SpMV by Modulations

The SpMV algorithm presented in Section Demonstration of Scheme I is also adopted in Scheme II, but the multiplications in the MVM operations are relocated from the FPGA to the on-chip FMZM. As illustrated in Figure 5, the nonzero elements of the N × N sparse matrix W (refer to eq 3) are sequentially extracted row by row and arranged into an Nnz-dimensional vector W¯, where Nnz represents the total number of nonzero elements in W. At the same time, the elements of the vector f(k) (refer to eq 1) whose indices correspond to the column indices of the nonzero elements in W are selected and organized into an Nnz-dimensional vector f¯(k).
The optical signal, whose intensity is serially modulated with the elements of vector W¯, is coupled into the on-chip FMZM. At the same time, the vector f¯(k) is serially encoded to drive the on-chip FMZM, resulting in an optical intensity signal vector x¯(k) = sin f¯(k) (refer to eq 1) due to the nonlinear (sine-shaped) EO modulation transfer function. By carefully aligning the corresponding elements of vectors W¯ and x¯(k) in the time domain, the PD can extract the element-wise products f¯(k+1) = W¯ × x¯(k) (× stands for the Hadamard product). The accumulations of the MVM operations are then performed in the digital domain using the FPGA to generate the desired vector f(k+1) with N spins: the elements of f¯(k+1) are divided into N groups according to the row index i of the corresponding matrix element in W, and each group is accumulated to give fi(k+1), i = 1, 2, ···, N. Consequently, Scheme II uses a TDM strategy that processes the element-wise multiplications serially, so the latency grows linearly with the number of nonzero elements.
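The flatten, multiply, and accumulate steps described above can be sketched numerically. This is an illustration under stated assumptions rather than the authors' implementation: the nonlinearity is taken literally as x = sin(f), the gain and coupling parameters α and β of eq 3 are omitted, all values are treated as signed real numbers (the intensity encoding is ignored), and the function names are hypothetical.

```python
import numpy as np

def flatten_sparse(W):
    """Extract the nonzeros of W row by row: values, row indices, column indices."""
    rows, cols = np.nonzero(W)  # indices are returned in row-major order
    return W[rows, cols], rows, cols

def scheme2_step(W, f):
    """One feedback step: serial element-wise products, then per-row accumulation."""
    w_bar, rows, cols = flatten_sparse(W)
    f_bar = f[cols]                # f(k) gathered at the column index of each nonzero
    x_bar = np.sin(f_bar)          # stand-in for the sine-shaped EO transfer function
    products = w_bar * x_bar       # Hadamard product, one symbol per nonzero
    f_next = np.zeros(W.shape[0])
    np.add.at(f_next, rows, products)  # sum the products that share a row index
    return f_next

rng = np.random.default_rng(0)
N = 6
W = rng.normal(size=(N, N)) * (rng.random((N, N)) < 0.4)  # random sparse matrix
f = rng.normal(size=N)

result = scheme2_step(W, f)
reference = W @ np.sin(f)  # dense matrix-vector product for comparison
print(np.allclose(result, reference))
```

The serial stream has exactly Nnz entries, so the number of symbols sent through the modulator per iteration equals the number of nonzeros of W, not N².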
It is noteworthy that both the spin states and the elements of the coupling matrix are encoded as the intensity of light, i.e., as non-negative real values. To accommodate calculations involving both positive and negative real-value spins and weights within the Ising model, opposite copies of spins and weight data should be encoded on light as well. Further details on this technique can be found in Supporting Information, Section S4.
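One generic way to handle signed values with non-negative intensities is to split each quantity into a positive and a negative part and combine four non-negative products. The sketch below illustrates that idea only; it is an assumption for illustration and may differ from the technique detailed in Supporting Information, Section S4. The function names are hypothetical.

```python
import numpy as np

def split_signed(v):
    """Split a real vector into non-negative parts so that v = pos - neg."""
    return np.maximum(v, 0.0), np.maximum(-v, 0.0)

def signed_products(w, x):
    """Recover w * x using only products of non-negative quantities."""
    w_pos, w_neg = split_signed(w)
    x_pos, x_neg = split_signed(x)
    same_sign = w_pos * x_pos + w_neg * x_neg      # contributes positively
    opposite_sign = w_pos * x_neg + w_neg * x_pos  # contributes negatively
    return same_sign - opposite_sign

w = np.array([1.0, -1.0, 0.5, -0.25])
x = np.array([0.3, 0.8, -0.6, -1.0])
recovered = signed_products(w, x)
print(np.allclose(recovered, w * x))
```

In this form, every quantity placed on the light is non-negative, and the final subtraction is left to the digital domain, at the cost of sending extra symbols per product.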

Experimental Setup of Scheme II

Given the limited number of channels supported by the DAC employed in the experiment, the originally intended real-time processing by the FPGA is replaced with offline processing. As shown in the schematic diagram in Figure 5, this alternative approach uses a multichannel arbitrary waveform generator (AWG, Keysight M8195, with 8 bit resolution) and a real-time oscilloscope (RTO, Tektronix DSA73304D, with 8 bit resolution). A 6.25 Gb/s intensity-modulated optical signal, serially encoded with the elements of the vector W¯ generated from one channel of the AWG, is coupled into the on-chip FMZM. Simultaneously, the FMZM is biased at the quadrature point and driven by a 6.25 Gb/s electrical signal encoded with the elements of the vector f¯(k) from another channel of the AWG. It should be emphasized that the time-domain alignment between the two signals, which carry the elements of the vectors W¯ and f¯(k), respectively, is achieved by adjusting a software-defined channel-wise delay. With a relatively large modulation depth (i.e., a peak-to-peak electrical signal amplitude approaching Vπ), the FMZM not only provides the appropriate nonlinear transformation of f¯(k) but also performs the Hadamard product f¯(k+1) = W¯ × x¯(k). Afterward, the optical signal is detected by the on-chip PD. Finally, the generated electrical signal is amplified by the electrical amplifier and sampled by the RTO, and the accumulation operations are performed offline on a computer (Intel Core i7-8700) to generate the desired vector f(k+1).

Results of Scheme II

A 128 × 128 checkerboard graph comprising 16,384 spins is used to demonstrate the feasibility of Scheme II of the optoelectronic IM in solving large-scale problems. In the time evolution depicted in Figure 6a, the spin amplitudes grow toward saturation within the first 100 iterations, accompanied by a rapid increase in the cut value C. Over the subsequent iterations, from the 100th to the 2000th, spins flip before settling at saturation states, while the cut value C gradually approaches that of the ground state.

Figure 6. (a) Evolution of spin amplitudes and cut value of the checkerboard graph in Scheme II. (b–f) Snapshots of the spin graph at the 2nd, 20th, 50th, 100th, 200th, 500th, 1000th, and 2000th iterations.

Figure 6b provides snapshots of the evolving graph at the 2nd, 20th, 50th, 100th, 200th, 500th, 1000th, and 2000th iterations. The formation of spin “domains” on the checkerboard graph is observed within the first 100 iterations. During the subsequent converging stages, spins flip, causing most domains to shrink until a stable global state is reached. This convergence of the pattern toward the ground state is compelling evidence that Scheme II of the optoelectronic IM is capable of imparting gain and coupling effects on spins, steering the evolution toward the desired ground states.

Comparison on Key Metrics of Schemes I and II

To comprehensively evaluate the feasibility of Schemes I and II of the optoelectronic IM, key metrics of the two schemes, namely latency and energy efficiency, are compared in Table 2. These metrics are crucial because they relate to operations and processes whose latency and energy consumption increase with the number of Ising spins N or nonzeros Nnz.
Table 2. Comparison of the Proposed Schemes (a)

| scheme | electrical processing devices | data baud rate | time complexity | scalability | transmission latency (s/iteration) | operation latency (s/iteration) | energy efficiency |
| Scheme I | FPGA (Xilinx KU115) and ADC/DAC (2.6 GS/s) | 2.6 GBaud | | N = 2000, Nnz = 41,980 | ∼0.5 × 10^–6 | ∼1.28 × 10^–6 | 51.9 pJ/MAC (L) + 35 pJ/symbol (NL) |
| Scheme II | PC (Intel Core i7-8700) and AWG, OSC | 6.25 GBaud | | N = 16,384, Nnz = 81,408 | ∼0.4 | ∼370 × 10^–6 | 42.6 pJ/MAC (total) |
| Scheme II (predicted) | FPGA | 100 GBaud | | N > 16,384, Nnz > 81,408 | ∼0.5 × 10^–6 | ∼2.14 × 10^–6 | 2.4 pJ/MAC (total) |

(a) AWG: arbitrary waveform generator, OSC: oscilloscope, N: number of spins, Nnz: number of nonzeros, L/NL: linear/nonlinear operation.
Latency

The latency of the optoelectronic IM in our demonstrations can be divided into two parts: transmission latency and operation latency.
The transmission latency refers to the time required for data transmission between the individual constituent units of the optoelectronic IM. It can be measured through the communication protocol between the FPGA and the ADC/DAC, or with the profiler of the remote processing software MATLAB.
The operation latency primarily concerns the time required for the MVM operations within the Ising calculations, as the nonlinear transformations in both schemes are implemented in-line by the on-chip FMZM. The FPGA computational latency can be obtained from the FPGA design tool Vivado, and the modulation latency can be evaluated as the number of spins (N in Table 2, for Scheme I) or nonzero elements (Nnz in Table 2, for Scheme II) divided by the baud rate B of the ADC/DAC and modulation, i.e., N/B or Nnz/B. For Scheme I, the total operation latency is the combined duration of the FPGA calculation latency for the MVM operations and the modulation latency for the nonlinear transformations. By contrast, Scheme II includes the FPGA calculation latency for the accumulations only and the modulation latency for both the multiplications and the nonlinear transformations.
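Because a serial stream of n symbols at B symbols per second occupies n/B seconds, the modulation part of the operation latency can be estimated directly from the values in Table 2. The sketch below is a back-of-the-envelope estimate under the assumption of one symbol per spin (Scheme I) or per nonzero (Scheme II); it covers the serial symbol time only, and the function name is illustrative.

```python
def modulation_latency(num_symbols, baud_rate):
    """Seconds needed to send num_symbols serially at baud_rate symbols per second."""
    return num_symbols / baud_rate

# Values taken from the text and Table 2.
scheme1 = modulation_latency(2048, 2.6e9)              # N symbols at 2.6 GBaud
scheme2 = modulation_latency(81_408, 6.25e9)           # Nnz symbols at 6.25 GBaud
scheme2_predicted = modulation_latency(81_408, 100e9)  # Nnz symbols at 100 GBaud

for name, t in [("Scheme I", scheme1),
                ("Scheme II", scheme2),
                ("Scheme II (predicted)", scheme2_predicted)]:
    print(f"{name}: {t * 1e6:.2f} us")
```

These estimates are smaller than the operation latencies listed in Table 2, which also include the FPGA or offline computation time and, presumably, other overheads that this sketch does not model.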
As demonstrated in Table 2, a latency of 1.78 μs per iteration can be reached in the demonstration of Scheme I with a spin number (N in Table 2) of 2048, including ∼0.5 μs for data transmission and the remaining ∼1.28 μs for linear MVM operations. This low latency can be attributed to the efficient SpMV algorithm, the highly parallel computing units, and the efficient pipelines for data scheduling on the FPGA.
Scheme II has a large transmission latency, primarily caused by the remote data communication among the AWG, RTO, and computer (see Figure 5 for the demonstration setup). This transmission latency can be significantly reduced by using multichannel DACs/ADCs together with an FPGA to conduct real-time processing in the implementation of Scheme II. As illustrated by the example in the third row of Table 2, a low transmission latency of approximately 0.5 μs is predicted for Scheme II when the Xilinx KU115 FPGA is used with multichannel DACs and ADCs operating at a 2.6 GHz sampling rate, as in the demonstration of Scheme I.
An operation latency of ∼370 μs was achieved in the Scheme II demonstration with a 16,384-spin checkerboard graph. This value is predicted to decrease to ∼2.14 μs, provided that an optical modulator, (37,38) PDs, (39) and an ADC/DAC (40) supporting a 100 GBaud signal are used in the demonstration of Scheme II (refer to the third row of Table 2).

Energy Efficiency

Energy efficiency, which refers to the energy cost per single operation, is also evaluated in this section. Here, the main focus is the energy consumption of the linear MVMs and nonlinear transformations within the Ising calculations.
For Scheme I, as presented in the Demonstration of Scheme I, the linear MVMs, defined as single operations encompassing multiplication and accumulation (MAC), were implemented using the FPGA, achieving an energy efficiency of 51.9 pJ/MAC. The nonlinear transformations required for Ising computations were performed using on-chip FMZM, yielding an energy efficiency of 35 pJ/symbol. More detailed calculations can be found in Supporting Information, Section S5.
In Scheme II, both the multiplications and the nonlinear transformations are executed concurrently by the on-chip FMZM, operating at a Vπ of 3 V, leading to an energy efficiency of 14.4 pJ/OP. The accumulations are then performed using the FPGA, with an energy efficiency of 28.2 pJ/OP. As a result, Scheme II attains a total energy efficiency of 42.6 pJ/MAC. By incorporating an advanced on-chip MZM with a bandwidth of up to 100 GHz and a low Vπ of less than 1 V, (17,37) Scheme II is anticipated to improve its energy efficiency to as low as 2.4 pJ/MAC. Detailed calculations are provided in Supporting Information, Section S5.
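The Scheme II total follows from adding the two per-operation figures, and a rough energy per iteration can be obtained by multiplying by the number of nonzeros. The sketch below assumes exactly one MAC per nonzero element per iteration and ignores any extra symbols needed for signed-value encoding; the per-iteration figures are therefore illustrative estimates, not values reported by the authors.

```python
PJ = 1e-12  # one picojoule in joules

fmzm_energy = 14.4 * PJ   # multiplication + nonlinear transform on the FMZM, per OP
accum_energy = 28.2 * PJ  # accumulation on the FPGA, per OP
total_per_mac = fmzm_energy + accum_energy

def energy_per_iteration(num_nonzeros, energy_per_mac):
    """Energy in joules for one iteration, assuming one MAC per nonzero element."""
    return num_nonzeros * energy_per_mac

nnz = 81_408  # 128 x 128 checkerboard graph, from Table 2
demonstrated = energy_per_iteration(nnz, total_per_mac)
predicted = energy_per_iteration(nnz, 2.4 * PJ)

print(f"total: {total_per_mac / PJ:.1f} pJ/MAC")
print(f"demonstrated: {demonstrated * 1e6:.2f} uJ per iteration")
print(f"predicted: {predicted * 1e6:.3f} uJ per iteration")
```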

Conclusions and Discussion

In this paper, two schemes of optoelectronic IM have been experimentally demonstrated using the on-chip TFLN modulator hybrid-integrated with the InGaAs/InP PD. In Scheme I, the nonlinear transformations of the Ising evolution were performed by the on-chip TFLN modulator, while the linear MVM operations in the feedback signal calculation of the Ising evolution were carried out by the electrical FPGA. The SpMV algorithm embedded in the FPGA was introduced to efficiently utilize the limited hardware resources and achieve large-scale Ising calculations with low latency. MAX-CUT tasks for graphs with 2048 spins were successfully realized with a computational latency of only 1.78 μs per iteration. To improve the scalability under digital hardware constraints, Scheme II was introduced, in which a single on-chip TFLN modulator not only provided a natural nonlinear transfer function but also conducted the multiplications within the MVM operations. The latter functionality was essential for addressing bottlenecks in the available FPGA hardware resources, especially when dealing with large-scale computations. A large-scale MAX-CUT task with 16,384 spins was demonstrated based on Scheme II. This is the largest problem scale ever solved on an on-chip IM, and it emphasizes the potential for expanding the computing scale of the optoelectronic IM, even in scenarios where digital resources are limited.
In the presented work, only a simple set of TFLN modulators and a PD is integrated on-chip. Looking ahead, ASICs, DACs, and ADCs (which would replace the FPGAs, AWGs, and oscilloscopes used in the presented demonstrations) as well as light sources all have the potential to be integrated or copackaged on the same chip by copackaged optics techniques (41) to achieve a smaller footprint, higher bandwidth, and lower energy consumption. In Schemes I and II, the use of EO modulation aligns the nonlinear transformation seamlessly with EO signal conversion and multiplication, thereby eliminating the need for additional computational time and energy for the nonlinear transformation. This approach contrasts with all-optical on-chip nonlinearities, (24,25) which, despite their low latency and energy consumption, require extra on-chip photonic components and pulse pump sources. Consequently, EO modulation is preferred in these schemes.
While the TDM scheme based on high-speed EO modulation devices has been demonstrated with characteristics such as high scalability, low computational latency, compatibility with multiple problem-solving scenarios, and the potential to enhance computational efficiency, it is possible to introduce multidimensional multiplexing strategies in future studies. These strategies may involve combining TDM with space-division multiplexing (SDM) and wavelength-division multiplexing (WDM) techniques. Such an integration scheme aims to fully leverage the strengths of different strategies, leading to a more substantial improvement in the performance of optoelectronic IMs.

Strategy of TDM Combined with SDM

Photonic devices are typically comparable in size to the optical wavelength, ranging from hundreds of nanometers to many micrometers, which limits the density of integrated photonic elements. This density cannot reach the levels achieved by electronic elements, whose typical size is much smaller, on the order of several nanometers. (42) Consequently, the spatial scalability of IMs based on integrated photonics has inherent limitations. This challenge is further exacerbated when the fabrication and implementation complexity of large-scale SDM photonic devices is taken into consideration.
However, high-speed optoelectronic devices enable large-scale optical computation with high computing throughput, even though the number of spatially integrated photonic/optoelectronic elements is significantly lower than that of electronic elements. For instance, as shown in Figure 7b and based on Scheme II presented in Section Demonstration of Scheme II, deploying 4 sets of parallel high-speed MZMs (37,38) and PDs (39) (e.g., 100 GHz bandwidth) can yield competitively high computing speeds (e.g., 400 × 10^9 multiplications per second) for large-scale MVMs within the Ising calculations. This stands in contrast to the electronic scheme depicted in Figure 7a, which employs large-scale SDM elements at a relatively low computational clock rate. In addition, the low transmission loss of the optical waveguides between individual photonic elements contributes to improving the energy efficiency of large-scale IMs based on integrated photonics.
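The quoted throughput is the product of the number of parallel modulator/PD sets (the SDM factor) and the symbol rate of each set (the TDM factor). A minimal sketch, assuming each set performs one multiplication per symbol at 100 GBaud; the function name is illustrative.

```python
def multiplications_per_second(num_parallel_sets, baud_rate):
    """Aggregate rate, assuming one multiplication per symbol in each parallel set."""
    return num_parallel_sets * baud_rate

throughput = multiplications_per_second(4, 100e9)

# Time to stream the 81,408 nonzeros of the 128 x 128 checkerboard graph
# if they were split evenly across the parallel sets.
stream_time = 81_408 / throughput

print(f"{throughput:.1e} multiplications per second")
print(f"{stream_time * 1e6:.3f} us per iteration (multiplications only)")
```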

Figure 7. Schematic diagrams of the architectures utilizing TDM and SDM for MVM computing based on (a) digital electronics (Scheme I) and (b) OE modulation (Scheme II). Inset: TDM and SDM, which jointly contribute to computation throughput. (c) Concept architecture with additionally introduced WDM.

Strategy of TDM Combined with WDM and SDM

An additional computational dimension can be theoretically introduced when utilizing the WDM strategy. As illustrated in Figure 7c, by injecting optical carriers with varying weights (e.g., elements in the matrix W of eq 3) at different wavelengths into the on-chip TDM IM depicted in Figure 1, it is possible to achieve both linear and nonlinear operations within the Ising model. Scaling the IM architectures in spatial dimensions can further increase the computational scale.
However, in practical implementations, the achievement of reconfigurable on-chip optical spectrum modulators capable of accommodating multiwavelength-weight loading often comes with the limitation of occupying on-chip spatial resources. (43,44) This implies that the wavelength dimension is not entirely independent of the spatial dimension in this context. Consequently, the WDM strategy encounters challenges akin to those faced by the SDM strategy, characterized by limited scalability.
Nevertheless, the WDM strategy presents an attractive way to realize on-chip fan-in and fan-out of optical signals for Ising calculations. The direct detection of multiple wavelengths at the PDs provides robust, phase-insensitive intensity accumulation for multiple signals. (45−50) High-sensitivity PD detection, which in principle requires no additional energy because of efficient photon-electron conversion, (51) thereby saves a substantial portion of the power consumed by the digital accumulations mentioned in Section Energy Efficiency. Moreover, fanning out different wavelengths to multiple computation cores through WDM filters can achieve a lower insertion loss than schemes based on power splitting.

Supporting Information


The Supporting Information is available free of charge at: https://pubs.acs.org/doi/10.1021/acsphotonics.4c00003.

  • Additional details about simulation of the travel-wave electrodes, fabrication of the on-chip devices, multiplication-accumulation core with configurable parallel accumulator and bubble layers, real-value spin and weight calculation based on intensity modulation, and calculation of energy efficiency (PDF)


Supporting Information:
Scalable On-chip Optoelectronic Ising Machine Utilizing Thin-Film Lithium Niobate Photonics
Zhenhua Li, Ranfeng Gan, Zihao Chen, Zhaoang Deng, Ran Gao, Kaixuan Chen, Changjian Guo, Yanfeng Zhang, Liu Liu, Siyuan Yu, and Jie Liu
E-mail: changjian.guo@coer-scnu.org; liujie47@mail.sysu.edu.cn

Content
Section S1. Simulation of the Travel-Wave Electrodes
Section S2. Fabrication of the On-Chip Devices
Section S3. Multiplication-Accumulation Core with Configurable Parallel Accumulator and Bubble Layers
Section S4. Real-Value Spin and Weight Calculation Based on Intensity Modulation
Section S5. Calculation of Energy Efficiency

S1 Simulation of the Travel-Wave Electrodes

Traveling-wave electrodes are widely used in integrated photonic electro-optic modulators due to their compatibility with various processes and flexible design. These modulators, especially those using thin-film lithium niobate, face challenges in optimizing performance. The electro-optic bandwidth is limited mainly by the mismatch between the optical waveguide modes’ refractive index and the microwaves’ effective refractive index, along with a disparity in characteristic impedance. Additionally, achieving a low half-wave voltage (Vπ) often involves narrowing the electrode gap. However, this increases optical absorption as the electrodes are closer to the waveguide, necessitating a balance in the design phase to optimize these parameters.

The Vπ, critical for modulator efficiency, is calculated using the formula:

$$V_\pi = \frac{\lambda}{2 \cdot \Delta n_{\mathrm{eff}} \cdot L} \tag{S1}$$

where λ represents the wavelength of light in the waveguide, Δn_eff is the refractive index difference between the electrical field and the optical mode, and L is the effective interaction length.

Utilizing COMSOL’s multiphysics capabilities, the interface between the optical waveguide mode field and the high-frequency electromagnetic field of the traveling-wave electrode can be effectively designed and simulated (see Figure S1). This approach optimizes device performance by focusing on the waveguide and electrode’s vertical propagation direction, significantly streamlining the computational aspect of design.

Simulations reveal that the optimized device achieves a half-wave voltage of 3.8 V·cm in push-pull mode, with a characteristic impedance of 56 Ω, an optical field group refractive index of 2.2, and a microwave effective refractive index of 1.99. By integrating a 56 Ω load at the electrode’s distal end, the modulator’s 3-dB EO bandwidth reaches approximately 30 GHz.


Author Information


  • Corresponding Authors
    • Changjian Guo - Guangdong Provincial Key Laboratory of Optical Information Materials and Technology, South China Academy of Advanced Optoelectronics, South China Normal University, Guangzhou 510006, China; Email: changjian.guo@coer-scnu.org
    • Jie Liu - State Key Laboratory of Optoelectronic Materials and Technologies, School of Electronics and Information Technology, Sun Yat-Sen University, Guangzhou 510006, China; ORCID: https://orcid.org/0000-0001-7994-2516; Email: liujie47@mail.sysu.edu.cn
  • Authors
    • Zhenhua Li - State Key Laboratory of Optoelectronic Materials and Technologies, School of Electronics and Information Technology, Sun Yat-Sen University, Guangzhou 510006, China
    • Ranfeng Gan - State Key Laboratory of Optoelectronic Materials and Technologies, School of Electronics and Information Technology, Sun Yat-Sen University, Guangzhou 510006, China
    • Zihao Chen - State Key Laboratory of Optoelectronic Materials and Technologies, School of Electronics and Information Technology, Sun Yat-Sen University, Guangzhou 510006, China
    • Zhaoang Deng - State Key Laboratory of Optoelectronic Materials and Technologies, School of Electronics and Information Technology, Sun Yat-Sen University, Guangzhou 510006, China
    • Ran Gao - School of Information and Electronics, Beijing Institute of Technology, Beijing 100081, China
    • Kaixuan Chen - Guangdong Provincial Key Laboratory of Optical Information Materials and Technology, South China Academy of Advanced Optoelectronics, South China Normal University, Guangzhou 510006, China
    • Yanfeng Zhang - State Key Laboratory of Optoelectronic Materials and Technologies, School of Electronics and Information Technology, Sun Yat-Sen University, Guangzhou 510006, China; Hefei National Laboratory, Hefei 230088, China
    • Liu Liu - State Key Laboratory of Extreme Photonics and Instrumentation, College of Optical Science and Engineering, International Research Center for Advanced Photonics, Zhejiang University, Hangzhou 310058, China; Jiaxing Key Laboratory of Photonic Sensing & Intelligent Imaging, Intelligent Optics & Photonics Research Center, Jiaxing Research Institute, Zhejiang University, Jiaxing 314000, China; ORCID: https://orcid.org/0000-0002-3651-544X
    • Siyuan Yu - State Key Laboratory of Optoelectronic Materials and Technologies, School of Electronics and Information Technology, Sun Yat-Sen University, Guangzhou 510006, China
  • Author Contributions

    Z.L., R.G., and Z.C. contributed equally to this work.

  • Notes
    The authors declare no competing financial interest.

Acknowledgments


This work was supported by the Innovation Program for Quantum Science and Technology (2021ZD0301401), National Natural Science Foundation of China (62335019), Key Technologies Research and Development Program (2019YFA0706300), National Natural Science Foundation of China-Guangdong Joint Fund (U2001601), National Natural Science Foundation of China (62135012), and National Natural Science Foundation of China (61961146003).

References


This article references 51 other publications.

  1. Korte, B. H.; Vygen, J.; Korte, B.; Vygen, J. Combinatorial Optimization; Springer, 2011; Vol. 1.
  2. Lucas, A. Ising formulations of many NP problems. Front. Phys. 2014, 2, 5, DOI: 10.3389/fphy.2014.00005
  3. Terada, K.; Oku, D.; Kanamaru, S.; Tanaka, S.; Hayashi, M.; Yamaoka, M.; Yanagisawa, M.; Togawa, N. An Ising model mapping to solve rectangle packing problem. 2018 International Symposium on VLSI Design, Automation and Test (VLSI-DAT): Hsinchu, Taiwan, China, 2018; pp 1–4.
  4. Mao, Z. T.; Matsuda, Y.; Tamura, R.; Tsuda, K. Chemical design with GPU-based Ising machines. Digital Discovery 2023, 2, 1098–1103, DOI: 10.1039/D3DD00047H
  5. Bohm, F.; Alonso-Urquijo, D.; Verschaffelt, G.; Van der Sande, G. Noise-injected analog Ising machines enable ultrafast statistical sampling and machine learning. Nat. Commun. 2022, 13, 5847, DOI: 10.1038/s41467-022-33441-3
  6. Singh, A. K.; Kapelyan, A.; Venturelli, D.; Jamieson, K. Uplink MIMO Detection using Ising Machines: A Multi-Stage Ising Approach. arXiv 2023, arXiv:2304.12830 (accessed March 3, 2024).
  7. Garey, M. R.; Johnson, D. S. Computers and Intractability: A Guide to the Theory of NP-Completeness; W. H. Freeman & Co., 1979.
  8. Inagaki, T.; Haribara, Y.; Igarashi, K.; Sonobe, T.; Tamate, S.; Honjo, T.; Marandi, A.; McMahon, P. L.; Umeki, T.; Enbutsu, K. A coherent Ising machine for 2000-node optimization problems. Science 2016, 354, 603–606, DOI: 10.1126/science.aah4243
  9. Mohseni, N.; McMahon, P. L.; Byrnes, T. Ising machines as hardware solvers of combinatorial optimization problems. Nat. Rev. Phys. 2022, 4, 363–379, DOI: 10.1038/s42254-022-00440-8
  10. Marandi, A.; Wang, Z.; Takata, K.; Byer, R. L.; Yamamoto, Y. Network of time-multiplexed optical parametric oscillators as a coherent Ising machine. Nat. Photonics 2014, 8, 937–942, DOI: 10.1038/nphoton.2014.249
  11. McMahon, P. L.; Marandi, A.; Haribara, Y.; Hamerly, R.; Langrock, C.; Tamate, S.; Inagaki, T.; Takesue, H.; Utsunomiya, S.; Aihara, K.; Byer, R. L.; Fejer, M. M.; Mabuchi, H.; Yamamoto, Y. A fully programmable 100-spin coherent Ising machine with all-to-all connections. Science 2016, 354, 614–617, DOI: 10.1126/science.aah5178
  12. Honjo, T.; Sonobe, T.; Inaba, K.; Inagaki, T.; Ikuta, T.; Yamada, Y.; Kazama, T.; Enbutsu, K.; Umeki, T.; Kasahara, R.; Kawarabayashi, K. I.; Takesue, H. 100,000-spin coherent Ising machine. Sci. Adv. 2021, 7, eabh0952, DOI: 10.1126/sciadv.abh0952
  13. Cen, Q.; Ding, H.; Hao, T.; Guan, S.; Qin, Z.; Lyu, J.; Li, W.; Zhu, N.; Xu, K.; Dai, Y.; Li, M. Large-scale coherent Ising machine based on optoelectronic parametric oscillator. Light: Sci. Appl. 2022, 11, 333, DOI: 10.1038/s41377-022-01013-1
  14. Pierangeli, D.; Marcucci, G.; Conti, C. Large-Scale Photonic Ising Machine by Spatial Light Modulation. Phys. Rev. Lett. 2019, 122, 213902, DOI: 10.1103/PhysRevLett.122.213902
  15. Bohm, F.; Verschaffelt, G.; Van der Sande, G. A poor man’s coherent Ising machine based on opto-electronic feedback systems for solving optimization problems. Nat. Commun. 2019, 10, 3538, DOI: 10.1038/s41467-019-11484-3
  16. Prabhu, M.; Roques-Carmes, C.; Shen, Y.; Harris, N.; Jing, L.; Carolan, J.; Hamerly, R.; Baehr-Jones, T.; Hochberg, M.; Čeperić, V.; Joannopoulos, J. D.; Englund, D. R.; Soljačić, M. Accelerating recurrent Ising machines in photonic integrated circuits. Optica 2020, 7, 551–558, DOI: 10.1364/OPTICA.386613
  17. Roques-Carmes, C.; Shen, Y.; Zanoci, C.; Prabhu, M.; Atieh, F.; Jing, L.; Dubcek, T.; Mao, C.; Johnson, M. R.; Ceperic, V.; Joannopoulos, J. D.; Englund, D.; Soljacic, M. Heuristic recurrent algorithms for photonic Ising machines. Nat. Commun. 2020, 11, 249, DOI: 10.1038/s41467-019-14096-z
  18. Johnson, M. W.; Amin, M. H. S.; Gildert, S.; Lanting, T.; Hamze, F.; Dickson, N.; Harris, R.; Berkley, A. J.; Johansson, J.; Bunyk, P. Quantum annealing with manufactured spins. Nature 2011, 473, 194–198, DOI: 10.1038/nature10012
  19. Harris, R.; Sato, Y.; Berkley, A. J.; Reis, M.; Altomare, F.; Amin, M. H.; Boothby, K.; Bunyk, P.; Deng, C.; Enderud, C. Phase transitions in a programmable quantum spin glass simulator. Science 2018, 361, 162–165, DOI: 10.1126/science.aat2025
  20. Mandrà, S.; Katzgraber, H. G. A deceptive step towards quantum speedup detection. Quantum Sci. Technol. 2018, 3, 04LT01, DOI: 10.1088/2058-9565/aac8b2
  21. D-Wave Systems Inc. Advantage Data Sheet, 2022; https://www.dwavesys.com/media/htjclcey/advantage_datasheet_v10.pdf (accessed March 3, 2024).
  22. Li, Z. H.; Liu, J.; Yu, S. Y. A Dynamic Time-Evolution Control Method to Improve the Performance of Optoelectronic Coherent Ising Machine. Optical Fiber Communications Conference and Exposition (OFC): San Diego, California, United States, 2021; p Tu1H.4.
  23. Mwamsojo, N.; Lehmann, F.; Merghem, K.; Benkelfat, B. E.; Frignac, Y. Optoelectronic coherent Ising machine for combinatorial optimization problems. Opt. Lett. 2023, 48, 2150–2153, DOI: 10.1364/OL.485215
  24. Cheng, Z. Z.; Tsang, H. K.; Wang, X. M.; Xu, K.; Xu, J. B. In-Plane Optical Absorption and Free Carrier Absorption in Graphene-on-Silicon Waveguides. IEEE J. Sel. Top. Quantum Electron. 2014, 20, 43–48, DOI: 10.1109/JSTQE.2013.2263115
  25. Li, G. H.; Sekine, R.; Nehra, R.; Gray, R. M.; Ledezma, L.; Guo, Q.; Marandi, A. All-optical ultrafast ReLU function for energy-efficient nanophotonic deep learning. Nanophotonics 2023, 12, 847–855, DOI: 10.1515/nanoph-2022-0137
  26. Zuo, Y.; Li, B. H.; Zhao, Y. J.; Jiang, Y.; Chen, Y. C.; Chen, P.; Jo, G. B.; Liu, J. W.; Du, S. W. All-optical neural network with nonlinear activation functions. Optica 2019, 6, 1132–1137, DOI: 10.1364/OPTICA.6.001132
  27. Chen, Z.; Li, Z.; Deng, Z.; Liu, J.; Yu, S. An Optoelectronic Analog Ising Machine Enabling 2048-Spin and Low-Latency Calculations. Optical Fiber Communications Conference and Exposition (OFC): San Diego, California, United States, 2023; p M2J.2.
  28. Wang, Z.; Marandi, A.; Wen, K.; Byer, R. L.; Yamamoto, Y. Coherent Ising machine based on degenerate optical parametric oscillators. Phys. Rev. A 2013, 88, 063853, DOI: 10.1103/PhysRevA.88.063853
  29. LeCun, Y.; Bengio, Y.; Hinton, G. Deep learning. Nature 2015, 521, 436–444, DOI: 10.1038/nature14539
  30. Shen, Y.; Harris, N. C.; Skirlo, S.; Prabhu, M.; Baehr-Jones, T.; Hochberg, M.; Sun, X.; Zhao, S.; Larochelle, H.; Englund, D.; Soljačić, M. Deep learning with coherent nanophotonic circuits. Nat. Photonics 2017, 11, 441–446, DOI: 10.1038/nphoton.2017.93
  31. He, M. B. High-performance hybrid silicon and lithium niobate Mach-Zehnder modulators for 100 Gbit s⁻¹ and beyond. Nat. Photonics 2019, 13, 359–365, DOI: 10.1038/s41566-019-0378-6
  32. Haribara, Y.; Utsunomiya, S.; Yamamoto, Y. Principles and Methods of Quantum Information Technologies. Lecture Notes in Physics. Chapter 12; Springer Japan, 2016; pp 251–262.
  33. Cipra, B. A. The Ising model is NP-complete. SIAM News 2000, 33, 1–3
  34. Kochenberger, G. A.; Hao, J. K.; Lü, Z.; Wang, H. B.; Glover, F. Solving large scale Max Cut problems via tabu search. J. Heuristics 2013, 19, 565–571, DOI: 10.1007/s10732-011-9189-8
  35. Leleu, T.; Yamamoto, Y.; Utsunomiya, S.; Aihara, K. Combinatorial optimization using dynamical phase transitions in driven-dissipative systems. Phys. Rev. E 2017, 95, 022118, DOI: 10.1103/PhysRevE.95.022118
  36. Leleu, T.; Yamamoto, Y.; McMahon, P. L.; Aihara, K. Destabilization of Local Minima in Analog Spin Systems by Correction of Amplitude Heterogeneity. Phys. Rev. Lett. 2019, 122, 040607, DOI: 10.1103/PhysRevLett.122.040607
  37. Xu, M. Y.; Zhu, Y. T.; Pittalà, F.; Tang, J.; He, M. B.; Ng, W. C.; Wang, J. Y.; Ruan, Z. L.; Tang, X. F.; Kuschnerov, M.; Liu, L.; Yu, S. Y.; Zheng, B. F.; Cai, X. L. Dual-polarization thin-film lithium niobate in-phase quadrature modulators for terabit-per-second transmission. Optica 2022, 9, 61–62, DOI: 10.1364/OPTICA.449691
  38. Han, C.; Zheng, Z.; Shu, H.; Jin, M.; Qin, J.; Chen, R.; Tao, Y.; Shen, B.; Bai, B.; Yang, F. Slow-light silicon modulator with 110-GHz bandwidth. Sci. Adv. 2023, 9, eadi5339, DOI: 10.1126/sciadv.adi5339
  39. Maes, D.; Reis, L.; Poelman, S.; Vissers, E.; Avramovic, V.; Zaknoune, M.; Roelkens, G.; Lemey, S.; Peytavit, E.; Kuyken, B. High-Speed Photodiodes on Silicon Nitride with a Bandwidth beyond 100 GHz. Conference on Lasers and Electro-Optics; San Jose, California, United States, 2022; p SM3K.3.
  40. Nagatani, M.; Wakita, H.; Jyo, T.; Takeya, T.; Yamazaki, H.; Ogiso, Y.; Mutoh, M.; Shiratori, Y.; Ida, M.; Hamaoka, F.; Nakamura, M.; Kobayashi, T.; Takahashi, H.; Miyamoto, Y. 110-GHz-Bandwidth InP-HBT AMUX/ADEMUX Circuits for Beyond-1-Tb/s/ch Digital Coherent Optical Transceivers. IEEE Custom Integrated Circuits Conference (CICC); Newport Beach, California, United States, 2022; pp 1–8.
  41. Tan, M.; Xu, J.; Liu, S.; Feng, J.; Zhang, H.; Yao, C.; Chen, S.; Guo, H.; Han, G.; Wen, Z. Co-packaged optics (CPO): status, challenges, and solutions. Front. Optoelectron. 2023, 16, 1, DOI: 10.1007/s12200-022-00055-y
  42. McMahon, P. L. The physics of optical computing. Nat. Rev. Phys. 2023, 5, 717–734, DOI: 10.1038/s42254-023-00645-5
  43. El Srouji, L.; Krishnan, A.; Ravichandran, R.; Lee, Y.; On, M.; Xiao, X.; Ben Yoo, S. J. Photonic and optoelectronic neuromorphic computing. APL Photonics 2022, 7, 051101, DOI: 10.1063/5.0072090
  44. Peserico, N.; Shastri, B. J.; Sorger, V. J. Integrated Photonic Tensor Processing Unit for a Matrix Multiply: A Review. J. Lightwave Technol. 2023, 41, 3704–3716, DOI: 10.1109/JLT.2023.3269957
  45. Yang, L.; Ji, R.; Zhang, L.; Ding, J.; Xu, Q. On-chip CMOS-compatible optical signal processor. Opt. Express 2012, 20, 13560, DOI: 10.1364/OE.20.013560
  46. Tait, A. N.; Nahmias, M. A.; Shastri, B. J.; Prucnal, P. R. Broadcast and Weight: An Integrated Network For Scalable Photonic Spike Processing. J. Lightwave Technol. 2014, 32, 4029–4041, DOI: 10.1109/JLT.2014.2345652
  47. Xu, X.; Tan, M.; Corcoran, B.; Wu, J.; Boes, A.; Nguyen, T. G.; Chu, S. T.; Little, B. E.; Hicks, D. G.; Morandotti, R.; Mitchell, A.; Moss, D. J. 11 TOPS photonic convolutional accelerator for optical neural networks. Nature 2021, 589, 44–51, DOI: 10.1038/s41586-020-03063-0
  48. Feldmann, J.; Youngblood, N.; Karpov, M.; Gehring, H.; Li, X.; Stappers, M.; Le Gallo, M.; Fu, X.; Lukashchuk, A.; Raja, A. S. Parallel convolutional processing using an integrated photonic tensor core. Nature 2021, 589, 52–58, DOI: 10.1038/s41586-020-03070-1
  49. Shi, B.; Calabretta, N.; Stabile, R. Deep Neural Network Through an InP SOA-Based Photonic Integrated Cross-Connect. IEEE J. Sel. Top. Quantum Electron. 2020, 26, 1–11, DOI: 10.1109/JSTQE.2019.2945548
  50. Zhong, Z.; Yang, M.; Lang, J.; Williams, C.; Kronman, L.; Sludds, A.; Esfahanizadeh, H.; Englund, D.; Ghobadi, M. Lightning: A Reconfigurable Photonic-Electronic SmartNIC for Fast and Energy-Efficient Inference. Proceedings of the ACM SIGCOMM 2023 Conference: New York, United States, 2023; pp 452–472.
  51. Hamerly, R.; Bernstein, L.; Sludds, A.; Soljačić, M.; Englund, D. Large-Scale Optical Neural Networks Based on Photoelectric Multiplication. Phys. Rev. X 2019, 9, 021032, DOI: 10.1103/PhysRevX.9.021032

Cited By


This article is cited by 3 publications.

  1. Yuan Gao, Guanyu Chen, Luo Qi, Wujie Fu, Zifeng Yuan, Aaron J. Danner. Photonic Ising machines for combinatorial optimization problems. Applied Physics Reviews 2024, 11 (4) https://doi.org/10.1063/5.0216656
  2. Zhixian Zhou, Zhenhua Li, Zihao Chen, Jie Liu, Siyuan Yu. An Optoelectronic Ising Machine with Low-Cost FPGA for 10,000-Spin High-Accuracy Calculations. 2024, 1-3. https://doi.org/10.1109/ACP/IPOC63121.2024.10809451
  3. Xin Ye, Wenjia Zhang, Zuyuan He. Integrated Spatial Photonic Ising Sampler Based on High-Uniformity 1 × 8 Multi-Mode Interferometer. 2024, 1-4. https://doi.org/10.1109/ACP/IPOC63121.2024.10810070

ACS Photonics

Cite this: ACS Photonics 2024, 11, 4, 1703–1714
https://doi.org/10.1021/acsphotonics.4c00003
Published March 14, 2024
Copyright © 2024 American Chemical Society

Article Views

1323


  • Abstract

    Figure 1. Schematic architecture of the optoelectronic Ising machine with two types of potential operational schemes.

    Figure 2. Microscopy images of (a) a partially enlarged detail of the waveguide crossing, (b) the TFLN chip, and (c) a partially enlarged detail of the bonded PD and terminator. (d) Measured results of Vπ: the linearly scanned 200 kHz sawtooth input waveform (red dashed) and the PD output (or transmission, blue solid). (e) Measured bandwidth (S21 parameter) of the whole device.

    Figure 3. (a) Example of the CSR format of a sparse matrix. (b) Inner architecture of an AMU. (c) Architecture of a SpMV MACC.

    Figure 4. (a) Checkerboard graph. (b) Evolution of spin amplitudes and cut value of the checkerboard graph. (c) Success rate of the checkerboard graph. (d) G22 graph. (e) Evolution of spin amplitudes and cut value of the G22 graph. (f) Success rate of the G22 graph.

    Figure 5. Schematic diagram of the IM with the proposed Scheme II.

    Figure 6. (a) Evolution of spin amplitudes and cut value of the checkerboard graph in Scheme II. (b–f) Snapshots of the spin graph at the 2nd, 20th, 50th, 100th, 200th, 500th, 1000th, and 2000th iterations.

    Figure 7. Schematic diagrams of the architectures utilizing TDM and SDM for MVM computing based on (a) digital electronics (Scheme I) and (b) OE modulation (Scheme II). Inset: TDM and SDM, which jointly contribute to computation throughput. (c) Concept architecture with additionally introduced WDM.

  • Supporting Information

    The Supporting Information is available free of charge at: https://pubs.acs.org/doi/10.1021/acsphotonics.4c00003.

    • Additional details about the simulation of the traveling-wave electrodes, fabrication of the on-chip devices, the multiplication-accumulation core with configurable parallel accumulator and bubble layers, real-value spin and weight calculation based on intensity modulation, and the calculation of energy efficiency (PDF)

