Scalable On-Chip Optoelectronic Ising Machine Utilizing Thin-Film Lithium Niobate Photonics
- Zhenhua Li, State Key Laboratory of Optoelectronic Materials and Technologies, School of Electronics and Information Technology, Sun Yat-Sen University, Guangzhou 510006, China
- Ranfeng Gan, State Key Laboratory of Optoelectronic Materials and Technologies, School of Electronics and Information Technology, Sun Yat-Sen University, Guangzhou 510006, China
- Zihao Chen, State Key Laboratory of Optoelectronic Materials and Technologies, School of Electronics and Information Technology, Sun Yat-Sen University, Guangzhou 510006, China
- Zhaoang Deng, State Key Laboratory of Optoelectronic Materials and Technologies, School of Electronics and Information Technology, Sun Yat-Sen University, Guangzhou 510006, China
- Ran Gao, School of Information and Electronics, Beijing Institute of Technology, Beijing 100081, China
- Kaixuan Chen, Guangdong Provincial Key Laboratory of Optical Information Materials and Technology, South China Academy of Advanced Optoelectronics, South China Normal University, Guangzhou 510006, China
- Changjian Guo* (Email: changjian.guo@coer-scnu.org), Guangdong Provincial Key Laboratory of Optical Information Materials and Technology, South China Academy of Advanced Optoelectronics, South China Normal University, Guangzhou 510006, China
- Yanfeng Zhang, State Key Laboratory of Optoelectronic Materials and Technologies, School of Electronics and Information Technology, Sun Yat-Sen University, Guangzhou 510006, China; Hefei National Laboratory, Hefei 230088, China
- Liu Liu, State Key Laboratory of Extreme Photonics and Instrumentation, College of Optical Science and Engineering, International Research Center for Advanced Photonics, Zhejiang University, Hangzhou 310058, China; Jiaxing Key Laboratory of Photonic Sensing & Intelligent Imaging, Intelligent Optics & Photonics Research Center, Jiaxing Research Institute, Zhejiang University, Jiaxing 314000, China
- Siyuan Yu, State Key Laboratory of Optoelectronic Materials and Technologies, School of Electronics and Information Technology, Sun Yat-Sen University, Guangzhou 510006, China
- Jie Liu* (Email: liujie47@mail.sysu.edu.cn), State Key Laboratory of Optoelectronic Materials and Technologies, School of Electronics and Information Technology, Sun Yat-Sen University, Guangzhou 510006, China
Abstract
The Ising machine (IM) has emerged as a promising tool for tackling nondeterministic polynomial-time hard combinatorial optimization problems in real-world applications. Among various types of IMs, optoelectronic IMs based on electro-optical (EO) modulators stand out as an impressive platform for Ising computations. They offer a simple and stable architecture, with the EO modulator providing a natural inline nonlinear transfer function for the Ising model. However, integrated optoelectronic IMs have not been demonstrated until now, and exploring large-scale computations within the constraints of digital hardware resources remains an open challenge for these systems. In this paper, an integrated optoelectronic IM based on a thin-film lithium niobate (TFLN) photonic chip is presented, in conjunction with a sparse matrix–vector multiplication algorithm embedded in a field-programmable gate array that optimizes hardware resource utilization and minimizes computational latency. This setup allows us to solve multiple types of MAX-CUT problems with up to 2048 spins and achieve a remarkably low iteration latency of 1.78 μs. To further address the constraints posed by digital devices when tackling larger-scale Ising problems, we extend the application of the TFLN chip to yet another new scheme in which the single, compact on-chip modulator concurrently performs operations of linear multiplication and nonlinear transformation. This scheme demonstrates the capability to address large-scale MAX-CUT problems involving up to 16,384 spins, which, to the best of our knowledge, are the largest-scale problems solved on an on-chip IM, highlighting its potential to overcome digital limitations. The TFLN-based optoelectronic IMs provide a compact solution with high scalability for potentially practical applications in addressing complex combinatorial optimization problems.
Copyright © 2024 American Chemical Society
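Introduction
Combinatorial optimization problems (1,2) have a wide range of practical applications, including circuit design, (3) drug discovery, (4) machine learning, (5) and communications. (6) Unfortunately, these problems usually belong to the class of nondeterministic polynomial-time hard (NP-hard) problems, which are difficult to solve at large scale using conventional algorithms on digital computers. (7) New computing architectures are therefore urgently needed to solve NP-hard problems efficiently.
Recently, the Ising machine (IM) has emerged as an efficient solver for NP-hard problems. (8) This development stems from the fact that combinatorial optimization problems can be mapped onto the ground-state search problem of the Ising model, which can be implemented using artificial spin networks in the form of classical and quantum physical systems. (9) Over the past few years, IMs have been explored extensively, and a variety of IM architectures have been developed, including those based on time-multiplexed optical parametric oscillators (OPOs)/optoelectronic parametric oscillators (OEPOs), (8,10−13) spatial light modulators (SLMs), (14) hybrid optoelectronics, (15) chip-scale photonic integrated circuits, (16,17) and superconducting quantum annealers. (18,19)
Table 1 summarizes several recently proposed IMs and their key performance metrics, showing that IMs have distinct advantages in different respects depending on their underlying implementation. SLM-based IMs can realize large-scale all-optical spin coupling, enabling large-scale computations involving up to 75,000 Ising spins. (14) The D-Wave systems, which use quantum annealing (QA), achieve excellent time-to-solution on certain types of hard problems owing to quantum tunneling, (20) particularly in comparison with other classical IM schemes. (9) In addition, IMs employing OPOs/OEPOs show great potential for solving MAX-CUT problems with a considerable number of spins while maintaining low computational latency. (12,13) Unlike earlier IM architectures that rely on discrete bulk modules and devices, integrated schemes such as the Mach–Zehnder interferometer (MZI) mesh (16) have been used to demonstrate linear spin coupling of the Ising model, showing the benefits of improved energy efficiency and computing density.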
Table 1. Key Parameters of Various Ising Machines^a
scheme | graph scale | iteration latency (s/iteration) | energy efficiency | computing density (MAC/s/mm²) |
---|---|---|---|---|
SLM (14) | 75,076 | ∼1 × 10⁻² | | DD |
OPO (12) | 100,512 | 2.47 × 10⁻⁵ | 17 pJ/MAC | DD |
OEPO (13) | 25,600 | 2.44 × 10⁻⁷ | | DD |
OE (15) | 100 | 1 | | DD |
QA (21) | >5000 | depends on annealing time | chip 150 pW + cooler 25 kW | |
MZI (16) | 64 | 3 × 10⁻⁹ | 0.28 pJ/MAC | 3.4 × 10¹⁰ |
OE in this work (predicted) | 16,384 | 1.78 × 10⁻⁶ | 42.6 pJ/MAC (2.4 pJ/MAC) | 2 × 10⁹ (1.4 × 10¹⁰) |
^a SLM: spatial light modulator, QA: quantum annealing, O(OE)PO: optical (optoelectronic) parametric oscillator, MR: micro ring, MZI: Mach–Zehnder interferometer, OE: optoelectronic, MAC: multiply accumulate (including 1 multiplication and 1 accumulation), DD: discrete devices.
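Despite this significant progress, these IM schemes still have drawbacks for practical applications, including high computational latency (SLM-based IMs require tens or hundreds of milliseconds per iteration (14)), low energy efficiency (the D-Wave system requires high cooling power (20,21)), low computing density for IMs based on discrete bulk devices, (12−15) relatively low spin scalability owing to the complexity and imperfection of fabricating large-scale space-multiplexed devices, (16,18) and low stability caused by harsh operating conditions. (12,20,21) Solutions that perform well in all of these respects are therefore urgently needed to accelerate the practical application of IMs.
Hybrid optoelectronic IMs, with their simple and stable architecture, are a promising route toward high computing density through device integration, although reported demonstrations (5,15,22,23) still rely on discrete bulk devices, including electro-optical (EO) modulators, photodetectors, and field-programmable gate arrays (FPGAs) combined with digital-to-analog converters (DACs) and analog-to-digital converters (ADCs). In particular, a high-speed EO modulator provides a natural inline nonlinear transfer function for the Ising model, which avoids additional emulated nonlinearity in the digital domain (16) and eliminates the need for high-power pump sources when implementing optical nonlinearity. (24−26) Meanwhile, the linear matrix–vector multiplication (MVM) required for Ising computation is carried out by digital hardware (e.g., an FPGA). This strategy helps reduce the computational latency in the time-division-multiplexing (TDM) feedback scheme of the IM. (15,27) However, practical applications remain subject to resource constraints, such as the limited availability of digital hardware due to power consumption and cost. These constraints limit the scalability of the MVM and hence of the Ising computation. It is therefore desirable to explore optoelectronic IMs for large-scale computation within constrained digital hardware resources, particularly using integrated on-chip devices.
In this paper, we experimentally demonstrate an optoelectronic IM using a thin-film lithium niobate (TFLN) modulator hybrid-integrated with an InGaAs/InP photodetector. To achieve efficient hardware resource utilization and low computational latency, a sparse matrix–vector multiplication (SpMV) algorithm embedded in an FPGA is adopted to implement feedback signal computation and data scheduling for the photonic integrated devices. The optoelectronic IM successfully solves MAX-CUT problems on 2048-spin graphs with sparse configurations. Notably, each iteration on the sparse graphs takes no more than 1.78 μs. Furthermore, by coupling an optical signal that carries the coupling-matrix information into the TFLN device in place of continuous-wave light, a single compact on-chip TFLN modulator not only provides the natural nonlinear transfer function but also performs the multiplications required for the linear MVM. This supports the demonstration of a large-scale MAX-CUT problem with 16,384 spins, which is, to the best of our knowledge, the largest problem solved on an on-chip Ising machine, and indicates that the computing scale of optoelectronic IMs can be extended further even with limited digital hardware resources. In addition, the computational latency and power consumption of chip-scale optoelectronic IMs could be reduced further by integrating optical modulators with higher speed and lower driving voltage. The measured and predicted parameters of the optoelectronic IM are listed in Table 1 for more specific reference.
In addition to high scalability, low computational latency, and potentially high energy efficiency, optoelectronic IMs that use high-speed EO modulation devices offer a set of distinct advantages well suited to the specific characteristics of Ising computation. Unlike conventional AI computing, in which training and inference are usually separate, Ising computation couples the evolution of the spins with the search for the ground state of the Ising Hamiltonian. (28) In the AI scenario, training can be carried out offline, with no strict requirement on the speed of parameter updates (such as weight adjustments). (29) In contrast, the Ising scenario typically requires real-time updating and feedback of the spin amplitudes to minimize computational latency, and here high-speed EO modulation devices have a clear advantage. Moreover, the coupling edges between spins in the Ising model vary with the specific type of combinatorial optimization problem. (9) The coupling matrices formed by the spin coupling coefficients are therefore diverse, including cases in which the matrix is nonunitary or sparse. This increases the complexity of mapping coupling matrices onto certain on-chip architectures, such as MZI meshes. (30) By comparison, a TDM scheme based on a high-speed on-chip EO modulator offers greater flexibility in mapping these coupling matrices, which in turn makes it compatible with a wide variety of Ising problems. Optoelectronic IMs based on high-speed on-chip modulators therefore offer a promising solution for scalable, compatible, and low-latency Ising computation.
The paper is organized as follows. The section Principles of Optoelectronic IM briefly introduces the optoelectronic IM and presents the two schemes. The section Experiments and Results describes the experimental implementation in four parts: characterization of the TFLN chip, demonstrations of Schemes I and II, and a comparison of the key metrics of the two schemes. Finally, in Conclusions and Discussion, we summarize our work and provide further discussion and outlook on integrated optoelectronic IMs.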
Principles of Optoelectronic IM
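As described previously, (15) by exploiting the nonlinear transfer function of the Mach–Zehnder modulator (MZM) incorporated in the optoelectronic IM scheme (see Figure 1), the temporal evolution of the time-discrete spin $x_i(k)$ in the Ising spin network at iteration step $k$ can be described as

$$x_i(k) = \cos^2\left(f_i(k) - \frac{\pi}{4} + \eta_i(k)\right) - \frac{1}{2} \tag{1}$$

where $\eta_i(k)$ denotes Gaussian white noise and $\cos^2(\cdot)$ follows the EO transfer function of the MZM. (31) The feedback term $f_i$ can be expressed as

$$f_i(k+1) = \alpha x_i(k) + \beta \sum_{j=1}^{N} J_{ij} x_j(k) \tag{2}$$

where $\alpha$ and $\beta$ denote the feedback and coupling strengths in each iteration, respectively. $J_{ij}$ is the coupling coefficient between spins $x_i$ and $x_j$ and is associated with the Ising Hamiltonian $H = -\sum_{i \neq j} J_{ij}\sigma_i\sigma_j$, where $\sigma_i = \mathrm{sign}(x_i) \in \{-1, 1\}$ is the sign of the corresponding spin amplitude. Driven by the Hamiltonian dynamics, after a sufficient number of iterations according to eqs 1 and 2, the configuration of coupled spins converges to states of lower Hamiltonian, ideally until the ground state with the lowest Hamiltonian is reached. (28) For the MAX-CUT problem, the cut value $C$ is determined by the Hamiltonian, with a lower $H$ corresponding to a larger $C$. (7,32) The computation of $f_i(k+1)$ can be mapped to an MVM

$$\mathbf{f}(k+1) = \mathbf{W}\,\mathbf{x}(k) \tag{3}$$

where $\mathbf{f}(k+1)$ and $\mathbf{x}(k)$ are $N \times 1$ vectors whose $i$th elements are $f_i(k+1)$ and $x_i(k)$ in eq 2, respectively. $\mathbf{W} = \alpha\mathbf{I} + \beta\mathbf{J} = [\omega_{ij}]$ is the gain and coupling matrix, where $\mathbf{I}$ is the identity matrix and $\mathbf{J}$ is the $N \times N$ matrix whose element in the $i$th row and $j$th column is $J_{ij}$.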
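The iteration described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation. It assumes the update x_i(k) = cos²(f_i(k) − π/4 + η_i(k)) − 1/2 followed by f(k+1) = W x(k) with W = αI + βJ, and it assumes the usual MAX-CUT mapping J_ij = −1 on every edge of an unweighted graph. The function names, the noise level, and the four-node ring used in the usage note are all hypothetical.

```python
import math
import random


def feedback(W, x):
    """Dense matrix-vector product f(k+1) = W x(k)."""
    return [sum(w_ij * x_j for w_ij, x_j in zip(row, x)) for row in W]


def mzm_transfer(f, noise_std, rng):
    """Apply x_i = cos^2(f_i - pi/4 + eta_i) - 1/2 with Gaussian noise eta_i."""
    return [
        math.cos(f_i - math.pi / 4 + rng.gauss(0.0, noise_std)) ** 2 - 0.5
        for f_i in f
    ]


def cut_value(edges, x):
    """Count unit-weight edges whose endpoint amplitudes have opposite signs."""
    sign = [1 if x_i >= 0 else -1 for x_i in x]
    return sum(1 for i, j in edges if sign[i] != sign[j])


def run_ising(n, edges, alpha, beta, iterations, noise_std=0.01, seed=0):
    """Iterate the spin update and return the final spin amplitudes."""
    rng = random.Random(seed)
    # W = alpha*I + beta*J, with the assumed mapping J_ij = -1 on each edge.
    W = [[alpha if i == j else 0.0 for j in range(n)] for i in range(n)]
    for i, j in edges:
        W[i][j] -= beta
        W[j][i] -= beta
    f = [0.0] * n
    x = [0.0] * n
    for _ in range(iterations):
        x = mzm_transfer(f, noise_std, rng)  # nonlinear step
        f = feedback(W, x)                   # linear step
    return x
```

For example, `run_ising(4, [(0, 1), (1, 2), (2, 3), (3, 0)], alpha=0.5, beta=0.4, iterations=100)` evolves a four-spin ring; `cut_value` applied to the result then reports how many edges are cut. The amplitudes stay within ±0.5 because cos² is bounded between 0 and 1.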
Figure 1. Schematic architecture of the optoelectronic Ising machine with 2 types of potential operational schemes.
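According to the derivation above, the evolution of the Ising spins involves both a linear MVM and a nonlinear transfer function. While the nonlinear transfer function can be realized by the EO transfer function of the MZM, it is essential to map the linear MVM in eq 3 efficiently onto the optoelectronic IM architecture, especially when digital hardware resources are limited. As shown in Figure 1, two schemes are adopted in this work: Scheme I accelerates large-scale MVM through an algorithm designed for the FPGA, whereas Scheme II improves energy efficiency and scalability under constrained digital hardware resources by moving the multiplications onto the OE modulation. Details of the two schemes are given in the sections Demonstration of Scheme I and Demonstration of Scheme II.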
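Experiments and Results
Characterization of the TFLN Chip
As shown in Figure 1, the optoelectronic IM consists mainly of two parts: a commercial FPGA module used together with an ADC and a DAC, and a laboratory-fabricated TFLN folded MZM (FMZM) with an InGaAs/InP photodiode (PD). Before presenting the experimental results obtained with the optoelectronic IM, we first describe and characterize the on-chip devices.
As shown in Figure 2a–c, the chip comprises two grating couplers (GCs) serving as fiber-to-chip interfaces, an FMZM, and an InGaAs/InP PD. The FMZM has a single-drive push–pull configuration and consists of a waveguide crossing, two 3 dB 1 × 2 multimode interference couplers, and a U-shaped coplanar traveling-wave electrode (TWE) terminated with a 50 Ω resistor (see Section S1 of the Supporting Information for details of the TWE simulation). The waveguide crossing is placed in the U-turn section so that both MZI arms accumulate the same phase through the U-turn. The RF input port and the optical input GC of the FMZM are located on the same edge of the chip, which simplifies the optical and electrical interfaces for testing and packaging. A commercial high-speed InGaAs/InP PD, with a responsivity of 0.8 A/W and a 3 dB bandwidth of up to 18 GHz, is flip-chip bonded onto the output GC. Detailed fabrication information for the FMZM is given in Section S2 of the Supporting Information. The total optical loss of the FMZM is ∼10 dB (∼3 dB on-chip insertion loss, with the remainder arising from coupling loss at the GCs). As shown in Figure 2d,e, the measured half-wave voltage Vπ of the whole chip, comprising the FMZM and PD, is 3 V, and its 3 dB bandwidth is 12.5 GHz.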
Figure 2. Microscopy images of (a) an enlarged detail of the waveguide crossing, (b) the TFLN chip, and (c) an enlarged detail of the bonded PD and terminator. (d) Measured results of Vπ: the linearly scanned 200 kHz sawtooth input waveform (red dashed line) and the PD output (or transmission, blue solid line). (e) Measured bandwidth (S21 parameter) of the whole device.
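Demonstration of Scheme I
In this section, we demonstrate Scheme I of the optoelectronic IM depicted in Figure 1. The scheme resembles previous work (15) in that the nonlinear transformation and the linear MVM of the Ising evolution are performed by the optical chip and the electronic FPGA, respectively. To address the particular challenge of low-latency, large-scale computation with limited digital hardware resources (specifically, our demonstration uses one Xilinx KU115 FPGA paired with a DAC and an ADC operating at a sampling rate of 2.6 GS/s), we introduce an efficient SpMV algorithm embedded in the FPGA.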
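Methods of SpMV Based on FPGA
Because the spins in an Ising network are not necessarily densely coupled, the real matrix W in eq 3 is often sparse in many combinatorial optimization problems. An SpMV algorithm is therefore introduced to make full use of the FPGA hardware resources. In this approach, the sparse matrix W is converted into the compressed sparse row (CSR) format. In the example shown in Figure 3a, three vectors, val, col, and ptr, are constructed in the CSR format. The sparsely distributed nonzero elements of W are stored densely, in linear order, in the vector val. The column index of each nonzero element of W is recorded in the vector col, and the index of the first nonzero element of each row is stored in ptr.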
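The CSR layout described above can be illustrated with a small Python sketch. The vector names val, col, and ptr follow the text; the function names and the 3 × 3 example matrix in the usage note are hypothetical, and the extra final entry of ptr (the total number of nonzeros) is the common CSR convention, assumed here rather than taken from Figure 3a.

```python
def dense_to_csr(W):
    """Convert a dense matrix (list of rows) into CSR vectors val, col, ptr."""
    val, col, ptr = [], [], [0]
    for row in W:
        for j, w in enumerate(row):
            if w != 0:
                val.append(w)   # nonzero values, stored densely in row order
                col.append(j)   # column index of each stored value
        ptr.append(len(val))    # start of the next row (end of this one)
    return val, col, ptr


def csr_matvec(val, col, ptr, x):
    """Compute f = W x using only the stored nonzeros."""
    f = []
    for i in range(len(ptr) - 1):
        acc = 0.0
        for p in range(ptr[i], ptr[i + 1]):
            acc += val[p] * x[col[p]]
        f.append(acc)
    return f
```

For the matrix `[[1, 0, 2], [0, 0, 3], [4, 5, 0]]`, `dense_to_csr` gives `val = [1, 2, 3, 4, 5]`, `col = [0, 2, 2, 0, 1]`, and `ptr = [0, 2, 3, 5]`. The work in `csr_matvec` scales with the number of nonzeros rather than with N², which is the property the FPGA design exploits.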
Figure 3. (a) Example for the CSR format of a sparse matrix. (b) Inner architecture of an AMU. (c) Architecture of a SpMV MACC.
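Next, the prepared CSR data are loaded into a computing architecture based on multiply–accumulate cores (MACCs). The elements of val and col that share the same row number are fed into a MACC, which consists of a series of access-and-multiplication units (AMUs) followed by a summation tree (ST), as shown in Figure 3c. As shown in Figure 3b, each AMU multiplies a nonzero element of W, val, by the associated element of the vector x, x_col (see eq 3), and outputs the product. Specifically, the vector element x_col is fetched from block RAM (BRAM) using the index in col and is then multiplied by the corresponding matrix element val in a multiplier. The outputs of all AMUs in the MACC are then accumulated by the ST to obtain the target element of the vector f in eq 3. Multiple MACCs can be deployed simultaneously to compute several elements of f at once, which increases parallelism and reduces latency.
Note that this procedure is feasible only if the maximum number of nonzero elements in any row of W does not exceed the number of AMUs in a single MACC. If this assumption is violated, a configurable parallel accumulator and bubble layers must additionally be involved, which only slightly affects the latency and energy consumption of the scheme; details are given in Section S3 of the Supporting Information. A trade-off among logic resources, number of spins, and computational latency should be made in order to solve large-scale Ising problems with limited hardware resources.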
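As a software analogy of one MACC, the sketch below models the AMUs as independent multiplications and the summation tree as a pairwise reduction. It is only a behavioral model under stated assumptions: the function names, the zero-padding of idle AMUs, and the error raised when a row has more nonzeros than AMUs are illustrative choices, and the parallel accumulator and bubble layers of Section S3 are not modeled.

```python
def amu(w, col_index, x_bram):
    """Access-and-multiplication unit: fetch x[col] and multiply by the weight."""
    return w * x_bram[col_index]


def summation_tree(products):
    """Pairwise (binary-tree) reduction of the AMU outputs."""
    level = list(products)
    if not level:
        return 0.0
    while len(level) > 1:
        if len(level) % 2:
            level.append(0.0)  # pad odd levels so every adder has two inputs
        level = [level[i] + level[i + 1] for i in range(0, len(level), 2)]
    return level[0]


def macc_row(val_row, col_row, x_bram, num_amus):
    """Compute one element of f from the nonzeros of a single row of W."""
    if len(val_row) > num_amus:
        raise ValueError("row has more nonzeros than the MACC has AMUs")
    products = [amu(w, c, x_bram) for w, c in zip(val_row, col_row)]
    products += [0.0] * (num_amus - len(products))  # idle AMUs contribute zero
    return summation_tree(products)
```

Calling `macc_row([1.0, 2.0], [0, 2], [1.0, 2.0, 3.0], num_amus=4)` multiplies 1.0 × x[0] and 2.0 × x[2] and sums the results. In hardware, the AMUs would operate in parallel and the tree depth would grow as log2 of the AMU count.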
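Results of Scheme I
Two MAX-CUT tasks are demonstrated with Scheme I. The first involves a 32 × 64 checkerboard graph (2048 spins in total) whose couplings include all pairs of adjacent spins, as shown in Figure 4a. We note that the checkerboard graph problem is not NP-hard. (33) The purpose here, however, is to demonstrate the hardware-efficient implementation of the linear MVM and the nonlinear transformation with Scheme I. The demonstration comprises 20 independent trials of 2000 iterations each. Each iteration takes about 1.78 μs, corresponding to an iteration rate of about 0.56 MHz. As shown in Figure 4b, in a trial with parameters α = 1.2 and β = 1.0 (see eq 3), the ground-state (GS) energy, with the maximum cut value of 4000 (a success rate of 100%), is reached within 214 iterations. Figure 4c shows a histogram of the number of trials that reach different cut values (expressed as a percentage of the GS value) over the 20 independent trials. All trials reach at least 96.5% of the GS cut value, and better performance may be possible by fine-tuning α and β. This indicates that the optoelectronic IM performs robustly on MAX-CUT problems associated with large-scale lattice graphs.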
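The maximum cut value of 4000 quoted for the 32 × 64 checkerboard graph can be checked with a short Python sketch. It assumes a rectangular lattice with nearest-neighbor edges and open boundaries, which is an inference from the quoted cut value rather than something stated explicitly; the function names are hypothetical.

```python
def grid_edges(rows, cols):
    """Nearest-neighbor edges of a rows x cols lattice with open boundaries."""
    edges = []
    for r in range(rows):
        for c in range(cols):
            i = r * cols + c
            if c + 1 < cols:
                edges.append((i, i + 1))      # horizontal neighbor
            if r + 1 < rows:
                edges.append((i, i + cols))   # vertical neighbor
    return edges


def checkerboard_spins(rows, cols):
    """Alternating +1/-1 pattern, the ground state of a bipartite lattice."""
    return [1 if (r + c) % 2 == 0 else -1
            for r in range(rows) for c in range(cols)]


def cut_value(edges, spins):
    """Number of edges that join spins of opposite sign."""
    return sum(1 for i, j in edges if spins[i] != spins[j])
```

With these assumptions, `len(grid_edges(32, 64))` is 31 × 64 + 32 × 63 = 4000, and the alternating pattern cuts every edge. The same construction for a 128 × 128 lattice gives 32,512 edges; counting two off-diagonal entries per edge plus one diagonal entry per spin yields 2 × 32,512 + 16,384 = 81,408, which matches the Nnz listed for Scheme II in Table 2.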
Figure 4. (a) Checkerboard graph. (b) Evolution of spin amplitudes and cut value of the checkerboard graph. (c) Success rate of the checkerboard graph. (d) G22 graph. (e) Evolution of spin amplitudes and cut value of the G22 graph. (f) Success rate of the G22 graph.
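To further demonstrate the capability of Scheme I on NP-hard problems, we investigate a second MAX-CUT task on the G22 graph, a recognized NP-hard benchmark. (34) Figure 4d depicts the G22 graph, which consists of 2000 spins connected by 19,990 randomly and sparsely distributed edges (41,980 nonzero elements in the matrix W). As in the checkerboard demonstration, 20 independent trials of 2000 iterations each were carried out, with each iteration taking about 1.78 μs. Here the parameters are set to α = 0.4 and β = 0.8. In the temporal evolution shown in Figure 4e, the spin states eventually converge to a local minimum of the Hamiltonian, giving a best cut value of 13,052, which is approximately 97.6% of the best-known (BK) value of 13,359. (34) Figure 4f shows that the cut values of all 20 trials exceed 96.6% of the BK value. Notably, as shown in Figure 4e, the spin amplitudes remain inhomogeneous even after 900 iterations, when the cut value is close to its maximum. Such inhomogeneity tends to trap the spin states in local optima, (35) which can be mitigated by optimization methods such as dynamically controlling the parameters (22) or adding auxiliary error variables. (36)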
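Demonstration of Scheme II
In Scheme I of the optoelectronic IM, the linear matrix multiplication in eq 3 is performed entirely within the FPGA. Although the SpMV algorithm uses FPGA resources efficiently and reduces computational latency, further scaling of the IM may be bottlenecked by the available FPGA hardware resources, particularly for very large Ising spin-coupling networks (e.g., more than 10,000 spins). Application-specific integrated circuits (ASICs) or more advanced digital modules may improve performance to some extent, but they remain limited by advances in on-chip component density and by the pace dictated by Moore's law. It is therefore imperative to explore alternative ways of further improving the scalability of optoelectronic IMs, even under limited electronic hardware resources. Scheme II, implemented on the optoelectronic IM shown in Figures 1 and 5, proposes a novel architecture designed to overcome these limitations.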
Figure 5. Schematic diagram for IM with proposed Scheme II.
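Methods of SpMV by Modulations
The SpMV algorithm introduced in the section Demonstration of Scheme I is also used in Scheme II, but the multiplications of the MVM operation are strategically relocated from the FPGA to the on-chip FMZM. As shown in Figure 5, the nonzero elements of the N × N sparse matrix W (see eq 3) are extracted in row order and arranged into an Nnz-dimensional vector W̅, where Nnz denotes the total number of nonzero elements in W. Meanwhile, the elements of the vector f(k) (see eq 1) that correspond to the column indices of the nonzero elements of W are selected and organized into an Nnz-dimensional vector, denoted here as f̃(k).
The optical signal is serially intensity-modulated by the elements of W̅ and coupled into the on-chip FMZM. At the same time, the vector f̃(k) is serially encoded to drive the on-chip FMZM, which, owing to its nonlinear (sinusoidal) EO modulation transfer function, yields an optical intensity signal vector, denoted here as x̃(k) (see eq 1). By carefully aligning the corresponding elements of W̅ and f̃(k) in the time domain, the PD extracts the element-wise product W̅ × x̃(k) (where × denotes the Hadamard product). The accumulation of the MVM operation is then performed in the digital domain using the FPGA, finally producing the desired vector f(k+1) for the N spins. Specifically, the elements of W̅ × x̃(k) are divided into N segments according to the row indices of the corresponding matrix elements of W and are then accumulated separately in the digital processor to give f_i(k+1), i = 1, 2, ···, N. Scheme II thus uses a TDM strategy that processes the element-wise multiplications one after another, so the latency increases linearly with the number of nonzero elements.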
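The data flow of Scheme II can be sketched in Python as follows. This is an idealized, noise-free model with signed values, not the experimental signal chain: the transfer function cos²(f − π/4) − 1/2 is assumed for the modulator, each list element stands for one time slot, and the function names are hypothetical. The handling of signs with non-negative optical intensities is not modeled here.

```python
import math


def flatten_rows(W):
    """Row-order list of (row, col, weight) for the nonzero elements of W."""
    return [(i, j, w)
            for i, row in enumerate(W)
            for j, w in enumerate(row) if w != 0]


def mzm(f):
    """Idealized modulator transfer function (quadrature bias assumed)."""
    return math.cos(f - math.pi / 4) ** 2 - 0.5


def scheme2_iteration(W, f):
    """One feedback computation f(k) -> f(k+1) using time-serialized products."""
    nonzeros = flatten_rows(W)
    w_bar = [w for _, _, w in nonzeros]        # weights carried by the light
    f_tilde = [f[j] for _, j, _ in nonzeros]   # drive values gathered by column
    x_tilde = [mzm(v) for v in f_tilde]        # nonlinear step in the modulator
    products = [w * x for w, x in zip(w_bar, x_tilde)]  # one product per slot
    f_next = [0.0] * len(W)
    for (i, _, _), p in zip(nonzeros, products):
        f_next[i] += p                         # digital accumulation by row
    return f_next
```

The result equals the dense product W x(k) with x(k) obtained by applying `mzm` to each element of f(k). The number of time slots equals the number of nonzeros, which is why the latency of this scheme grows linearly with Nnz.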
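Note that both the spin states and the elements of the coupling matrix are encoded as optical intensities, that is, as non-negative real numbers. To handle both positive and negative real-valued spins and weights in the Ising model, opposite copies of the spin and weight data must also be encoded on the light. Further details of this technique are given in Section S4 of the Supporting Information.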
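One generic way to obtain signed products from non-negative quantities is differential encoding, sketched below in Python. This is an assumption made for illustration and may differ from the technique in Section S4: each signed value is split into a non-negative positive part and a non-negative negative part, the four non-negative products are formed, and the signed result is recovered by subtraction. The function names are hypothetical.

```python
def split_signed(v):
    """Return non-negative (plus, minus) parts such that v = plus - minus."""
    return (v, 0.0) if v >= 0 else (0.0, -v)


def signed_product_from_intensities(w, x):
    """Recover w * x using only products of non-negative numbers."""
    w_plus, w_minus = split_signed(w)
    x_plus, x_minus = split_signed(x)
    same_sign = w_plus * x_plus + w_minus * x_minus   # contributes positively
    opposite = w_plus * x_minus + w_minus * x_plus    # contributes negatively
    return same_sign - opposite
```

Every multiplication inside `signed_product_from_intensities` involves only non-negative operands, as an intensity-based multiplier requires; the subtraction would be carried out after detection.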
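Experimental Setup of Scheme II
Because the DAC used in the experiment supports only a limited number of channels, the real-time processing of the FPGA is replaced by offline processing. As shown schematically in Figure 5, this alternative uses a multichannel arbitrary waveform generator (AWG, Keysight M8195, 8-bit resolution) and a real-time oscilloscope (RTO, Tektronix DSA73304D, 8-bit resolution). A 6.25 Gb/s intensity-modulated optical signal, serially encoded with the elements of the weight vector W̅ generated by one channel of the AWG, is coupled into the on-chip FMZM. Meanwhile, the FMZM is biased at the quadrature point and driven by a 6.25 Gb/s electrical signal, generated by another AWG channel, that is encoded with the elements of the gathered feedback vector f̃(k). The two signals are aligned in the time domain by adjusting software-defined channel delays. When driven appropriately (with the peak-to-peak amplitude of the electrical signal close to Vπ), the FMZM not only applies the proper nonlinear transformation to the feedback vector but also produces the Hadamard product of the weight vector and the transformed spin vector. The optical signal is then detected by the on-chip PD. Finally, the resulting electrical signal is amplified by an electronic amplifier and sampled by the RTO, and the accumulation is carried out offline on a computer (Intel Core i7-8700) to generate the desired vector f(k+1).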
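Results of Scheme II
The feasibility of Scheme II of the optoelectronic IM for large-scale problems is demonstrated on a 128 × 128 checkerboard graph containing 16,384 spins. In the temporal evolution shown in Figure 6a, the spin amplitudes gradually saturate during the first 100 iterations while the cut value C increases rapidly. In the subsequent iterations, from the 100th to the 2000th, spins flip before reaching saturation, and the cut value C gradually approaches that of the ground state.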
Figure 6. (a) Evolution of spin amplitudes and cut value of the checkerboard graph in Scheme II. (b–f) Snapshots of the spin graph at the 2nd, 20th, 50th, 100th, 200th, 500th, 1000th, and 2000th iterations.
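Figure 6b–f shows snapshots of the evolving graph at the 2nd, 20th, 50th, 100th, 200th, 500th, 1000th, and 2000th iterations. During the first 100 iterations, spin "domains" form on the checkerboard graph. In the subsequent convergence stage, spins flip so that most domains shrink, and a stable global state is eventually reached. This convergence toward the ground state is strong evidence that Scheme II of the optoelectronic IM applies gain and coupling to the spins and guides the evolution toward the desired ground state.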
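Comparison of Key Metrics of Schemes I and II
To evaluate the feasibility of Schemes I and II of the optoelectronic IM comprehensively, Table 2 compares the key metrics of the two schemes, namely latency and energy efficiency. These metrics are crucial because they are tied to the operations and processes whose latency and energy consumption grow with the number of Ising spins N or the number of nonzeros Nnz.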
Table 2. Comparison of the Proposed Schemes^a
scheme | electrical processing devices | data baud rate | time complexity | scalability | transmission latency (s/iteration) | operation latency (s/iteration) | energy efficiency |
---|---|---|---|---|---|---|---|
Scheme I | FPGA (Xilinx KU115) and ADC/DAC (2.6 GS/s) | 2.6 GBaud | | N = 2000, Nnz = 41,980 | ∼0.5 × 10⁻⁶ | ∼1.28 × 10⁻⁶ | 51.9 pJ/MAC (L) + 35 pJ/symbol (NL) |
Scheme II | PC (Intel Core i7-8700) and AWG, OSC | 6.25 GBaud | | N = 16,384, Nnz = 81,408 | ∼0.4 | ∼370 × 10⁻⁶ | 42.6 pJ/MAC (total) |
Scheme II (predicted) | FPGA | 100 GBaud | | N > 16,384, Nnz > 81,408 | ∼0.5 × 10⁻⁶ | ∼2.14 × 10⁻⁶ | 2.4 pJ/MAC (total) |
^a AWG: arbitrary waveform generator, OSC: oscilloscope, N: number of spins, Nnz: number of nonzeros, L/NL: linear/nonlinear operation.
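Latency
The latency of the optoelectronic IM in the demonstrations can be divided into two parts: transmission latency and operation latency.
Transmission latency is the time needed to transfer data between the constituent units of the optoelectronic IM. It can be measured from the communication protocol between the FPGA and the ADC/DAC, or with the profiler of the MATLAB software used for remote processing.
Operation latency concerns mainly the time required for the MVM operation in the Ising computation, because in both demonstrations the nonlinear transformation is realized inline by the on-chip FMZM. The FPGA computation latency can be obtained from the FPGA design tool Vivado, and the modulation latency can be estimated from the number of spins (N in Table 2, Scheme I) or the number of nonzero elements (Nnz in Table 2, Scheme II) and the baud rate B of the ADC/DAC and modulation, that is, N/B or Nnz/B. For Scheme I, the total operation latency is the sum of the FPGA computation latency of the MVM operation and the modulation latency of the nonlinear transformation. In contrast, Scheme II involves only the FPGA computation latency of the accumulation and the modulation latency of the multiplication and nonlinear transformation.
As listed in Table 2, in the Scheme I demonstration with 2048 spins, a latency of 1.78 μs per iteration is achieved, of which data transmission accounts for ∼0.5 μs and the remaining ∼1.28 μs is spent on the linear MVM operation. This low latency is attributed to the efficient SpMV algorithm, the highly parallel computing units, and effective pipelining for data scheduling on the FPGA.
Scheme II has a large transmission latency, mainly because of the long time needed for remote data communication among the AWG, RTO, and computer (see Figure 5 for the setup). In an implementation of Scheme II, the transmission latency could be reduced considerably by introducing multichannel DACs/ADCs combined with an FPGA for real-time processing. As listed in the third row of Table 2, with a Xilinx KU115 FPGA combined with multichannel DACs and ADCs operating at a 2.6 GS/s sampling rate, as in the Scheme I demonstration, the transmission latency of Scheme II is expected to be ∼0.5 μs.
In the Scheme II demonstration, the operation latency for the 16,384-spin checkerboard graph is ∼370 μs. If optical modulators, (37,38) PDs, (39) and ADCs/DACs (40) supporting 100 GBaud signals were used in Scheme II (see the third row of Table 2), this value is expected to fall to ∼2.14 μs.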
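The latency bookkeeping above reduces to simple arithmetic, sketched here in Python with the values from Table 2. The function names are hypothetical. `serial_symbol_time` assumes that each serialized element occupies one symbol period, so it gives only the symbol-streaming part of the modulation latency; the reported operation latencies also include other contributions, such as FPGA computation or offline processing.

```python
def iteration_latency(transmission_s, operation_s):
    """Total latency per iteration as the sum of the two components."""
    return transmission_s + operation_s


def iteration_rate_hz(latency_s):
    """Iterations per second for a given per-iteration latency."""
    return 1.0 / latency_s


def serial_symbol_time(num_symbols, baud_rate):
    """Time to stream num_symbols at baud_rate, one symbol per period."""
    return num_symbols / baud_rate


# Scheme I: ~0.5 us transmission + ~1.28 us operation (Table 2).
scheme1_total = iteration_latency(0.5e-6, 1.28e-6)
scheme1_rate = iteration_rate_hz(scheme1_total)

# Scheme II (predicted): ~0.5 us transmission + ~2.14 us operation (Table 2).
scheme2_predicted_total = iteration_latency(0.5e-6, 2.14e-6)
```

Here `scheme1_total` reproduces the 1.78 μs per iteration quoted for Scheme I, and `scheme1_rate` corresponds to the iteration rate of about 0.56 MHz.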
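Energy Efficiency
Energy efficiency, that is, the energy cost per operation, is also evaluated in this section. The main focus is the energy consumed by the linear MVM and the nonlinear transformation in the Ising computation.
For Scheme I, as described in the section Demonstration of Scheme I, the linear MVM (counted in operations that each comprise one multiplication and one accumulation, MAC) is implemented on the FPGA with an energy efficiency of 51.9 pJ/MAC. The nonlinear transformation required by the Ising computation is performed by the on-chip FMZM with an energy efficiency of 35 pJ/symbol. More detailed calculations are given in Section S5 of the Supporting Information.
In Scheme II, the multiplication and nonlinear transformation are performed simultaneously by the on-chip FMZM, which operates at a voltage of 3 V, with an energy efficiency of 14.4 pJ/OP. The accumulation is then performed on the FPGA with an energy efficiency of 28.2 pJ/OP. The total energy efficiency of Scheme II is therefore 42.6 pJ/MAC. By incorporating advanced on-chip MZMs that operate at bandwidths of up to 100 GHz and have a low Vπ of less than 1 V, (17,37) the energy efficiency of Scheme II is expected to improve markedly, to as low as 2.4 pJ/MAC. Detailed calculations are given in Section S5 of the Supporting Information.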
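The Scheme II energy figures combine as shown in the following Python sketch. The per-operation values are those quoted above; the per-iteration energy additionally assumes one MAC per nonzero element of W, which is an illustrative assumption rather than a figure from the paper. The function names are hypothetical.

```python
def scheme2_energy_per_mac_pj(modulator_pj_per_op, accumulate_pj_per_op):
    """One MAC = one multiplication (modulator) + one accumulation (FPGA)."""
    return modulator_pj_per_op + accumulate_pj_per_op


def energy_per_iteration_j(num_nonzeros, pj_per_mac):
    """Energy per iteration, assuming one MAC per nonzero element of W."""
    return num_nonzeros * pj_per_mac * 1e-12


total_pj = scheme2_energy_per_mac_pj(14.4, 28.2)
iteration_energy_j = energy_per_iteration_j(81408, total_pj)
```

Under these assumptions, `total_pj` reproduces the 42.6 pJ/MAC quoted for Scheme II, and `iteration_energy_j` comes to a few microjoules per iteration for the 16,384-spin checkerboard graph.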
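Conclusions and Discussion
In this paper, we have experimentally demonstrated two optoelectronic IM schemes using an on-chip TFLN modulator hybrid-integrated with an InGaAs/InP PD. In Scheme I, the on-chip TFLN modulator performs the nonlinear transformation of the Ising evolution, while the linear MVM in the feedback signal computation is carried out by an electronic FPGA. An SpMV algorithm embedded in the FPGA is introduced to use the limited hardware resources efficiently and to achieve low-latency, large-scale Ising computation. MAX-CUT tasks on graphs with 2048 spins were solved with a computational latency of only 1.78 μs per iteration. To improve scalability under digital hardware constraints, Scheme II is introduced, in which a single on-chip TFLN modulator not only provides the natural nonlinear transfer function but also performs the multiplications of the MVM operation. The latter is essential for relieving the bottleneck of available FPGA hardware resources, especially in large-scale computations. Based on Scheme II, a large-scale MAX-CUT task with 16,384 spins was demonstrated. To the best of our knowledge, this is the largest problem solved to date on an on-chip integrated optoelectronic IM, which highlights the potential for scaling up optoelectronic IMs even with limited digital resources.
In the present work, only a simple set of TFLN modulator and PD is integrated on the chip. Looking ahead, the ASIC, DAC, and ADC (for which the FPGA, AWG, and oscilloscope served as substitutes in the present demonstrations), as well as the light source, could be integrated or copackaged on the same chip using copackaged optics technology, (41) for a smaller footprint, higher bandwidth, and lower energy consumption. In both Scheme I and Scheme II, EO modulation seamlessly combines the nonlinear transformation with EO signal conversion and multiplication, so that no additional computation time or energy is spent on the nonlinear transformation. This contrasts with all-optical on-chip nonlinearity, (24,25) which, despite its low latency and energy consumption, requires additional on-chip photonic components and pulsed pump sources. EO modulation is therefore preferred in these schemes.
Although TDM schemes based on high-speed EO modulation devices have shown high scalability, low computational latency, compatibility with diverse problem types, and the potential for improved computing efficiency, multidimensional multiplexing strategies could be introduced in future work. These strategies may combine TDM with space-division multiplexing (SDM) and wavelength-division multiplexing (WDM). Such combined schemes aim to exploit the strengths of the different strategies and thereby substantially improve the performance of optoelectronic IMs.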
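Strategy of TDM Combined with SDM
The typical dimensions of photonic devices are comparable to the optical wavelength, ranging from hundreds of nanometers to several micrometers, which limits the density of integrated photonic components. This density cannot match that of electronic components, whose typical dimensions are much smaller, usually on the order of a few nanometers. (42) IMs based on integrated photonics therefore have an inherent limit on spatial scalability. The challenge becomes more severe when the fabrication and implementation complexity of large-scale SDM photonic devices is taken into account.
Nevertheless, high-speed optoelectronic devices enable large-scale optical computing with high throughput, even when the number of spatially integrated photonic/optoelectronic components is far lower than that of electronic components. As shown in Figure 7b, based on the Scheme II introduced in the section Demonstration of Scheme II, deploying four parallel sets of high-speed MZMs (37,38) and PDs (39) (e.g., with 100 GHz bandwidth) can provide a competitive computing speed (e.g., 400 × 10⁹ multiplications per second) for large-scale MVM in Ising computation. This contrasts with the electronic scheme depicted in Figure 7a, which uses large-scale SDM components at a relatively low clock rate. In addition, the low propagation loss of optical waveguides between individual photonic components helps improve the energy efficiency of large-scale IMs based on integrated photonics.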
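The throughput argument can be made concrete with a small Python sketch: the aggregate multiplication rate is the product of the number of parallel lanes (SDM) and the symbol rate per lane (TDM). The sketch assumes one multiplication per symbol per lane and an even split of the nonzero elements across lanes; both are simplifying assumptions, and the function names are hypothetical.

```python
import math


def aggregate_rate(num_lanes, baud_rate):
    """Multiplications per second with one multiplication per symbol per lane."""
    return num_lanes * baud_rate


def streaming_time(num_nonzeros, num_lanes, baud_rate):
    """Time to stream all products when nonzeros are split evenly across lanes."""
    symbols_per_lane = math.ceil(num_nonzeros / num_lanes)
    return symbols_per_lane / baud_rate
```

For example, `aggregate_rate(4, 100e9)` gives the 400 × 10⁹ multiplications per second mentioned above for four lanes at 100 GBaud, and `streaming_time` shows how adding lanes shortens the time needed for a fixed number of nonzero elements.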
Figure 7. Schematic diagrams of the architectures utilizing TDM and SDM for MVM computing based on (a) digital electronics (Scheme I) and (b) OE modulation (Scheme II). Inset: TDM and SDM, which jointly contribute to computation throughput. (c) Concept architecture with additionally introduced WDM.
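Strategy of TDM Combined with WDM and SDM
In principle, the WDM strategy introduces an additional computing dimension. As shown in Figure 7c, by injecting optical carriers at different wavelengths with different weights (e.g., elements of the matrix W in eq 3) into the on-chip TDM IM shown in Figure 1, the linear and nonlinear operations of the Ising model can be realized. Extending the IM architecture in the spatial dimension can increase the computing scale further.
In practice, however, reconfigurable on-chip spectral modulators that can accommodate multiwavelength loads tend to occupy considerable on-chip space. (43,44) In this context, the wavelength dimension is therefore not fully independent of the spatial dimension, and the WDM strategy faces a challenge similar to that of the SDM strategy, namely limited scalability.
Nevertheless, the WDM strategy offers an attractive way to implement on-chip fan-in and fan-out of optical signals for Ising computation. Direct detection of multiple wavelengths by a PD provides robust, phase-insensitive intensity accumulation of multiple signals. (45−50) With high-sensitivity PD detection, the accumulation in principle requires no additional energy, owing to the high photon-to-electron conversion efficiency, (51) which saves a considerable portion of the power consumed by digital accumulation, as discussed in the section Energy Efficiency. Furthermore, fanning out different wavelengths to multiple computing cores through WDM filters can achieve a relatively low insertion loss compared with schemes based on power splitting.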
Supporting Information
The Supporting Information is available free of charge at: https://pubs.acs.org/doi/10.1021/acsphotonics.4c00003.
Additional details about simulation of the travel-wave electrodes, fabrication of the on-chip devices, multiplication-accumulation core with configurable parallel accumulator and bubble layers, real-value spin and weight calculation based on intensity modulation, and calculation of energy efficiency (PDF)
Acknowledgments
This work was supported by the Innovation Program for Quantum Science and Technology (2021ZD0301401), National Natural Science Foundation of China (62335019), Key Technologies Research and Development Program (2019YFA0706300), National Natural Science Foundation of China-Guangdong Joint Fund (U2001601), National Natural Science Foundation of China (62135012), and National Natural Science Foundation of China (61961146003).
References
This article references 51 other publications.
- 1. Korte, B. H.; Vygen, J.; Korte, B.; Vygen, J. Combinatorial Optimization; Springer, 2011; Vol. 1.
- 2. Lucas, A. Ising formulations of many NP problems. Front. Phys. 2014, 2, 5, DOI: 10.3389/fphy.2014.00005
- 3. Terada, K.; Oku, D.; Kanamaru, S.; Tanaka, S.; Hayashi, M.; Yamaoka, M.; Yanagisawa, M.; Togawa, N. An Ising model mapping to solve rectangle packing problem. 2018 International Symposium on VLSI Design, Automation and Test (VLSI-DAT): Hsinchu, Taiwan, China, 2018; pp 1–4.
- 4. Mao, Z. T.; Matsuda, Y.; Tamura, R.; Tsuda, K. Chemical design with GPU-based Ising machines. Digital Discovery 2023, 2, 1098–1103, DOI: 10.1039/D3DD00047H
- 5. Bohm, F.; Alonso-Urquijo, D.; Verschaffelt, G.; Van der Sande, G. Noise-injected analog Ising machines enable ultrafast statistical sampling and machine learning. Nat. Commun. 2022, 13, 5847, DOI: 10.1038/s41467-022-33441-3
- 6. Singh, A. K.; Kapelyan, A.; Venturelli, D.; Jamieson, K. Uplink MIMO Detection using Ising Machines: A Multi-Stage Ising Approach. arXiv 2023, arXiv:2304.12830 (accessed March 3, 2024).
- 7. Garey, M. R.; Johnson, D. S. Computers and Intractability: A Guide to the Theory of NP-Completeness; W. H. Freeman & Co., 1979.
- 8. Inagaki, T.; Haribara, Y.; Igarashi, K.; Sonobe, T.; Tamate, S.; Honjo, T.; Marandi, A.; McMahon, P. L.; Umeki, T.; Enbutsu, K. A coherent Ising machine for 2000-node optimization problems. Science 2016, 354, 603–606, DOI: 10.1126/science.aah4243
- 9. Mohseni, N.; McMahon, P. L.; Byrnes, T. Ising machines as hardware solvers of combinatorial optimization problems. Nat. Rev. Phys. 2022, 4, 363–379, DOI: 10.1038/s42254-022-00440-8
- 10. Marandi, A.; Wang, Z.; Takata, K.; Byer, R. L.; Yamamoto, Y. Network of time-multiplexed optical parametric oscillators as a coherent Ising machine. Nat. Photonics 2014, 8, 937–942, DOI: 10.1038/nphoton.2014.249
- 11. McMahon, P. L.; Marandi, A.; Haribara, Y.; Hamerly, R.; Langrock, C.; Tamate, S.; Inagaki, T.; Takesue, H.; Utsunomiya, S.; Aihara, K.; Byer, R. L.; Fejer, M. M.; Mabuchi, H.; Yamamoto, Y. A fully programmable 100-spin coherent Ising machine with all-to-all connections. Science 2016, 354, 614–617, DOI: 10.1126/science.aah5178
- 12. Honjo, T.; Sonobe, T.; Inaba, K.; Inagaki, T.; Ikuta, T.; Yamada, Y.; Kazama, T.; Enbutsu, K.; Umeki, T.; Kasahara, R.; Kawarabayashi, K. I.; Takesue, H. 100,000-spin coherent Ising machine. Sci. Adv. 2021, 7, eabh0952, DOI: 10.1126/sciadv.abh0952
- 13. Cen, Q.; Ding, H.; Hao, T.; Guan, S.; Qin, Z.; Lyu, J.; Li, W.; Zhu, N.; Xu, K.; Dai, Y.; Li, M. Large-scale coherent Ising machine based on optoelectronic parametric oscillator. Light: Sci. Appl. 2022, 11, 333, DOI: 10.1038/s41377-022-01013-1
- 14. Pierangeli, D.; Marcucci, G.; Conti, C. Large-Scale Photonic Ising Machine by Spatial Light Modulation. Phys. Rev. Lett. 2019, 122, 213902, DOI: 10.1103/PhysRevLett.122.213902
- 15. Bohm, F.; Verschaffelt, G.; Van der Sande, G. A poor man’s coherent Ising machine based on opto-electronic feedback systems for solving optimization problems. Nat. Commun. 2019, 10, 3538, DOI: 10.1038/s41467-019-11484-3
- 16. Prabhu, M.; Roques-Carmes, C.; Shen, Y.; Harris, N.; Jing, L.; Carolan, J.; Hamerly, R.; Baehr-Jones, T.; Hochberg, M.; Čeperić, V.; Joannopoulos, J. D.; Englund, D. R.; Soljačić, M. Accelerating recurrent Ising machines in photonic integrated circuits. Optica 2020, 7, 551–558, DOI: 10.1364/OPTICA.386613
- 17. Roques-Carmes, C.; Shen, Y.; Zanoci, C.; Prabhu, M.; Atieh, F.; Jing, L.; Dubcek, T.; Mao, C.; Johnson, M. R.; Ceperic, V.; Joannopoulos, J. D.; Englund, D.; Soljacic, M. Heuristic recurrent algorithms for photonic Ising machines. Nat. Commun. 2020, 11, 249, DOI: 10.1038/s41467-019-14096-z
Our work suggests speedups in heuristic methods via photonic implementations of the PRIS.
- 18Johnson, M. W.; Amin, M. H. S.; Gildert, S.; Lanting, T.; Hamze, F.; Dickson, N.; Harris, R.; Berkley, A. J.; Johansson, J.; Bunyk, P. Quantum annealing with manufactured spins. Nature 2011, 473, 194– 198, DOI: 10.1038/nature10012Google Scholar18https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADC%252BC3MXlvFGmurw%253D&md5=b4073f82731cf0080ef71f85148281acQuantum annealing with manufactured spinsJohnson, M. W.; Amin, M. H. S.; Gildert, S.; Lanting, T.; Hamze, F.; Dickson, N.; Harris, R.; Berkley, A. J.; Johansson, J.; Bunyk, P.; Chapple, E. M.; Enderud, C.; Hilton, J. P.; Karimi, K.; Ladizinsky, E.; Ladizinsky, N.; Oh, T.; Perminov, I.; Rich, C.; Thom, M. C.; Tolkacheva, E.; Truncik, C. J. S.; Uchaikin, S.; Wang, J.; Wilson, B.; Rose, G.Nature (London, United Kingdom) (2011), 473 (7346), 194-198CODEN: NATUAS; ISSN:0028-0836. (Nature Publishing Group)Many interesting but practically intractable problems can be reduced to that of finding the ground state of a system of interacting spins; however, finding such a ground state remains computationally difficult. It is believed that the ground state of some naturally occurring spin systems can be effectively attained through a process called quantum annealing. If it could be harnessed, quantum annealing might improve on known methods for solving certain types of problem. However, phys. investigation of quantum annealing has been largely confined to microscopic spins in condensed-matter systems. Here we use quantum annealing to find the ground state of an artificial Ising spin system comprising an array of eight superconducting flux quantum bits with programmable spin-spin couplings. We observe a clear signature of quantum annealing, distinguishable from classical thermal annealing through the temp. dependence of the time at which the system dynamics freezes. 
Our implementation can be configured in situ to realize a wide variety of different spin networks, each of which can be monitored as it moves towards a low-energy configuration. This programmable artificial spin network bridges the gap between the theor. study of ideal isolated spin networks and the exptl. investigation of bulk magnetic samples. Moreover, with an increased no. of spins, such a system may provide a practical phys. means to implement a quantum algorithm, possibly allowing more-effective approaches to solving certain classes of hard combinatorial optimization problems.
- 19Harris, R.; Sato, Y.; Berkley, A. J.; Reis, M.; Altomare, F.; Amin, M. H.; Boothby, K.; Bunyk, P.; Deng, C.; Enderud, C. Phase transitions in a programmable quantum spin glass simulator. Science 2018, 361, 162– 165, DOI: 10.1126/science.aat2025Google Scholar19https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADC%252BC1cXhtlahsbjK&md5=59a1879e8e72e04ea46650419416b588Phase transitions in a programmable quantum spin glass simulatorHarris, R.; Sato, Y.; Berkley, A. J.; Reis, M.; Altomare, F.; Amin, M. H.; Boothby, K.; Bunyk, P.; Deng, C.; Enderud, C.; Huang, S.; Hoskinson, E.; Johnson, M. W.; Ladizinsky, E.; Ladizinsky, N.; Lanting, T.; Li, R.; Medina, T.; Molavi, R.; Neufeld, R.; Oh, T.; Pavlov, I.; Perminov, I.; Poulin-Lamarre, G.; Rich, C.; Smirnov, A.; Swenson, L.; Tsai, N.; Volkmann, M.; Whittaker, J.; Yao, J.Science (Washington, DC, United States) (2018), 361 (6398), 162-165CODEN: SCIEAS; ISSN:0036-8075. (American Association for the Advancement of Science)Understanding magnetic phases in quantum mech. systems is one of the essential goals in condensed matter physics, and the advent of prototype quantum simulation hardware has provided new tools for exptl. probing such systems. Here, the authors report on the exptl. realization of a quantum simulation of interacting Ising spins on 3-dimensional cubic lattices up to dimensions of 8x8x8 on a D-Wave processor. The ability to control and read out the state of individual spins provides direct access to several order parameters, which they used to det. the lattice's magnetic phases as well as crit. disorder and one of its universal exponents. By tuning the degree of disorder and effective transverse magnetic field, the authors obsd. phase transitions between a paramagnetic, an antiferromagnetic and a spin glass phase.
- 20. Mandrà, S.; Katzgraber, H. G. A deceptive step towards quantum speedup detection. Quantum Sci. Technol. 2018, 3, 04LT01, DOI: 10.1088/2058-9565/aac8b2
- 21. D-Wave Systems Inc. Advantage Data Sheet, 2022; https://www.dwavesys.com/media/htjclcey/advantage_datasheet_v10.pdf (accessed March 3, 2024).
- 22. Li, Z. H.; Liu, J.; Yu, S. Y. A Dynamic Time-Evolution Control Method to Improve the Performance of Optoelectronic Coherent Ising Machine. Optical Fiber Communications Conference and Exposition (OFC): San Diego, California, United States, 2021; p Tu1H.4.
- 23. Mwamsojo, N.; Lehmann, F.; Merghem, K.; Benkelfat, B. E.; Frignac, Y. Optoelectronic coherent Ising machine for combinatorial optimization problems. Opt. Lett. 2023, 48, 2150–2153, DOI: 10.1364/OL.485215
- 24. Cheng, Z. Z.; Tsang, H. K.; Wang, X. M.; Xu, K.; Xu, J. B. In-Plane Optical Absorption and Free Carrier Absorption in Graphene-on-Silicon Waveguides. IEEE J. Sel. Top. Quantum Electron. 2014, 20, 43–48, DOI: 10.1109/JSTQE.2013.2263115
- 25. Li, G. H.; Sekine, R.; Nehra, R.; Gray, R. M.; Ledezma, L.; Guo, Q.; Marandi, A. All-optical ultrafast ReLU function for energy-efficient nanophotonic deep learning. Nanophotonics 2023, 12, 847–855, DOI: 10.1515/nanoph-2022-0137
- 26. Zuo, Y.; Li, B. H.; Zhao, Y. J.; Jiang, Y.; Chen, Y. C.; Chen, P.; Jo, G. B.; Liu, J. W.; Du, S. W. All-optical neural network with nonlinear activation functions. Optica 2019, 6, 1132–1137, DOI: 10.1364/OPTICA.6.001132
- 27. Chen, Z.; Li, Z.; Deng, Z.; Liu, J.; Yu, S. An Optoelectronic Analog Ising Machine Enabling 2048-Spin and Low-Latency Calculations. Optical Fiber Communications Conference and Exposition (OFC): San Diego, California, United States, 2023; p M2J.2.
- 28. Wang, Z.; Marandi, A.; Wen, K.; Byer, R. L.; Yamamoto, Y. Coherent Ising machine based on degenerate optical parametric oscillators. Phys. Rev. A 2013, 88, 063853, DOI: 10.1103/PhysRevA.88.063853
- 29. LeCun, Y.; Bengio, Y.; Hinton, G. Deep learning. Nature 2015, 521, 436–444, DOI: 10.1038/nature14539
- 30. Shen, Y.; Harris, N. C.; Skirlo, S.; Prabhu, M.; Baehr-Jones, T.; Hochberg, M.; Sun, X.; Zhao, S.; Larochelle, H.; Englund, D.; Soljačić, M. Deep learning with coherent nanophotonic circuits. Nat. Photonics 2017, 11, 441–446, DOI: 10.1038/nphoton.2017.93
- 31. He, M. B. High-performance hybrid silicon and lithium niobate Mach-Zehnder modulators for 100 Gbit s⁻¹ and beyond. Nat. Photonics 2019, 13, 359–365, DOI: 10.1038/s41566-019-0378-6
- 32. Haribara, Y.; Utsunomiya, S.; Yamamoto, Y. Principles and Methods of Quantum Information Technologies. Lecture Notes in Physics. Chapter 12; Springer Japan, 2016; pp 251–262.
- 33. Cipra, B. A. The Ising model is NP-complete. SIAM News 2000, 33, 1–3
- 34. Kochenberger, G. A.; Hao, J. K.; Lü, Z.; Wang, H. B.; Glover, F. Solving large scale Max Cut problems via tabu search. J. Heuristics 2013, 19, 565–571, DOI: 10.1007/s10732-011-9189-8
- 35. Leleu, T.; Yamamoto, Y.; Utsunomiya, S.; Aihara, K. Combinatorial optimization using dynamical phase transitions in driven-dissipative systems. Phys. Rev. E 2017, 95, 022118, DOI: 10.1103/PhysRevE.95.022118
- 36. Leleu, T.; Yamamoto, Y.; McMahon, P. L.; Aihara, K. Destabilization of Local Minima in Analog Spin Systems by Correction of Amplitude Heterogeneity. Phys. Rev. Lett. 2019, 122, 040607, DOI: 10.1103/PhysRevLett.122.040607
- 37. Xu, M. Y.; Zhu, Y. T.; Pittalà, F.; Tang, J.; He, M. B.; Ng, W. C.; Wang, J. Y.; Ruan, Z. L.; Tang, X. F.; Kuschnerov, M.; Liu, L.; Yu, S. Y.; Zheng, B. F.; Cai, X. L. Dual-polarization thin-film lithium niobate in-phase quadrature modulators for terabit-per-second transmission. Optica 2022, 9, 61–62, DOI: 10.1364/OPTICA.449691
- 38. Han, C.; Zheng, Z.; Shu, H.; Jin, M.; Qin, J.; Chen, R.; Tao, Y.; Shen, B.; Bai, B.; Yang, F. Slow-light silicon modulator with 110-GHz bandwidth. Sci. Adv. 2023, 9, eadi5339, DOI: 10.1126/sciadv.adi5339
- 39. Maes, D.; Reis, L.; Poelman, S.; Vissers, E.; Avramovic, V.; Zaknoune, M.; Roelkens, G.; Lemey, S.; Peytavit, E.; Kuyken, B. High-Speed Photodiodes on Silicon Nitride with a Bandwidth beyond 100 GHz. Conference on Lasers and Electro-Optics: San Jose, California, United States, 2022; p SM3K.3.
- 40. Nagatani, M.; Wakita, H.; Jyo, T.; Takeya, T.; Yamazaki, H.; Ogiso, Y.; Mutoh, M.; Shiratori, Y.; Ida, M.; Hamaoka, F.; Nakamura, M.; Kobayashi, T.; Takahashi, H.; Miyamoto, Y. 110-GHz-Bandwidth InP-HBT AMUX/ADEMUX Circuits for Beyond-1-Tb/s/ch Digital Coherent Optical Transceivers. IEEE Custom Integrated Circuits Conference (CICC): Newport Beach, California, United States, 2022; pp 1–8.
- 41. Tan, M.; Xu, J.; Liu, S.; Feng, J.; Zhang, H.; Yao, C.; Chen, S.; Guo, H.; Han, G.; Wen, Z. Co-packaged optics (CPO): status, challenges, and solutions. Front. Optoelectron. 2023, 16, 1, DOI: 10.1007/s12200-022-00055-y
- 42. McMahon, P. L. The physics of optical computing. Nat. Rev. Phys. 2023, 5, 717–734, DOI: 10.1038/s42254-023-00645-5
- 43. El Srouji, L.; Krishnan, A.; Ravichandran, R.; Lee, Y.; On, M.; Xiao, X.; Ben Yoo, S. J. Photonic and optoelectronic neuromorphic computing. APL Photonics 2022, 7, 051101, DOI: 10.1063/5.0072090
- 44. Peserico, N.; Shastri, B. J.; Sorger, V. J. Integrated Photonic Tensor Processing Unit for a Matrix Multiply: A Review. J. Lightwave Technol. 2023, 41, 3704–3716, DOI: 10.1109/JLT.2023.3269957
- 45. Yang, L.; Ji, R.; Zhang, L.; Ding, J.; Xu, Q. On-chip CMOS-compatible optical signal processor. Opt. Express 2012, 20, 13560, DOI: 10.1364/OE.20.013560
- 46. Tait, A. N.; Nahmias, M. A.; Shastri, B. J.; Prucnal, P. R. Broadcast and Weight: An Integrated Network For Scalable Photonic Spike Processing. J. Lightwave Technol. 2014, 32, 4029–4041, DOI: 10.1109/JLT.2014.2345652
- 47. Xu, X.; Tan, M.; Corcoran, B.; Wu, J.; Boes, A.; Nguyen, T. G.; Chu, S. T.; Little, B. E.; Hicks, D. G.; Morandotti, R.; Mitchell, A.; Moss, D. J. 11 TOPS photonic convolutional accelerator for optical neural networks. Nature 2021, 589, 44–51, DOI: 10.1038/s41586-020-03063-0
- 48. Feldmann, J.; Youngblood, N.; Karpov, M.; Gehring, H.; Li, X.; Stappers, M.; Le Gallo, M.; Fu, X.; Lukashchuk, A.; Raja, A. S. Parallel convolutional processing using an integrated photonic tensor core. Nature 2021, 589, 52–58, DOI: 10.1038/s41586-020-03070-1
- 49. Shi, B.; Calabretta, N.; Stabile, R. Deep Neural Network Through an InP SOA-Based Photonic Integrated Cross-Connect. IEEE J. Sel. Top. Quantum Electron. 2020, 26, 1–11, DOI: 10.1109/JSTQE.2019.2945548
- 50. Zhong, Z.; Yang, M.; Lang, J.; Williams, C.; Kronman, L.; Sludds, A.; Esfahanizadeh, H.; Englund, D.; Ghobadi, M. Lightning: A Reconfigurable Photonic-Electronic SmartNIC for Fast and Energy-Efficient Inference. Proceedings of the ACM SIGCOMM 2023 Conference: New York, United States, 2023; pp 452–472.
- 51. Hamerly, R.; Bernstein, L.; Sludds, A.; Soljačić, M.; Englund, D. Large-Scale Optical Neural Networks Based on Photoelectric Multiplication. Phys. Rev. X 2019, 9, 021032, DOI: 10.1103/PhysRevX.9.021032
Cited By
This article is cited by 3 publications.
- Yuan Gao, Guanyu Chen, Luo Qi, Wujie Fu, Zifeng Yuan, Aaron J. Danner. Photonic Ising machines for combinatorial optimization problems. Applied Physics Reviews 2024, 11 (4). https://doi.org/10.1063/5.0216656
- Zhixian Zhou, Zhenhua Li, Zihao Chen, Jie Liu, Siyuan Yu. An Optoelectronic Ising Machine with Low-Cost FPGA for 10,000-Spin High-Accuracy Calculations. 2024, 1-3. https://doi.org/10.1109/ACP/IPOC63121.2024.10809451
- Xin Ye, Wenjia Zhang, Zuyuan He. InteGrated Spatial Photonic Ising Sampler Based on High-Uniformity 1 × 8 Multi-Mode Interferometer. 2024, 1-4. https://doi.org/10.1109/ACP/IPOC63121.2024.10810070
Figure 1. Schematic architecture of the optoelectronic Ising machine with two types of potential operational schemes.
Figure 2. Microscopy images of (a) a partially enlarged detail of the waveguide crossing, (b) the TFLN chip, and (c) a partially enlarged detail of the bonded PD and terminator. (d) Measured Vπ: the linearly scanned 200 kHz sawtooth input waveform (red dashed) and the PD output (transmission, blue solid). (e) Measured bandwidth (S21 parameter) of the whole device.
Figure 3. (a) Example of the CSR format of a sparse matrix. (b) Inner architecture of an AMU. (c) Architecture of a SpMV MACC.
Figure 4. (a) Checkerboard graph. (b) Evolution of spin amplitudes and cut value of the checkerboard graph. (c) Success rate of the checkerboard graph. (d) G22 graph. (e) Evolution of spin amplitudes and cut value of the G22 graph. (f) Success rate of the G22 graph.
Figure 5. Schematic diagram of the IM with the proposed Scheme II.
Figure 6. (a) Evolution of spin amplitudes and cut value of the checkerboard graph in Scheme II. (b–f) Snapshots of the spin graph at the 2nd, 20th, 50th, 100th, 200th, 500th, 1000th, and 2000th iterations.
Figure 7. Schematic diagrams of the architectures utilizing TDM and SDM for MVM computing based on (a) digital electronics (Scheme I) and (b) OE modulation (Scheme II). Inset: TDM vs SDM, which jointly contribute to the computation throughput. (c) Conceptual architecture with WDM additionally introduced.
References
This article references 51 other publications.
- 1Korte, B. H.; Vygen, J.; Korte, B.; Vygen, J. Combinatorial Optimization; Springer, 2011; Vol. 1.There is no corresponding record for this reference.
- 2Lucas, A. Ising formulations of many NP problems. Front. Phys. 2014, 2, 5, DOI: 10.3389/fphy.2014.00005There is no corresponding record for this reference.
- 3Terada, K.; Oku, D.; Kanamaru, S.; Tanaka, S.; Hayashi, M.; Yamaoka, M.; Yanagisawa, M.; Togawa, N. An Ising model mapping to solve rectangle packing problem. 2018 International Symposium on VLSI Design, Automation and Test (VLSI-DAT): Hsinchu, Taiwan, China, 2018; pp 1– 4.There is no corresponding record for this reference.
- 4Mao, Z. T.; Matsuda, Y.; Tamura, R.; Tsuda, K. Chemical design with GPU-based Ising machines. Digital Discovery 2023, 2, 1098– 1103, DOI: 10.1039/D3DD00047HThere is no corresponding record for this reference.
- 5Bohm, F.; Alonso-Urquijo, D.; Verschaffelt, G.; Van der Sande, G. Noise-injected analog Ising machines enable ultrafast statistical sampling and machine learning. Nat. Commun. 2022, 13, 5847, DOI: 10.1038/s41467-022-33441-3There is no corresponding record for this reference.
- 6Singh, A. K.; Kapelyan, A.; Venturelli, D.; Jamieson, K. Uplink MIMO Detection using Ising Machines: A Multi-Stage Ising Approach. arXiv 2023, arXiv:2304.12830There is no corresponding record for this reference.
accessed March 3, 2024
- 7Garey, M. R.; Johnson, D. S. Computers and Intractability: A Guide to the Theory of NP-Completeness; W. H. Freeman & Co., 1979.There is no corresponding record for this reference.
- 8Inagaki, T.; Haribara, Y.; Igarashi, K.; Sonobe, T.; Tamate, S.; Honjo, T.; Marandi, A.; McMahon, P. L.; Umeki, T.; Enbutsu, K. A coherent Ising machine for 2000-node optimization problems. Science 2016, 354, 603– 606, DOI: 10.1126/science.aah42438https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADC%252BC28XhslKgt7vO&md5=d1b192a4cda4bc082720a06a5ee3f016A coherent Ising machine for 2000-node optimization problemsInagaki, Takahiro; Haribara, Yoshitaka; Igarashi, Koji; Sonobe, Tomohiro; Tamate, Shuhei; Honjo, Toshimori; Marandi, Alireza; McMahon, Peter L.; Umeki, Takeshi; Enbutsu, Koji; Tadanaga, Osamu; Takenouchi, Hirokazu; Aihara, Kazuyuki; Kawarabayashi, Ken-ichi; Inoue, Kyo; Utsunomiya, Shoko; Takesue, HirokiScience (Washington, DC, United States) (2016), 354 (6312), 603-606CODEN: SCIEAS; ISSN:0036-8075. (American Association for the Advancement of Science)The anal. and optimization of complex systems can be reduced to math. problems collectively known as combinatorial optimization. Many such problems can be mapped onto ground-state search problems of the Ising model, and various artificial spin systems are now emerging as promising approaches. However, phys. Ising machines have suffered from limited nos. of spin-spin couplings because of implementations based on localized spins, resulting in severe scalability problems. We report a 2000-spin network with all-to-all spin-spin couplings. Using a measurement and feedback scheme, we coupled time-multiplexed degenerate optical parametric oscillators to implement max. cut problems on arbitrary graph topologies with up to 2000 nodes. Our coherent Ising machine outperformed simulated annealing in terms of accuracy and computation time for a 2000-node complete graph.
- 9. Mohseni, N.; McMahon, P. L.; Byrnes, T. Ising machines as hardware solvers of combinatorial optimization problems. Nat. Rev. Phys. 2022, 4, 363–379, DOI: 10.1038/s42254-022-00440-8
- 10. Marandi, A.; Wang, Z.; Takata, K.; Byer, R. L.; Yamamoto, Y. Network of time-multiplexed optical parametric oscillators as a coherent Ising machine. Nat. Photonics 2014, 8, 937–942, DOI: 10.1038/nphoton.2014.249
- 11. McMahon, P. L.; Marandi, A.; Haribara, Y.; Hamerly, R.; Langrock, C.; Tamate, S.; Inagaki, T.; Takesue, H.; Utsunomiya, S.; Aihara, K.; Byer, R. L.; Fejer, M. M.; Mabuchi, H.; Yamamoto, Y. A fully programmable 100-spin coherent Ising machine with all-to-all connections. Science 2016, 354, 614–617, DOI: 10.1126/science.aah5178
- 12. Honjo, T.; Sonobe, T.; Inaba, K.; Inagaki, T.; Ikuta, T.; Yamada, Y.; Kazama, T.; Enbutsu, K.; Umeki, T.; Kasahara, R.; Kawarabayashi, K. I.; Takesue, H. 100,000-spin coherent Ising machine. Sci. Adv. 2021, 7, eabh0952, DOI: 10.1126/sciadv.abh0952
- 13. Cen, Q.; Ding, H.; Hao, T.; Guan, S.; Qin, Z.; Lyu, J.; Li, W.; Zhu, N.; Xu, K.; Dai, Y.; Li, M. Large-scale coherent Ising machine based on optoelectronic parametric oscillator. Light: Sci. Appl. 2022, 11, 333, DOI: 10.1038/s41377-022-01013-1
- 14. Pierangeli, D.; Marcucci, G.; Conti, C. Large-Scale Photonic Ising Machine by Spatial Light Modulation. Phys. Rev. Lett. 2019, 122, 213902, DOI: 10.1103/PhysRevLett.122.213902
- 15. Bohm, F.; Verschaffelt, G.; Van der Sande, G. A poor man’s coherent Ising machine based on opto-electronic feedback systems for solving optimization problems. Nat. Commun. 2019, 10, 3538, DOI: 10.1038/s41467-019-11484-3
- 16. Prabhu, M.; Roques-Carmes, C.; Shen, Y.; Harris, N.; Jing, L.; Carolan, J.; Hamerly, R.; Baehr-Jones, T.; Hochberg, M.; Čeperić, V.; Joannopoulos, J. D.; Englund, D. R.; Soljačić, M. Accelerating recurrent Ising machines in photonic integrated circuits. Optica 2020, 7, 551–558, DOI: 10.1364/OPTICA.386613
- 17. Roques-Carmes, C.; Shen, Y.; Zanoci, C.; Prabhu, M.; Atieh, F.; Jing, L.; Dubcek, T.; Mao, C.; Johnson, M. R.; Ceperic, V.; Joannopoulos, J. D.; Englund, D.; Soljacic, M. Heuristic recurrent algorithms for photonic Ising machines. Nat. Commun. 2020, 11, 249, DOI: 10.1038/s41467-019-14096-z
- 18. Johnson, M. W.; Amin, M. H. S.; Gildert, S.; Lanting, T.; Hamze, F.; Dickson, N.; Harris, R.; Berkley, A. J.; Johansson, J.; Bunyk, P.; et al. Quantum annealing with manufactured spins. Nature 2011, 473, 194–198, DOI: 10.1038/nature10012
- 19. Harris, R.; Sato, Y.; Berkley, A. J.; Reis, M.; Altomare, F.; Amin, M. H.; Boothby, K.; Bunyk, P.; Deng, C.; Enderud, C.; et al. Phase transitions in a programmable quantum spin glass simulator. Science 2018, 361, 162–165, DOI: 10.1126/science.aat2025
- 20. Mandrà, S.; Katzgraber, H. G. A deceptive step towards quantum speedup detection. Quantum Sci. Technol. 2018, 3, 04LT01, DOI: 10.1088/2058-9565/aac8b2
- 21. D-Wave Systems Inc. Advantage Data Sheet, 2022; https://www.dwavesys.com/media/htjclcey/advantage_datasheet_v10.pdf (accessed March 3, 2024).
- 22. Li, Z. H.; Liu, J.; Yu, S. Y. A Dynamic Time-Evolution Control Method to Improve the Performance of Optoelectronic Coherent Ising Machine. Optical Fiber Communications Conference and Exposition (OFC); San Diego, California, United States, 2021; p Tu1H.4.
- 23. Mwamsojo, N.; Lehmann, F.; Merghem, K.; Benkelfat, B. E.; Frignac, Y. Optoelectronic coherent Ising machine for combinatorial optimization problems. Opt. Lett. 2023, 48, 2150–2153, DOI: 10.1364/OL.485215
- 24. Cheng, Z. Z.; Tsang, H. K.; Wang, X. M.; Xu, K.; Xu, J. B. In-Plane Optical Absorption and Free Carrier Absorption in Graphene-on-Silicon Waveguides. IEEE J. Sel. Top. Quantum Electron. 2014, 20, 43–48, DOI: 10.1109/JSTQE.2013.2263115
- 25. Li, G. H.; Sekine, R.; Nehra, R.; Gray, R. M.; Ledezma, L.; Guo, Q.; Marandi, A. All-optical ultrafast ReLU function for energy-efficient nanophotonic deep learning. Nanophotonics 2023, 12, 847–855, DOI: 10.1515/nanoph-2022-0137
- 26. Zuo, Y.; Li, B. H.; Zhao, Y. J.; Jiang, Y.; Chen, Y. C.; Chen, P.; Jo, G. B.; Liu, J. W.; Du, S. W. All-optical neural network with nonlinear activation functions. Optica 2019, 6, 1132–1137, DOI: 10.1364/OPTICA.6.001132
- 27. Chen, Z.; Li, Z.; Deng, Z.; Liu, J.; Yu, S. An Optoelectronic Analog Ising Machine Enabling 2048-Spin and Low-Latency Calculations. Optical Fiber Communications Conference and Exposition (OFC); San Diego, California, United States, 2023; p M2J.2.
- 28. Wang, Z.; Marandi, A.; Wen, K.; Byer, R. L.; Yamamoto, Y. Coherent Ising machine based on degenerate optical parametric oscillators. Phys. Rev. A 2013, 88, 063853, DOI: 10.1103/PhysRevA.88.063853
- 29. LeCun, Y.; Bengio, Y.; Hinton, G. Deep learning. Nature 2015, 521, 436–444, DOI: 10.1038/nature14539
- 30. Shen, Y.; Harris, N. C.; Skirlo, S.; Prabhu, M.; Baehr-Jones, T.; Hochberg, M.; Sun, X.; Zhao, S.; Larochelle, H.; Englund, D.; Soljačić, M. Deep learning with coherent nanophotonic circuits. Nat. Photonics 2017, 11, 441–446, DOI: 10.1038/nphoton.2017.93
- 31. He, M. B.; et al. High-performance hybrid silicon and lithium niobate Mach-Zehnder modulators for 100 Gbit s⁻¹ and beyond. Nat. Photonics 2019, 13, 359–365, DOI: 10.1038/s41566-019-0378-6
- 32. Haribara, Y.; Utsunomiya, S.; Yamamoto, Y. Principles and Methods of Quantum Information Technologies. Lecture Notes in Physics. Chapter 12; Springer Japan, 2016; pp 251–262.
- 33. Cipra, B. A. The Ising model is NP-complete. SIAM News 2000, 33, 1–3.
- 34. Kochenberger, G. A.; Hao, J. K.; Lü, Z.; Wang, H. B.; Glover, F. Solving large scale Max Cut problems via tabu search. J. Heuristics 2013, 19, 565–571, DOI: 10.1007/s10732-011-9189-8
- 35. Leleu, T.; Yamamoto, Y.; Utsunomiya, S.; Aihara, K. Combinatorial optimization using dynamical phase transitions in driven-dissipative systems. Phys. Rev. E 2017, 95, 022118, DOI: 10.1103/PhysRevE.95.022118
- 36. Leleu, T.; Yamamoto, Y.; McMahon, P. L.; Aihara, K. Destabilization of Local Minima in Analog Spin Systems by Correction of Amplitude Heterogeneity. Phys. Rev. Lett. 2019, 122, 040607, DOI: 10.1103/PhysRevLett.122.040607
- 37. Xu, M. Y.; Zhu, Y. T.; Pittalà, F.; Tang, J.; He, M. B.; Ng, W. C.; Wang, J. Y.; Ruan, Z. L.; Tang, X. F.; Kuschnerov, M.; Liu, L.; Yu, S. Y.; Zheng, B. F.; Cai, X. L. Dual-polarization thin-film lithium niobate in-phase quadrature modulators for terabit-per-second transmission. Optica 2022, 9, 61–62, DOI: 10.1364/OPTICA.449691
- 38. Han, C.; Zheng, Z.; Shu, H.; Jin, M.; Qin, J.; Chen, R.; Tao, Y.; Shen, B.; Bai, B.; Yang, F. Slow-light silicon modulator with 110-GHz bandwidth. Sci. Adv. 2023, 9, eadi5339, DOI: 10.1126/sciadv.adi5339
- 39. Maes, D.; Reis, L.; Poelman, S.; Vissers, E.; Avramovic, V.; Zaknoune, M.; Roelkens, G.; Lemey, S.; Peytavit, E.; Kuyken, B. High-Speed Photodiodes on Silicon Nitride with a Bandwidth beyond 100 GHz. Conference on Lasers and Electro-Optics; San Jose, California, United States, 2022; p SM3K.3.
- 40. Nagatani, M.; Wakita, H.; Jyo, T.; Takeya, T.; Yamazaki, H.; Ogiso, Y.; Mutoh, M.; Shiratori, Y.; Ida, M.; Hamaoka, F.; Nakamura, M.; Kobayashi, T.; Takahashi, H.; Miyamoto, Y. 110-GHz-Bandwidth InP-HBT AMUX/ADEMUX Circuits for Beyond-1-Tb/s/ch Digital Coherent Optical Transceivers. IEEE Custom Integrated Circuits Conference (CICC); Newport Beach, California, United States, 2022; pp 1–8.
- 41. Tan, M.; Xu, J.; Liu, S.; Feng, J.; Zhang, H.; Yao, C.; Chen, S.; Guo, H.; Han, G.; Wen, Z. Co-packaged optics (CPO): status, challenges, and solutions. Front. Optoelectron. 2023, 16, 1, DOI: 10.1007/s12200-022-00055-y
- 42. McMahon, P. L. The physics of optical computing. Nat. Rev. Phys. 2023, 5, 717–734, DOI: 10.1038/s42254-023-00645-5
- 43. El Srouji, L.; Krishnan, A.; Ravichandran, R.; Lee, Y.; On, M.; Xiao, X.; Ben Yoo, S. J. Photonic and optoelectronic neuromorphic computing. APL Photonics 2022, 7, 051101, DOI: 10.1063/5.0072090
- 44. Peserico, N.; Shastri, B. J.; Sorger, V. J. Integrated Photonic Tensor Processing Unit for a Matrix Multiply: A Review. J. Lightwave Technol. 2023, 41, 3704–3716, DOI: 10.1109/JLT.2023.3269957
- 45. Yang, L.; Ji, R.; Zhang, L.; Ding, J.; Xu, Q. On-chip CMOS-compatible optical signal processor. Opt. Express 2012, 20, 13560, DOI: 10.1364/OE.20.013560
- 46. Tait, A. N.; Nahmias, M. A.; Shastri, B. J.; Prucnal, P. R. Broadcast and Weight: An Integrated Network For Scalable Photonic Spike Processing. J. Lightwave Technol. 2014, 32, 4029–4041, DOI: 10.1109/JLT.2014.2345652
- 47. Xu, X.; Tan, M.; Corcoran, B.; Wu, J.; Boes, A.; Nguyen, T. G.; Chu, S. T.; Little, B. E.; Hicks, D. G.; Morandotti, R.; Mitchell, A.; Moss, D. J. 11 TOPS photonic convolutional accelerator for optical neural networks. Nature 2021, 589, 44–51, DOI: 10.1038/s41586-020-03063-0
- 48. Feldmann, J.; Youngblood, N.; Karpov, M.; Gehring, H.; Li, X.; Stappers, M.; Le Gallo, M.; Fu, X.; Lukashchuk, A.; Raja, A. S.; et al. Parallel convolutional processing using an integrated photonic tensor core. Nature 2021, 589, 52–58, DOI: 10.1038/s41586-020-03070-1
- 49. Shi, B.; Calabretta, N.; Stabile, R. Deep Neural Network Through an InP SOA-Based Photonic Integrated Cross-Connect. IEEE J. Sel. Top. Quantum Electron. 2020, 26, 1–11, DOI: 10.1109/JSTQE.2019.2945548
- 50. Zhong, Z.; Yang, M.; Lang, J.; Williams, C.; Kronman, L.; Sludds, A.; Esfahanizadeh, H.; Englund, D.; Ghobadi, M. Lightning: A Reconfigurable Photonic-Electronic SmartNIC for Fast and Energy-Efficient Inference. Proceedings of the ACM SIGCOMM 2023 Conference; New York, United States, 2023; pp 452–472.
- 51. Hamerly, R.; Bernstein, L.; Sludds, A.; Soljačić, M.; Englund, D. Large-Scale Optical Neural Networks Based on Photoelectric Multiplication. Phys. Rev. X 2019, 9, 021032, DOI: 10.1103/PhysRevX.9.021032
Supporting Information
The Supporting Information is available free of charge at: https://pubs.acs.org/doi/10.1021/acsphotonics.4c00003.
Additional details on the simulation of the traveling-wave electrodes, fabrication of the on-chip devices, the multiplication-accumulation core with configurable parallel accumulator and bubble layers, real-valued spin and weight calculation based on intensity modulation, and the calculation of energy efficiency (PDF)