R&D Activities

Human Interaction

Designing the ultimate harmony between systems and human beings

Natural User Interface

This theme develops “Natural User Interface” technologies, in which users’ natural actions, such as “seeing,” “speaking,” and “moving the body,” serve as inputs, and outputs are returned to users intuitively, without the need for conscious thought. For example, haptics technology, one of the output technologies and already being applied in games, delivers a strong sense of realism and a variety of immersive experiences by integrating high-definition dynamic haptic feedback with other sensory presentations based on human sensory characteristics. In addition, we will create new experience value at the interface between people and devices by developing output technologies such as sound generation and odor presentation, and input technologies such as gaze UI and voice UI. We are also focusing on accessibility technology, which applies technologies related to the five senses to provide a user-friendly experience for people with disabilities.

Image of access to the senses with Natural User Interface

Motion Sensing

By capturing human motion in real time, a CG avatar can be controlled, or audio and visual content can be presented in response. Using compact, low-power, and inexpensive inertial sensors (accelerometers and gyroscopes), we are working on sensor fusion and deep learning to develop motion-sensing technologies that are easy for both creators and users to use. Specifically, we are addressing R&D themes such as self-position estimation, which accurately tracks a pedestrian’s movement, and motion capture, which estimates whole-body motion with a minimum number of sensors. These technologies will be used in products and services including, but not limited to, Sony’s entertainment business.
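As a rough illustration of what fusing accelerometer and gyroscope data can look like, the sketch below uses a complementary filter, a common textbook technique. It is not a description of Sony’s method. The function name, the axis convention (pitch taken from the x and z accelerometer axes), and the blending weight `alpha` are all assumptions made for this example.

```python
import math

def complementary_filter(samples, dt, alpha=0.98):
    """Estimate a pitch angle (radians) by blending gyro and accelerometer data.

    samples: iterable of (gyro_rate, accel_x, accel_z) tuples, where gyro_rate
             is the angular rate about the pitch axis in rad/s (assumed axes).
    dt:      sampling interval in seconds.
    alpha:   weight on the integrated gyro estimate (hypothetical default).
    """
    pitch = 0.0
    estimates = []
    for gyro_rate, accel_x, accel_z in samples:
        # Integrating the gyro is smooth in the short term but drifts over time.
        gyro_pitch = pitch + gyro_rate * dt
        # The gravity direction from the accelerometer is noisy but drift-free.
        accel_pitch = math.atan2(accel_x, accel_z)
        # Blend the two: trust the gyro short-term, the accelerometer long-term.
        pitch = alpha * gyro_pitch + (1.0 - alpha) * accel_pitch
        estimates.append(pitch)
    return estimates
```

With a stationary sensor tilted at a fixed angle, the estimate converges toward the accelerometer-derived angle at a rate set by `alpha`. A deep-learning approach like the one mentioned above would replace or augment this hand-tuned blend.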

Image of Motion Sensing from input to output

Vital Sensing and Emotion Estimation

We have been working on vital sensing and emotion estimation technologies that allow us to “get closer to people” and understand them better. We combine vital sensing technology, achieved through the development of devices and signal processing, with emotion estimation technology derived from experiments based on our knowledge of machine learning, neuroscience, and physiological psychology. With this combination, we are promoting research on fundamental technologies that provide accurate personalization services based on users’ real-time emotional changes, as well as feedback for the evaluation and production of entertainment content.

Schematic of Vital Sensing and Emotion Estimation

Remote Spectator Assistance System

We are developing a spectator assistance system that connects remote spectators and gives them a “sense of being there” and “enthusiasm” in sports and live music. By sensing the movement and mental state of the audience in real time, we can quantify degrees of concentration and interest that are difficult for the human eye to grasp accurately. By expressing these degrees of concentration and interest through images and sounds that make a remote user feel as if they were there, and by conducting interactive interventions, we provide an experience that lets people in a remote location share the sense of presence, unity, and enthusiasm with the audience at a venue.

Image of utilization of Remote Spectator Assistance System

Sound AR Interaction

We are exploring an auditory AR experience that expands our world through the power of sound by superimposing the “sound of the virtual world” on the “sound of the real world”. This technology is already used in Sony’s new sound experience, Sound AR™, and has three features. The first is a human-sensing technology for the real-time sensing of the location and behavior of users. The second is a game-inspired sound-engine technology that generates real-time, interactive sound based on the sensing results. The third is a 360 Spatial Sound technology for superimposing the generated sound onto a three-dimensional space. By integrating these Sony technologies into interaction technologies tailored to human perceptual characteristics, Sony is able to provide a natural and immersive AR experience. We are also working on accessibility applications that enable people with visual impairments to perceive space through sound.

Telepresence

Sony’s “MADO” (window) telepresence system leverages the elements of “reality” and “aura” to realize natural communication across spaces. For “reality”, we applied the best of Sony’s video and audio capabilities. For “aura”, we implemented the state of the art in cognitive psychology, that is, how humans perceive people and spaces. For example, instead of handling only “central vision” information such as the other person’s face, remarks, and materials, as existing video conferencing does, “MADO” works with a life-sized image of a person or the “peripheral vision” on a large vertical screen. The bi-directional, high-quality sound technology allows natural conversation in real time, as if the users were sitting in front of each other.

Biometrics & Behavior Authentication

In recent years, multi-factor authentication methods have been introduced to address the security shortcomings of passwords. A disadvantage of these methods is that they require manual intervention by the end user, and authentication is not immediately available when needed. At Sony, we are working on the development of high-precision biometric devices that are small enough to be worn, as well as technologies that realize advanced functions, including anti-spoofing, through the use of general-purpose sensors. In addition, we are developing methods to continuously identify persons based on their behaviors or biometric properties, without requiring any interaction from the user. For example, by using data captured from mobile or wearable devices, a user’s walking style or favorite routes can be learned and used for authentication purposes.
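To make the idea of learning a walking style and using it for authentication more concrete, here is a deliberately simplified sketch. It assumes that each walking session has already been reduced to a small numeric feature vector. The features (step cadence and stride length), the function names, and the distance threshold are all hypothetical and are not taken from Sony’s system. A real system would use richer features and a trained model.

```python
import math

def enroll(feature_vectors):
    """Build a user template as the per-feature mean of enrollment samples."""
    count = len(feature_vectors)
    dims = len(feature_vectors[0])
    return [sum(v[i] for v in feature_vectors) / count for i in range(dims)]

def distance(a, b):
    """Euclidean distance between two feature vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def is_same_user(template, sample, threshold):
    """Accept the sample if it lies within `threshold` of the template.

    The threshold is an assumed tuning parameter that trades off false
    accepts against false rejects.
    """
    return distance(template, sample) <= threshold

# Hypothetical features: [step cadence in steps/s, stride length in m].
template = enroll([[1.8, 0.70], [2.0, 0.74]])
accepted = is_same_user(template, [1.9, 0.71], threshold=0.1)
rejected = is_same_user(template, [1.2, 0.50], threshold=0.1)
```

Because the check needs only sensor-derived features, it can run repeatedly in the background, which is what makes this style of authentication continuous rather than a one-time prompt.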