Artificial Intelligence (AI) agents are required to learn from their surroundings and to reason about what has been learned in order to make decisions, act in the world, or react to various stimuli. Recent Machine Learning (ML) has mostly adopted a purely sub-symbolic learning approach. Using distributed representations of entities, it performs quick decision-making without building a comprehensible model of the world. While achieving impressive results in computer vision, natural language, game playing, and multimodal learning, such approaches are known to be data inefficient and to struggle at out-of-distribution generalization. Although the use of appropriate inductive biases can alleviate such shortcomings, sub-symbolic models in general lack comprehensibility. By contrast, symbolic AI is based on rich, high-level representations of the world that use human-readable symbols. By rich knowledge, we refer to logical representations which are more expressive than propositional logic or propositional probabilistic approaches, and which can express knowledge using full first-order logic, including universal and existential quantification (∀ and ∃), arbitrary n-ary relations over variables, e.g. R(x, y, z), and function symbols, e.g. father(x), etc. Symbolic AI has achieved success at theorem proving, logical inference, and verification. However, it also has shortcomings when dealing with incomplete knowledge. It can be inefficient with large amounts of inaccurate data and can lack robustness to outliers. Purely symbolic decision algorithms usually have high computational complexity, making them impractical for the real world. It is now clear that the predominant approach to ML, where learning is based on recognizing the latent structures hidden in the data, is insufficient and may benefit from symbolic AI [17]. In this context, neurosymbolic AI, which stems from neural networks and symbolic AI, attempts to combine the strengths of both paradigms (see [16, 40, 54] for recent surveys).
That is, neurosymbolic AI aims to combine reasoning over complex representations of knowledge (knowledge bases, semantic networks, ontologies, trees, and graphs) with learning from complex data (images, time series, sensorimotor data, natural language). Consequently, a main challenge for neurosymbolic AI is the grounding of symbols, including constants, function symbols, and relation symbols, in real data, which is akin to the longstanding symbol grounding problem [30].
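To make the notion of grounding symbols in real data more concrete, the following is a minimal illustrative sketch, not a method described in this section. It assumes one simple way grounding could work: constants map to feature vectors, a relation symbol maps to a function returning a truth degree in (0, 1], and quantifiers are approximated by min and max. The names `constants`, `similar`, `forall`, and `exists`, as well as the feature values, are hypothetical.

```python
import math

# Hypothetical grounding of constant symbols: each symbol maps to a
# feature vector standing in for real data (values are made up).
constants = {
    "alice": [0.9, 0.1],
    "bob": [0.8, 0.2],
    "carol": [0.1, 0.9],
}


def similar(a, b):
    """Assumed grounding of a binary relation symbol similar(a, b).

    Returns a truth degree in (0, 1]: 1.0 for identical vectors,
    decreasing with Euclidean distance.
    """
    return math.exp(-math.dist(constants[a], constants[b]))


def forall(truth_degrees):
    """Universal quantifier approximated by the minimum truth degree."""
    return min(truth_degrees)


def exists(truth_degrees):
    """Existential quantifier approximated by the maximum truth degree."""
    return max(truth_degrees)


# Truth degree of "for all x: similar(x, x)" over the known constants.
reflexive = forall(similar(x, x) for x in constants)

# Truth degree of "there exists y other than alice: similar(alice, y)".
alice_has_peer = exists(similar("alice", y) for y in constants if y != "alice")
```

In this sketch, `similar("alice", "bob")` is higher than `similar("alice", "carol")` because the first pair of vectors is closer. Min and max are only one possible choice of quantifier semantics, picked here for simplicity.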