Glossary

Data Analytics

Terms and Definitions

Terms and definitions from all courses

A

A/B testing: The process of testing two variations of the same web page to determine which page is more successful at attracting user traffic and generating revenue

Absolute reference: A reference within a function that is locked so that rows and columns won’t change if the function is copied

Access control: Features such as password protection, user permissions, and encryption that are used to protect a spreadsheet

Accuracy: The degree to which data conforms to the actual entity being measured or described

Action-oriented question: A question whose answers lead to change

Administrative metadata: Metadata that indicates the technical source of a digital asset

Aesthetic (R): A visual property of an object in a plot

Agenda: A list of scheduled appointments

Aggregation: The process of collecting or gathering many separate pieces into a whole

Algorithm: A process or set of rules followed for a specific task

Aliasing: Temporarily naming a table or column in a query to make it easier to read and write
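
As a minimal sketch of aliasing, the example below uses Python's built-in sqlite3 module with an in-memory database. The table name customer_orders and its columns are hypothetical and exist only for illustration.

```python
import sqlite3

# Hypothetical table and column names, used only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer_orders (customer_name TEXT, order_total REAL)")
conn.execute("INSERT INTO customer_orders VALUES ('Ana', 25.0)")

# AS gives the table the temporary name "o" and the column the temporary name "total".
cursor = conn.execute("SELECT o.order_total AS total FROM customer_orders AS o")
alias_column_names = [description[0] for description in cursor.description]
alias_rows = cursor.fetchall()
print(alias_column_names, alias_rows)
```

The aliases last only for this one query; the table and column keep their original names in the database.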

Alternative text: Text that provides an alternative to non-text content, such as images and videos

Analytical skills: Qualities and characteristics associated with using facts to solve problems

Analytical thinking: The process of identifying and defining a problem, then solving it by using data in an organized, step-by-step manner

Annotation: Text that briefly explains data or helps focus the audience on a particular aspect of the data in a visualization

Anscombe’s quartet: Four datasets that have nearly identical summary statistics but contain different plotted values
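
To illustrate the idea, the sketch below compares two of the four y-series from Anscombe's quartet using Python's statistics module. Only the mean is compared here; the variable names are illustrative.

```python
from statistics import mean

# Shared x-values and two of the four y-series from Anscombe's quartet.
x_values = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
y_first = [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]
y_second = [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]

# The means agree to two decimal places even though the values differ.
mean_first = round(mean(y_first), 2)
mean_second = round(mean(y_second), 2)
print(mean_first, mean_second)
print(y_first == y_second)
```

Matching summary statistics do not imply matching data, which is why plotting the values matters.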

Area chart: A data visualization that uses individual data points for a changing variable connected by a continuous line with a filled-in area underneath

Argument (R): Information needed by a function in R in order to run

Arithmetic operator: An operator used to perform basic math operations such as addition, subtraction, multiplication, and division

Array: A collection of values in spreadsheet cells

Assignment operator: An operator used to assign values to variables and vectors

Attribute: A characteristic or quality of data used to label a column in a table

Audio file: Digitized audio storage usually in an MP3, AAC, or other compressed format

AVERAGE: A spreadsheet function that returns an average of the values from a selected range

AVERAGEIF: A spreadsheet function that returns the average of all cell values from a given range that meet a specified condition

B

Bad data source: A data source that is not reliable, original, comprehensive, current, and cited (ROCCC)

Balance: The design principle of creating aesthetic appeal and clarity in a data visualization by evenly distributing visual elements

Bar graph: A data visualization that uses size to contrast and compare two or more values

Bias: A conscious or subconscious preference in favor of or against a person, group of people, or thing

Big data: Large, complex datasets typically involving long periods of time, which enable data analysts to address far-reaching business problems

Boolean data: A data type with only two possible values, usually true or false

Borders: Lines that can be added around two or more cells on a spreadsheet

Box plot: A data visualization that displays the distribution of values along an x-axis

Bubble chart: A data visualization that displays individual data points as bubbles, comparing numeric values by their relative size

Bullet graph: A data visualization that displays data as a horizontal bar chart moving toward a desired value

Business metric: A standard of measurement used to solve a business task

Business task: The question or problem data analysis resolves for a business

C

C#: An object-oriented programming language used to create games and mobile apps in the .NET open-source developer platform

C++: An extension of the C programming language that is used to create console games, such as those for Xbox

Calculated field: A new field within a pivot table that carries out certain calculations based on the values of other fields

Calculus: A branch of mathematics that involves the study of rates of change and the changes between values that are related by a function

CASE: A SQL statement that returns records that meet conditions by including an if/then statement in a query
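
A short sketch of CASE, run through Python's built-in sqlite3 module. The orders table, its columns, and the 100 threshold are assumptions made up for this example.

```python
import sqlite3

# Hypothetical table, columns, and threshold, used only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_id INTEGER, total INTEGER)")
conn.executemany("INSERT INTO orders VALUES (?, ?)", [(1, 20), (2, 150)])

# CASE works like if/then: each row gets a label based on its total.
case_rows = conn.execute(
    """
    SELECT order_id,
           CASE WHEN total >= 100 THEN 'large' ELSE 'small' END AS size_label
    FROM orders
    ORDER BY order_id
    """
).fetchall()
print(case_rows)
```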

Case study: A common way for employers to assess job skills and gain insight into how a candidate approaches common data-related challenges

CAST: A SQL function that converts data from one data type to another

Causation: When an action directly leads to an outcome, such as a cause-effect relationship

Cell reference: A cell or a range of cells in a worksheet typically used in formulas and functions

Changelog: A file containing a chronologically ordered list of modifications made to a project

Channel: A visual aspect or variable that represents characteristics of the data in a visualization

Chart: A graphical representation of data from a worksheet

Circle view: A data visualization that shows comparative strength in data

Clean data: Data that is complete, correct, and relevant to the problem being solved

Cloud: A place to keep data online, rather than a computer hard drive

Cluster: A collection of data points on a data visualization with similar values

COALESCE: A SQL function that returns the first non-null value in a list
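
A minimal sketch of COALESCE using Python's built-in sqlite3 module. The literal values are placeholders chosen for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# COALESCE scans its arguments left to right and returns the first one that is not NULL.
coalesce_value = conn.execute(
    "SELECT COALESCE(NULL, 'backup', 'ignored')"
).fetchone()[0]
print(coalesce_value)
```

A common use is supplying a fallback when a column may be empty, with the column as the first argument and the default as the last.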

Code chunk: A piece of code added in an R Markdown file that is used to process, visualize or analyze data

Coding: The process of writing instructions to a computer in the syntax of a specific programming language

Column chart: A data visualization that uses individual data points for a changing variable, represented as vertical columns

Combo chart: A data visualization that combines more than one visualization type

Compatibility: How well two or more datasets are able to work together

Completeness: The degree to which data contains all desired components or measures

Computer programming: The process of giving instructions to a computer in order to perform an action or set of actions

CONCAT: A SQL function that adds strings together to create new text strings that can be used as unique keys

CONCATENATE: A spreadsheet function that joins together two or more text strings

Conditional formatting: A spreadsheet tool that changes how cells appear when values meet specific conditions

Conditional statement: A declaration that if a certain condition holds, then a certain event must take place

Confidence interval: A range of values that conveys how likely a statistical estimate reflects the population

Confidence level: The probability that a sample size accurately reflects the greater population

Confirmation bias: The tendency to search for or interpret information in a way that confirms pre-existing beliefs

Consent: The aspect of data ethics that presumes an individual’s right to know how and why their personal data will be used before agreeing to provide it

Consistency: The degree to which data is repeatable from different points of entry or collection

Context: The condition in which something exists or happens

Continuous data: Data that is measured and can have almost any numeric value

CONVERT: A SQL function that changes the unit of measurement of a value in data

Cookie: A small file stored on a computer that contains information about its users

Correlation: The measure of the degree to which two variables change in relationship to each other

COUNT: A spreadsheet function that counts the number of cells within a range that contain numeric values

COUNTA: A spreadsheet function that counts the total number of values (non-empty cells) within a specified range

COUNTIF: A spreadsheet function that returns the number of cells within a range that match a specified value

COUNT DISTINCT: A SQL function that returns the number of distinct values in a specified range
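
The sketch below contrasts COUNT with COUNT(DISTINCT ...) using Python's built-in sqlite3 module. The visits table and its city values are hypothetical.

```python
import sqlite3

# Hypothetical table and values, used only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE visits (city TEXT)")
conn.executemany(
    "INSERT INTO visits VALUES (?)", [("Lima",), ("Oslo",), ("Lima",)]
)

# COUNT(city) counts every non-null row; COUNT(DISTINCT city) counts each city once.
total_count, distinct_count = conn.execute(
    "SELECT COUNT(city), COUNT(DISTINCT city) FROM visits"
).fetchone()
print(total_count, distinct_count)
```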

CRAN (Comprehensive R Archive Network) (R): An online archive with R packages, source code, manuals, and documentation

CREATE TABLE: A SQL clause that adds a temporary table to a database that can be used by multiple people

Cross-field validation: A process that ensures certain conditions for multiple data fields are satisfied

CSS (Cascading Style Sheets): A style sheet language used for web page design that controls graphic elements and page presentation

CSV (comma-separated values) file: A delimited text file that uses a comma to separate values
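
As a small sketch, Python's built-in csv module can split comma-delimited text into rows and values. The sample text below is invented for the example.

```python
import csv
import io

# Invented sample data: one header row and two records, separated by commas.
csv_text = "name,score\nAna,90\nBen,85\n"

# csv.reader uses the comma as its default delimiter and yields one list per line.
csv_rows = list(csv.reader(io.StringIO(csv_text)))
print(csv_rows)
```

Every value is read as text, so numeric columns such as score would still need converting before analysis.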

Currency: The aspect of data ethics that presumes individuals should be aware of financial transactions resulting from the use of their personal data and the scale of those transactions

D

Dashboard: A tool that monitors live, incoming data

Data: A collection of facts

Data aggregation: The process of gathering data from multiple sources and combining it into a single, summarized collection

Data analysis: The collection, transformation, and organization of data in order to draw conclusions, make predictions, and drive informed decision-making

Data analysis process: The six phases of ask, prepare, process, analyze, share, and act whose purpose is to gain insights that drive informed decision-making

Data analyst: Someone who collects, transforms, and organizes data in order to draw conclusions, make predictions, and drive informed decision-making

Data analytics: The science of data

Data anonymization: The process of protecting people's private or sensitive data by eliminating identifying information

Data bias: When a preference in favor of or against a person, group of people, or thing systematically skews data analysis results in a certain direction

Data blending: A Tableau method that combines data from multiple data sources

Data composition: The process of combining the individual parts in a visualization and displaying them together as a whole

Data constraints: The criteria that determine whether a piece of data is clean and valid

Data design: How information is organized

Data-driven decision-making: Using facts to guide business strategy

Data ecosystem: The various elements that interact with one another in order to produce, manage, store, organize, analyze, and share data

Data element: A piece of information in a dataset

Data engineer: A professional who transforms data into a useful format for analysis and gives it a reliable infrastructure

Data ethics: Well-founded standards of right and wrong that dictate how data is collected, shared, and used

Data frame: A collection of columns containing data, similar to a spreadsheet or SQL table

Data governance: A process for ensuring the formal management of a company’s data assets

Data-inspired decision-making: Exploring different data sources to find out what they have in common

Data integrity: The accuracy, completeness, consistency, and trustworthiness of data throughout its life cycle

Data interoperability: The ability to integrate data from multiple sources and a key factor leading to the successful use of open data among companies and governments

Data life cycle: The sequence of stages that data experiences, which include plan, capture, manage, analyze, archive, and destroy

Data manipulation: The process of changing data to make it more organized and easier to read

Data mapping: The process of matching fields from one data source to another

Data merging: The process of combining two or more datasets into a single dataset

Data model: A tool for organizing data elements and how they relate to one another

Data privacy: Preserving a data subject’s information any time a data transaction occurs

Data range: Numerical values that fall between predefined maximum and minimum values

Data replication: The process of storing data in multiple locations

Data science: A field of study that uses raw data to create new ways of modeling and understanding the unknown

Data security: Protecting data from unauthorized access or corruption by adopting safety measures

Data storytelling: Communicating the meaning of a dataset with visuals and a narrative that are customized for an audience

Data strategy: The management of the people, processes, and tools used in data analysis

Data structure: A format for organizing and storing data

Data transfer: The process of copying data from a storage device to computer memory or from one computer to another

Data type: An attribute that describes a piece of data based on its values, its programming language, or the operations it can perform

Data validation: A tool for checking the accuracy and quality of data

Data validation process: The process of checking and rechecking the quality of data so that it is complete, accurate, secure and consistent

Data visualization: The graphical representation of data

Data warehousing specialist: A professional who develops processes and procedures to effectively store and organize data

Database: A collection of data stored in a computer system

Dataset: A collection of data that can be manipulated or analyzed as one unit

DATEDIF: A spreadsheet function that calculates the number of days, months, or years between two dates
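
DATEDIF is a spreadsheet function, but the same calculation can be sketched in Python with the built-in datetime module. The dates below are arbitrary examples, and the month count assumes both dates fall on the same day of the month.

```python
from datetime import date

# Arbitrary example dates (2024 is a leap year, so February has 29 days).
start_date = date(2024, 1, 1)
end_date = date(2024, 3, 1)

# Subtracting two dates gives a timedelta; .days is the day count between them.
days_between = (end_date - start_date).days

# Whole months between the dates, valid here because both fall on day 1.
months_between = (end_date.year - start_date.year) * 12 + (
    end_date.month - start_date.month
)
print(days_between, months_between)
```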

Decision tree: A tool that helps analysts make decisions about critical features of a visualization

Delimiter: A character that indicates the beginning or end of a data item

Density map: A data visualization that represents concentrations, with color representing the number or frequency of data points in a given area on a map

Descriptive metadata: Metadata that describes a piece of data and can be used to identify it at a later point in time

Design thinking: A process used to solve complex problems in a user-centric way

Digital photo: An electronic or computer-based image usually in BMP or JPG format

Dirty data: Data that is incomplete, incorrect, or irrelevant to the problem to be solved

Discrete data: Data that is counted and has a limited number of values

DISTINCT: A keyword that is added to a SQL SELECT statement to retrieve only non-duplicate entries

Distribution graph: A data visualization that displays the frequency of various outcomes in a sample

Diverging color palette: A color theme that displays two ranges of data values using two different hues, with color intensity representing the magnitude of the values

Donut chart: A data visualization where segments of a ring represent data values adding up to a whole

dplyr (R): An R package in Tidyverse that offers a consistent set of functions to complete common data-manipulation tasks

DROP TABLE: A SQL clause that removes a temporary table from a database

Duplicate data: Any record that inadvertently shares data with another record

Dynamic visualizations: Data visualizations that are interactive or change over time

E

Elevator pitch: A short statement describing an idea or concept

Emphasis: The design principle of arranging visual elements to focus the audience’s attention on important information in a data visualization

Engagement: Capturing and holding someone’s interest and attention during a data presentation

Equation: A calculation that involves addition, subtraction, multiplication, or division (also called a math expression)

Estimated response rate: The average number of people who typically complete a survey

Ethics: Well-founded standards of right and wrong that prescribe what humans ought to do, usually in terms of rights, obligations, benefits to society, fairness, or specific virtues

External data: Data that lives, and is generated, outside of an organization

F

Facets (R): A series of functions that splits data into subsets in a matrix of panels

Factor (R): An object that stores categorical data where the data values are limited and usually based on a finite group, such as country or year

Fairness: A quality of data analysis that does not create or reinforce bias

Field: A single piece of information from a row or column of a spreadsheet; in a data table, typically a column in the table

Field length: A tool for determining how many characters can be keyed into a spreadsheet field

Fill handle: A box in the lower-right-hand corner of a selected spreadsheet cell that can be dragged through neighboring cells in order to continue an instruction

Filled map: A data visualization that colors areas in a map based on measurements or dimensions

Filtering: The process of showing only the data that meets specified criteria while hiding the rest

Find and replace: A tool that finds a specified search term and replaces it with something else

First-party data: Data collected by an individual or group using their own resources

Float: A number that contains a decimal

Foreign key: A field within a database table that is a primary key in another table (Refer to primary key)

Formula: A set of instructions used to perform a calculation using the data in a spreadsheet

Framework: The context a presentation needs to create logical connections that tie back to the business task and metrics

FROM: The section of a query that indicates from which table(s) to extract the data

Function: A preset command that automatically performs a specific process or task using the data in a spreadsheet

Function (R): A body of reusable code for performing specific tasks in R

FWF (fixed-width file): A text file with a specific format, which enables the saving of textual data in an organized fashion

G

GAM (generalized additive model) smoothing (R): A process for smoothing plots with a large number of points

Gantt chart: A data visualization that displays the duration of events or activities on a timeline

Gap analysis: A method for examining and evaluating the current state of a process in order to identify opportunities for improvement in the future

Gauge chart: A data visualization that shows a single result within a progressive range of values

General Data Protection Regulation of the European Union (GDPR): A regulation of the European Union created to help protect people and their data

Geolocation: The geographical location of a person or device by means of digital information

Geom (R): The geometric object used to represent data

ggplot2 (R): An R package in Tidyverse that creates a variety of data visualizations by applying different visual properties to the data variables in R

Good data source: A data source that is reliable, original, comprehensive, current, and cited (ROCCC)

GROUP BY: A SQL clause that groups rows that have the same values from a table into summary rows

H

HAVING: A SQL clause that filters the grouped results of a query rather than the rows of the underlying table, and that is typically used with aggregate functions
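
The sketch below shows HAVING filtering grouped results, using Python's built-in sqlite3 module. The sales table, its values, and the 100 threshold are assumptions made up for the example.

```python
import sqlite3

# Hypothetical table, values, and threshold, used only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("east", 80), ("east", 50), ("west", 40)],
)

# GROUP BY builds one summary row per region; HAVING then keeps only the
# groups whose aggregated total passes the condition.
having_rows = conn.execute(
    """
    SELECT region, SUM(amount) AS total_amount
    FROM sales
    GROUP BY region
    HAVING SUM(amount) > 100
    """
).fetchall()
print(having_rows)
```

A WHERE clause could not express this condition, because WHERE is evaluated on individual rows before the groups and their totals exist.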

head() (R): An R function that returns a preview of the column names and the first few rows of a dataset

Header: The first row in a spreadsheet that labels the type of data in each column

Headline: Text at the top of a visualization that communicates the data being presented

Heat map: A data visualization that uses color contrast to compare categories in a dataset

Highlight table: A data visualization that uses conditional formatting and color on a table

Histogram: A data visualization that shows how often data values fall into certain ranges

HTML (Hypertext Markup Language): The set of markup symbols or codes used to create a webpage

HTML5: A version of the HTML markup language that provides structure for web pages and connects to hosting platforms

Hypothesis: A theory that one might try to prove or disprove with data

Hypothesis testing: A process to determine if a survey or experiment has meaningful results

I

IDE (Integrated Development Environment): A software application that brings together all the tools a data analyst may want to use in a single place

Incomplete data: Data that is missing important fields

Inconsistent data: Data that uses different formats to represent the same thing

Incorrect/inaccurate data: Data that is complete but inaccurate

Inline code: Code that can be inserted directly into the text of an R Markdown file

INNER JOIN: A SQL clause that returns records with matching values in both tables

Inner query: A SQL subquery that is inside of another SQL statement

Internal data: Data that lives within a company’s own systems

Interpretation bias: The tendency to interpret ambiguous situations in a positive or negative way

J

Java: A programming language widely used to create enterprise web applications that can run on multiple clients

JOIN: A SQL clause that combines rows from two or more tables based on a related column

Jupyter Notebook: An open-source web application used to create and share documents that contain live code, equations, visualizations and narrative text

K

L

Label: Text in a visualization that identifies a value or describes a scale

Labels and annotations (R): A group of R functions used for customizing a plot

Leading question: A question that steers people toward a certain response

LEFT: A function that returns a set number of characters from the left side of a text string

LEFT JOIN: A SQL clause that returns all the records from the left table and only the matching records from the right table
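
To make the difference between the join types concrete, the sketch below runs an INNER JOIN and a LEFT JOIN on the same two tables with Python's built-in sqlite3 module. The customers and orders tables and their contents are hypothetical.

```python
import sqlite3

# Hypothetical tables and values: Ben has no matching order.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (customer_id INTEGER, name TEXT)")
conn.execute("CREATE TABLE orders (customer_id INTEGER, item TEXT)")
conn.executemany("INSERT INTO customers VALUES (?, ?)", [(1, "Ana"), (2, "Ben")])
conn.execute("INSERT INTO orders VALUES (1, 'book')")

query_template = """
    SELECT customers.name, orders.item
    FROM customers
    {join_type} orders ON customers.customer_id = orders.customer_id
    ORDER BY customers.customer_id
"""

# INNER JOIN keeps only customers that have a matching order.
inner_rows = conn.execute(query_template.format(join_type="INNER JOIN")).fetchall()

# LEFT JOIN keeps every customer and fills missing order columns with NULL (None).
left_rows = conn.execute(query_template.format(join_type="LEFT JOIN")).fetchall()
print(inner_rows)
print(left_rows)
```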

Legend: A tool that identifies the meaning of various elements in a data visualization

LEN: A function that returns the length of a text string by counting the number of characters it contains

Length: The number of characters in a text string

Library: A directory containing all of a data analyst’s installed packages

LIMIT: A SQL clause that specifies the maximum number of records returned in a query

Line graph: A data visualization that uses one or more lines to display shifts or changes in data over time

List: A vector whose elements can be of any type

Live data: Data that is automatically updated

Loess smoothing (R): A process used for smoothing plots with fewer than 1,000 points

Log file: A computer-generated file that records events from operating systems and other software programs

Logical operator: An operator that returns a logical data type

Long data: A dataset in which each row is one time point per subject, so each subject has data in multiple rows

M

Mandatory: A data value that cannot be left blank or empty

Map: A data visualization that organizes data geographically

Mapping (R): The process of matching up a specific variable in a dataset with a specific aesthetic

Margin of error: The maximum amount that sample results are expected to differ from those of the actual population
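
The glossary does not give a formula for margin of error. As a hedged sketch, the example below uses the common textbook formula for a sample proportion, z * sqrt(p * (1 - p) / n). The function name margin_of_error, the z-score of 1.96 (roughly a 95% confidence level), the proportion of 0.5, and the sample size of 1,000 are all assumptions chosen for illustration.

```python
import math


def margin_of_error(z_score, proportion, sample_size):
    """Margin of error for a sample proportion (textbook formula, assumed here)."""
    return z_score * math.sqrt(proportion * (1 - proportion) / sample_size)


# Assumed inputs: z = 1.96 (about 95% confidence), p = 0.5, n = 1000.
example_margin = margin_of_error(1.96, 0.5, 1000)
print(round(example_margin, 3))
```

Under these assumptions the margin is about three percentage points, and it shrinks as the sample size grows.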

Markdown (R): A syntax for formatting plain text files

Mark: A visual object in a data visualization such as a point, line, or shape

MATCH: A spreadsheet function used to locate the position of a specific lookup value

Math expression: A calculation that involves addition, subtraction, multiplication, or division (also called an equation)

Math function: A function that is used as part of a mathematical formula

Matrix: A two-dimensional collection of data elements with rows and columns

MAX: A spreadsheet function that returns the largest numeric value from a range of cells

MAXIFS: A spreadsheet function that returns the maximum value from a given range that meets a specified condition

McCandless Method: A method for presenting data visualizations that moves from general to specific information

Measurable question: A question whose answers can be quantified and assessed

Mental model: A data analyst’s thought process and approach to a problem

Mentor: Someone who shares knowledge, skills, and experience to help another grow both professionally and personally

Merger: An agreement that unites two organizations into a single new one

Metadata: Data about data

Metadata repository: A database created to store metadata

Metric: A single, quantifiable type of data that is used for measurement

Metric goal: A measurable goal set by a company and evaluated using metrics

MID: A function that returns a segment from the middle of a text string

MIN: A spreadsheet function that returns the smallest numeric value from a range of cells

MINIFS: A spreadsheet function that returns the minimum value from a given range that meets a specified condition

Modulo: An operator (%) that returns the remainder when one number is divided by another
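
Python uses the same % symbol for modulo, so a quick sketch is straightforward. The numbers are arbitrary.

```python
# 17 divided by 5 is 3 with 2 left over; modulo returns that remainder.
remainder = 17 % 5

# divmod returns the quotient and the remainder together.
quotient, same_remainder = divmod(17, 5)
print(remainder, quotient, same_remainder)
```

A remainder of 0 means the first number divides evenly by the second, which is a common way to test for even numbers or every nth row.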

Movement: The design principle of arranging visual elements to guide the audience’s eyes from one part of a data visualization to another

mutate() (R): An R function that makes changes to a data frame, such as separating and merging columns or creating new variables

N

Naming conventions: Consistent guidelines that describe the content, creation date, and version of a file in its name
文档约定:在文件名中描述文件内容、创建日期和版本的一致准则

Narrative: (Refer to Story)
故事:(参考故事)

Nested: Code that performs a particular function and is contained within code that performs a broader function
嵌套:执行特定功能的代码,包含在执行更广泛功能的代码中

Nested function: A function that is completely contained within another function
嵌套函数:完全包含在另一个函数中的函数

Networking: Building relationships by meeting people both in person and online
人际关系:通过面对面和在线接触建立关系

Nominal data: A type of qualitative data that is categorized without a set order
名义数据:一种没有固定顺序的定性数据

Normalized database: A database in which only related data is stored in each table
规范化数据库:每个表中只存储相关数据的数据库

Notebook: An interactive, editable programming environment for creating data reports and showcasing data skills
Notebook:一个交互式的可编辑编程环境,用于创建数据报告和展示数据技能

Null: An indication that a value does not exist in a dataset
不存在:表示数据集中不存在某个值

O

Observation: The attributes that describe a piece of data contained in a row of a table
Observation:描述表的一行中包含的一段数据的属性

Observer bias: The tendency for different people to observe things differently (also called experimenter bias)
观察者偏见:不同的人观察事物的倾向不同(也称为实验者偏见)。

Open data: Data that is available to the public
开放数据:向公众提供的数据

Open-source: Code that is freely available and may be modified and shared by the people who use it
开源:代码是免费提供的,可以修改和共享的人谁使用它

Openness: The aspect of data ethics that promotes the free access, usage, and sharing of data
开放性:数据伦理的一个方面,促进数据的自由访问、使用和共享。

Operator: A symbol that names the operation or calculation to be performed
运算符:命名要执行的操作或计算的符号

ORDER BY: A SQL clause that sorts results returned in a query
ORDERBY:一个SQL子句,用于对查询中返回的结果进行排序
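
To make the ORDER BY entry concrete, here is a small sketch that runs a sorted query against an in-memory SQLite database through Python's `sqlite3` module. The `orders` table and its contents are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer TEXT, total REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("Ana", 30.0), ("Ben", 12.5), ("Cy", 48.0)],
)

# ORDER BY sorts the returned rows; DESC puts the largest total first.
rows = conn.execute(
    "SELECT customer, total FROM orders ORDER BY total DESC"
).fetchall()
print(rows)
```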

Order of operations: Using parentheses to group together spreadsheet values in order to clarify the order in which operations should be performed
操作顺序:使用括号将电子表格值组合在一起,以明确执行操作的顺序

Ordinal data: Qualitative data with a set order or scale
有序数据:具有一定顺序或尺度的定性数据

Outdated data: Any data that has been superseded by newer and more accurate information
过时的数据:任何已被更新和更准确的信息所取代的数据

OUTER JOIN: A SQL function that combines RIGHT JOIN and LEFT JOIN to return all records from both tables, pairing them where they match
OUTER JOIN:一个SQL函数,它结合了RIGHT和LEFT JOIN以返回两个表中的所有匹配记录
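
The behavior of a full outer join can be sketched without a database. The example below uses plain Python dictionaries keyed by a shared ID, so it does not depend on any particular SQL engine. The two "tables" are hypothetical, and `None` stands in for SQL's NULL where one side has no matching record.

```python
# Hypothetical tables keyed by employee_id
employees = {1: "Ana", 2: "Ben", 3: "Cy"}               # id -> name
departments = {2: "Sales", 3: "Finance", 4: "Legal"}    # id -> department

# Take every key that appears in either table, then pair up the values.
all_ids = sorted(employees.keys() | departments.keys())
joined = [(i, employees.get(i), departments.get(i)) for i in all_ids]

for row in joined:
    print(row)
```

IDs 1 and 4 appear in only one table, so their rows carry `None` on the side with no match.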

Outer query: A SQL statement containing a subquery
外部查询:包含子查询的SQL语句

Ownership: The aspect of data ethics that presumes individuals own the raw data they provide and have primary control over its usage, processing, and sharing
所有权:数据伦理的一个方面,假定个人拥有他们提供的原始数据,并对其使用,处理和共享拥有主要控制权

P

Package (R): A unit of reproducible R code
Package(R):可复制的R代码的单位

Packed bubble chart: A data visualization that displays data in clustered circles
压缩气泡图:一种数据可视化,以聚集的圆圈显示数据

Pattern: The design principle of using similar visual elements to demonstrate trends and relationships in a data visualization
模式:在数据可视化中使用相似的视觉元素来展示趋势和关系的设计原则

PHP (Hypertext Preprocessor): A programming language for web application development
PHP(Hypertext Preprocessor):一种用于Web应用程序开发的编程语言

Pie chart: A data visualization that uses segments of a circle to represent the proportions of each data category compared to the whole
饼图:一种数据可视化,使用圆的线段来表示每个数据类别与整体相比的比例

Pipe (R): A tool in R for expressing a sequence of multiple operations, represented with “%>%”
Pipe(R):R语言中的一种工具,用于表示一系列多个操作,用“%> %”表示。

Pivot chart: A chart created from the fields in a pivot table
数据透视图:从数据透视表中的字段创建的图表

Pivot table: A data summarization tool used to sort, reorganize, group, count, total, or average data
数据透视表:用于对数据进行排序、重新组织、分组、计数、总计或平均的数据汇总工具

Pixel: In digital imaging, a small area of illumination on a display screen that, when combined with other adjacent areas, forms a digital image
像素:在数码影像中,显示屏上的一个小的照明区,当它与其他邻近的区域结合后,便形成数码影像

Population: In data analytics, all possible data values in a dataset
总体:在数据分析中,数据集中所有可能的数据值

Portfolio: A collection of materials that can be shared with potential employers
Portfolio:可以与潜在雇主分享的材料集合

Pre-attentive attributes: The elements of a data visualization that an audience recognizes automatically without conscious effort
预先注意的属性:受众无需有意识的努力即可自动识别的数据可视化元素

Primary key: An identifier in a database that references a column in which each value is unique (Refer to foreign key)
主键:数据库中的一个标识符,它引用一个列,其中每个值都是唯一的(参考外键)

Problem domain: The area of analysis that encompasses every activity affecting or affected by a problem
问题域:分析的领域,包括影响问题或受问题影响的每一项活动

Problem types: The various problems that data analysts encounter, including categorizing things, discovering connections, finding patterns, identifying themes, making predictions, and spotting something unusual
问题类型:数据分析师遇到的各种问题,包括对事物进行分类、发现联系、查找模式、识别主题、进行预测以及发现异常

Profit margin: A percentage that indicates how many cents of profit has been generated for each dollar of sale
利润率:一个百分比,表示每一美元的销售产生多少美分的利润
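
One common way to compute a profit margin is profit divided by revenue, expressed as a percentage. The sketch below assumes that formulation; the function name and the figures are illustrative only.

```python
def profit_margin(revenue, cost):
    """Profit as a percentage of revenue (cents of profit per dollar of sales)."""
    profit = revenue - cost
    return profit / revenue * 100

margin = profit_margin(revenue=200.0, cost=150.0)
print(margin)  # 50 of profit on 200 of sales
```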

Programming language: A system of words and symbols used to write instructions that computers follow
程式语言:一套文字和符号系统,用来写指令,让电脑执行

Proportion: The design principle of using the relative size and arrangement of visual elements to demonstrate information in a data visualization
比例:使用视觉元素的相对大小和排列来展示数据可视化中的信息的设计原则

Python: A general-purpose programming language
Python:一种通用编程语言

Q

Qualitative data: A subjective and explanatory measure of a quality or characteristic
定性数据:对质量或特征的主观和解释性测量

Quantitative data: A specific and objective measure, such as a number, quantity, or range
定量数据:一个具体和客观的衡量标准,如数字,数量或范围

Query: A request for data or information from a database
查询:从数据库中请求数据或信息

Query language: A computer programming language used to communicate with a database
查询语言:用来与资料库通讯的电脑程式语言

R

R: A programming language used for statistical analysis, visualization, and other data analysis
R:用于统计分析、可视化和其他数据分析的编程语言

R Markdown: A file format for making dynamic documents with R
R Markdown:一种用R语言制作动态文档的文件格式

R Notebook: A document for running code and displaying the graphs and charts that visualize the code
R Notebook:用于运行代码并显示可视化代码的图形和图表的文档

Random sampling: A way of selecting a sample from a population so that every possible type of the sample has an equal chance of being chosen
随机抽样:从总体中选择样本的一种方法,使每种可能的样本类型都有同等的机会被选中。

Range: A collection of two or more cells in a spreadsheet
范围:电子表格中两个或多个单元格的集合

Ranking: A system to position values of a dataset within a scale of achievement or status
排名:在成就或地位范围内定位数据集值的系统

readr (R): An R package in Tidyverse used for importing data
readr(R):Tidyverse中用于导入数据的R包

Record: A collection of related data in a data table, usually synonymous with row
记录:数据表中相关数据的集合,通常与行同义

Redundancy: When the same piece of data is stored in two or more places
冗余:当相同的数据存储在两个或多个位置时

Reframing: The process of restating a problem or challenge, then redirecting it toward a potential resolution
重新构建:重新陈述问题或挑战,然后将其转向潜在解决方案的过程

Regular expression (RegEx): A rule that says the values in a table must match a prescribed pattern
正则表达式(RegEx):一种规则,表明表中的值必须匹配指定的模式
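
As a sketch of how a regular expression checks values against a prescribed pattern, the example below uses Python's `re` module. The order-ID format (two capital letters, a hyphen, four digits) is a made-up rule for illustration.

```python
import re

# Hypothetical rule: two capital letters, a hyphen, then exactly four digits.
pattern = re.compile(r"[A-Z]{2}-\d{4}")

values = ["AB-1234", "ab-1234", "XY-12"]

# fullmatch requires the whole value to fit the pattern, not just part of it.
valid = [v for v in values if pattern.fullmatch(v)]
print(valid)
```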

Relational database: A database that contains a series of tables that can be connected to form relationships
关系数据库:包含一系列表的数据库,这些表可以连接起来形成关系

Relational operator: An operator used to compare values, also known as a comparator
关系运算符:用于比较值的运算符,也称为比较器

Relativity: The process of considering observations in relation or proportion to something else
相对性:将观察结果与其他事物相关联或成比例地考虑的过程。

Relevant question: A question that has significance to the problem to be solved
相关问题:对要解决的问题有意义的问题

Remove duplicates: A spreadsheet tool that automatically searches for and eliminates duplicate entries from a spreadsheet
删除重复项:一个电子表格工具,可以自动搜索和删除电子表格中的重复条目

Repetition: The design principle of repeating visual elements to demonstrate meaning in a data visualization
重复:重复视觉元素以展示数据可视化中的含义的设计原则

Report: A static collection of data periodically given to stakeholders
报告:定期提供给利益相关者的静态数据集合

Return on investment (ROI): A formula that uses the metrics of investment and profit to evaluate the success of an investment
投资回报率(ROI):使用投资和利润的指标来评估投资成功与否的公式
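
A commonly used form of the ROI formula divides net profit by the cost of the investment. The sketch below assumes that form and reports the result as a percentage; the function name and numbers are illustrative.

```python
def roi(net_profit, investment_cost):
    """Return on investment as a percentage of the amount invested."""
    return net_profit / investment_cost * 100

result = roi(net_profit=500, investment_cost=2000)
print(result)
```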

Revenue: The total amount of income generated by the sale of goods or services
收入:商品或服务的销售所产生的收入总额

Rhythm: The design principle of creating movement and flow in a data visualization to engage an audience
节奏:在数据可视化中创造运动和流动的设计原则,以吸引观众

RIGHT: A function that returns a set number of characters from the right side of a text string
RIGHT:返回文本字符串右侧的一组字符的函数

RIGHT JOIN: A SQL function that will return all records from the right table and only the matching records from the left
RightJOIN:一个SQL函数,将返回右表中的所有记录,只返回左表中的匹配记录

Root cause: The reason why a problem occurs
根本原因:问题发生的原因

ROUND: A SQL function that returns a number rounded to a certain number of decimal places
ROUND:一个SQL函数,返回舍入到一定小数位数的数字。
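
A minimal sketch of ROUND, run against an in-memory SQLite database from Python. SQLite is an assumed choice of engine; other SQL dialects offer a similar function.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Round 3.14159 to two decimal places.
rounded = conn.execute("SELECT ROUND(3.14159, 2)").fetchone()[0]
print(rounded)
```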

Ruby: An object-oriented programming language for web application development
Ruby:一种用于Web应用程序开发的面向对象编程语言

S

Sample: In data analytics, a segment of a population that is representative of the entire population
样本:在数据分析中,代表整个人口的一部分人口

Sampling bias: Overrepresenting or underrepresenting certain members of a population as a result of working with a sample that is not representative of the population as a whole
抽样偏差:由于使用的样本不能代表整个人口而导致的人口中某些成员的代表性过高或过低

Scatterplot: A data visualization that represents relationships between different variables with individual data points without a connecting line
散点图:一种数据可视化,用单个数据点表示不同变量之间的关系,而不使用连接线

Schema: A way of describing how something, such as data, is organized
模式:一种描述事物(如数据)如何组织的方式

Scope of work (SOW): An agreed-upon outline of the tasks to be performed during a project
工作范围(SOW):在项目期间要执行的任务的商定大纲

Second-party data: Data collected by a group directly from its audience and then sold
第二方数据:由一个团体直接从其受众那里收集然后出售的数据

SELECT: The section of a query that indicates from which column(s) to extract the data
SELECT:查询的一部分,指示要从哪些列提取数据

SELECT INTO: A SQL clause that copies data from one table into a temporary table without adding the new table to the database
SELECT INTO:一个SQL子句,它将数据从一个表复制到一个临时表中,而不将新表添加到数据库中

Shiny (R): An R package used to build interactive web apps with R code
Shiny(R):一个R包,用于使用R代码构建交互式Web应用程序

Small data: Small, specific data points typically involving a short period of time, which are useful for making day-to-day decisions
小数据:小的、特定的数据点,通常涉及短时间段,对日常决策很有用。

SMART methodology: A tool for determining a question’s effectiveness based on whether it is specific, measurable, action-oriented, relevant, and time-bound
SMART方法:根据问题是否具体、可衡量、面向行动、相关和有时限来确定问题有效性的工具

Smoothing (R): A process used to make data visualizations in R clearer and more readable
平滑(R):用于使R中的数据可视化更清晰,更具可读性的过程

Smoothing line (R): A line on a data visualization that uses smoothing to represent a trend
平滑线(R):数据可视化中使用平滑表示趋势的线

Social media: Websites and applications through which users create and share content or participate in social networking
社交媒体:网站和应用程序,用户通过它们创建和共享内容或参与社交网络

Soft skills: Nontechnical traits and behaviors that relate to how people work
软技能:与人们如何工作有关的非技术特征和行为

Sort range: A spreadsheet menu function that sorts a specified range and preserves the cells outside the range
排序范围:一个电子表格菜单功能,它对指定的范围进行排序,并保留范围外的单元格

Sort sheet: A spreadsheet menu function that sorts all data by the ranking of a specific sorted column and keeps data together across rows
排序表:一个电子表格菜单功能,它根据特定排序列的排名对所有数据进行排序,并将数据跨行保存在一起

Sorting: The process of arranging data into a meaningful order to make it easier to understand, analyze, and visualize
排序:将数据排列成有意义的顺序,使其更易于理解、分析和可视化的过程

Specific question: A question that is simple, significant, and focused on a single topic or a few closely related ideas
具体问题:简单、有意义的问题,集中在一个主题或几个密切相关的想法上。

SPLIT: A spreadsheet function that divides text around a specified character and puts each fragment into a new, separate cell
SPLIT:一个电子表格函数,它将文本围绕指定字符分割,并将每个片段放入一个新的单独单元格中
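
The same idea as the SPLIT spreadsheet function can be sketched with Python's built-in `str.split`, which divides text around a delimiter. Here a list element plays the role of each new cell; the sample text is made up.

```python
full_entry = "Ana,Ben,Cy"

# Divide the text around the comma; each fragment becomes its own item.
fragments = full_entry.split(",")
print(fragments)
```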

Sponsor: A professional advocate who is committed to moving forward the career of another
赞助人:致力于推动他人职业发展的专业倡导者

Spotlighting: Scanning through data to quickly identify the most important insights
聚焦:扫描数据以快速识别最重要的见解

Spreadsheet: A digital worksheet
电子工作表:数字工作表

SQL: (Refer to Structured Query Language)
SQL:(参见结构化查询语言)

Stakeholders: People who invest time and resources into a project and are interested in its outcome
利益相关者:在项目中投入时间和资源并对其结果感兴趣的人

Static data: Data that doesn’t change once it has been recorded
静态数据:一旦记录下来就不会改变的数据

Static visualization: A data visualization that does not change over time unless it is edited
静态可视化:除非经过编辑,否则不会随时间变化的数据可视化

Statistical power: The probability that a test of significance will recognize an effect that is present
统计功效:显著性检验识别存在效应的概率

Statistical significance: The probability that sample results are not due to random chance
统计学显著性:样本结果不是随机产生的概率

Statistics: The study of how to collect, analyze, summarize, and present data
统计学:研究如何收集、分析、总结和呈现数据的学科

Story: The narrative of a data presentation that makes it meaningful and interesting
故事:数据呈现的叙述,使其有意义和有趣

String data type: A sequence of characters and punctuation that contains textual information (also called text data type)
字符串数据类型:包含文本信息的字符和标点符号序列(也称为文本数据类型)

Structural metadata: Metadata that indicates how a piece of data is organized and whether it is part of one or more than one data collection
结构元数据:指示数据块如何组织以及它是一个还是多个数据集合的一部分的元数据

Structured data: Data organized in a certain format such as rows and columns
结构化数据:以特定格式(如行和列)组织的数据

Structured Query Language: A computer programming language used to communicate with a database
结构化查询语言:用于与数据库通信的计算机编程语言

Structured thinking: The process of recognizing the current problem or situation, organizing available information, revealing gaps and opportunities, and identifying options
结构化思维:认识当前问题或情况、组织可用信息、揭示差距和机会以及确定选项的过程

Subquery: A SQL query that is nested inside a larger query
子查询:嵌套在较大查询中的SQL查询

SUBSTR: A SQL function that extracts a substring from a string variable
SUBSTR:从字符串变量中提取子字符串的SQL函数
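
A small sketch of SUBSTR using SQLite through Python's `sqlite3` module (an assumed engine). In SQLite, the arguments are the string, a starting position counted from 1, and the number of characters to take. The date string is an arbitrary example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Take 4 characters starting at position 1 (SQL positions start at 1, not 0).
year = conn.execute("SELECT SUBSTR('2024-09-28', 1, 4)").fetchone()[0]
print(year)
```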

Substring: A subset of a text string
子字符串:文本字符串的子集

Subtitle: Text that supports a headline by adding context and description
副标题:通过添加上下文和描述来支持标题的文本

SUM: A spreadsheet function that adds the values of a selected range of cells
SUM:一个电子表格函数,用于将选定单元格区域的值相加

SUMIF: A spreadsheet function that adds numeric data based on one condition
SUMIF:一个电子表格函数,根据一个条件添加数字数据

Summary table: A table used to summarize statistical information about data
汇总表:用于汇总有关数据的统计信息的表

SUMPRODUCT: A function that multiplies arrays and returns the sum of those products
SUMPRODUCT:一个函数,将数组相乘并返回这些乘积的总和
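
The calculation behind SUMPRODUCT can be sketched in Python: multiply the arrays position by position, then add up the products. The `prices` and `quantities` lists are hypothetical.

```python
prices = [2.5, 4.0, 1.0]
quantities = [4, 2, 10]

# Multiply matching positions, then sum: 2.5*4 + 4.0*2 + 1.0*10
total = sum(p * q for p, q in zip(prices, quantities))
print(total)
```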

Swift: A programming language for macOS, iOS, watchOS, and tvOS
Swift:适用于macOS、iOS、watchOS和tvOS的编程语言

Symbol map: A data visualization that displays a mark over a given longitude and latitude
符号地图:显示给定经度和纬度上的标记的数据可视化

Syntax: The predetermined structure of a language that includes all required words, symbols, and punctuation, as well as their proper placement
语法:语言的预定结构,包括所有必需的单词、符号和标点符号,以及它们的正确位置

T

Tableau: A business intelligence and analytics platform that helps people visualize, understand, and make decisions with data
Tableau:一个商业智能和分析平台,可帮助人们可视化、理解数据并利用数据制定决策

Technical mindset: The ability to break things down into smaller steps or pieces and work with them in an orderly and logical way
技术思维:能够将事情分解成更小的步骤或部分,并以有序和逻辑的方式与他们一起工作

Temporary table: A database table that is created and exists temporarily on a database server
临时表:在数据库服务器上临时创建并存在的数据库表

Text data type: A sequence of characters and punctuation that contains textual information (also called string data type)
文本数据类型:包含文本信息的字符和标点符号序列(也称为字符串数据类型)

Text string: A group of characters within a cell, most often composed of letters
文本字符串:单元格内的一组字符,通常由字母组成

Third-party data: Data provided from outside sources that did not collect it directly
第三方数据:由外部来源提供的数据,但并非直接收集

Tibble (R): A streamlined variation of data frames
Tibble(R):数据帧的精简变体

Tidy data (R): A way of standardizing the organization of data within R
Tidy data(R):一种标准化R中数据组织的方法

tidyr (R): An R package in Tidyverse used for data cleaning to make tidy data
tidyr(R):Tidyverse中的一个R包,用于数据清理,使数据整洁

Tidyverse (R): A system of packages in R with a common design philosophy for data manipulation, exploration, and visualization
Tidyverse(R):一个R语言的包系统,具有数据操作、探索和可视化的通用设计理念

Time-bound question: A question that specifies a timeframe to be studied
有时限的问题:指定研究时限的问题

Transaction transparency: The aspect of data ethics that presumes all data-processing activities and algorithms should be explainable and understood by the individual who provides the data
交易透明度:数据伦理的一个方面,即假设所有数据处理活动和算法都应该是可解释的,并被提供数据的个人所理解。

Transferable skills: Skills and qualities that can transfer from one job or industry to another
可转移技能:可以从一个工作或行业转移到另一个工作或行业的技能和素质

TRIM: A function that removes leading, trailing, and repeated spaces in data
TRIM:删除数据中的前导、尾随和重复空格的函数
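
A Python sketch that mimics what the TRIM entry describes: removing leading, trailing, and repeated spaces. This is an approximation rather than the spreadsheet function itself. It splits on any run of whitespace (including tabs and newlines) and rejoins the words with single spaces.

```python
messy = "  Data    analytics   glossary "

# split() with no argument drops leading/trailing whitespace and
# treats any run of whitespace as a single separator.
clean = " ".join(messy.split())
print(repr(clean))
```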

TSV (Tab-separated values file): A text file that stores a data table by separating columns of data with tabs
TSV(制表符分隔值文件):一种文本文件,通过用制表符分隔数据列来存储数据表
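
To show what a TSV file looks like in practice, the sketch below parses tab-separated text with Python's standard `csv` module. The data is invented, and an in-memory string stands in for a file on disk.

```python
import csv
import io

# Columns are separated by tab characters, rows by newlines.
tsv_text = "name\tscore\nAna\t90\nBen\t85\n"

reader = csv.reader(io.StringIO(tsv_text), delimiter="\t")
rows = list(reader)
print(rows)
```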

Turnover rate: The rate at which employees voluntarily leave a company
流动率:员工自愿离开公司的比率

Typecasting: Converting data from one type to another
类型转换:将数据从一种类型转换为另一种类型
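
A brief sketch of typecasting in Python, an assumed example language; SQL offers a comparable CAST function. The variable names are illustrative.

```python
quantity_text = "42"                  # a number stored as text

quantity = int(quantity_text)         # string -> integer
price = float("19.99")                # string -> floating-point number
label = str(quantity + 1)             # integer -> string

print(quantity, price, label)
```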

U

Unbiased sampling: When the sample of the population being measured is representative of the population as a whole
无偏抽样:当被测量的总体样本代表整个总体时

Underscores: Lines used to underline words and connect text characters
下划线:用来在字下划线和连接文字字符的线条

Unfair question: A question that makes assumptions or is difficult to answer honestly
不公平的问题:做出假设或难以诚实回答的问题

Unique: A value that can’t have a duplicate
唯一:不能有重复的值

United States Census Bureau: An agency in the U.S. Department of Commerce that serves as the nation’s leading provider of quality data about its people and economy
美国人口普查局:美国商务部的一个机构,是美国人口和经济质量数据的主要提供者。

Unity: The design principle of using visual elements that complement each other to create aesthetic appeal and clarity in a data visualization
统一性:在数据可视化中使用相互补充的视觉元素来创造美感和清晰度的设计原则

Unstructured data: Data that is not organized in any easily identifiable manner
非结构化数据:未以任何易于识别的方式组织的数据

V

Validity: The degree to which data conforms to constraints when it is input, collected, or created
有效性:数据在输入、收集或创建时符合约束的程度

VALUE: A spreadsheet function that converts a text string that represents a number to a numeric value
VALUE:一个电子表格函数,用于将表示数字的文本字符串转换为数值

Variable (R): A representation of a value in R that can be stored for later use
变量(R):R中的值的表示,可以存储以供以后使用

Variety: The design principle of using different kinds of visual elements in a data visualization to engage an audience
多样性:在数据可视化中使用不同类型的视觉元素来吸引受众的设计原则

Vector (R): A group of data elements of the same type stored in a one-dimensional sequence in R
向量(R):以一维序列存储在R中的一组相同类型的数据元素

Verification: A process to confirm that a data-cleaning effort was well executed and the resulting data is accurate and reliable
验证:确认数据清理工作得到良好执行,并且得到的数据准确可靠的过程

Video file: A collection of images, audio files, and other data usually encoded in a compressed format such as MP4, M4V, MOV, AVI, or FLV
视频文件:通常以压缩格式(如MP4、M4V、MOV、AVI或FLV)编码的图像、音频文件和其他数据的集合

Vignette (R): Documentation for an R package that describes the problem the package is designed to solve, explains how its functions can be used, and lists any dependencies on other packages
晕影(R):R软件包的文档,描述该软件包旨在解决的问题,解释如何使用其函数,并列出对其他软件包的任何依赖

Visual form: The appearance of a data visualization that gives it structure and aesthetic appeal
可视化形式:数据可视化的外观,使其具有结构和美感

Visualization: (Refer to Data visualization)
可视化:(参见数据可视化)

VLOOKUP: A spreadsheet function that vertically searches for a certain value in a column to return a corresponding piece of information

W

WHERE: The section of a query that specifies criteria that the requested data must meet

Wide data: A dataset in which every data subject has a single row with multiple columns to hold the values of various attributes of the subject

WITH: A SQL clause that creates a temporary table that can be queried multiple times
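
As a sketch of the WITH clause, the query below defines a temporary result named `station_avg` and then filters it, run against an in-memory SQLite database from Python. The `trips` table, its columns, and the threshold are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE trips (station TEXT, minutes INTEGER)")
conn.executemany(
    "INSERT INTO trips VALUES (?, ?)",
    [("A", 10), ("A", 30), ("B", 5), ("B", 7)],
)

query = """
WITH station_avg AS (
    SELECT station, AVG(minutes) AS avg_minutes
    FROM trips
    GROUP BY station
)
SELECT station, avg_minutes
FROM station_avg
WHERE avg_minutes > 10
"""
rows = conn.execute(query).fetchall()
print(rows)
```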

World Health Organization: An organization whose primary role is to direct and coordinate international health within the United Nations system

X

X-axis: The horizontal line of a graph usually placed at the bottom, which is often used to represent time scales and discrete categories

Y

Y-axis: The vertical line of a graph usually placed to the left, which is often used to represent frequencies and other numerical variables
Y轴:通常放置在图表左侧的垂直线,通常用于表示频率和其他数值变量

YAML: A language that translates data to improve readability
YAML:一种翻译数据以提高可读性的语言

Z