Digital Age Reshaping Social Science Research: A Review of “Bit by Bit: Social Research in the Digital Age”
Cheng Hao (程昊), School of Social Sciences
Introduction
The advent and development of digital technology are precipitating a significant paradigm shift in social science research. This technological leap has given researchers an unprecedented capacity for data collection and analysis, bringing both new prospects and new challenges to the social sciences. In this context, Matthew J. Salganik’s “Bit by Bit: Social Research in the Digital Age” has emerged as an essential contribution, offering perspectives, methodologies, and paradigms that are pivotal to the advancement of social scientific inquiry in the era of data-driven research (Salganik, 2019).
Matthew J. Salganik is a Professor of Sociology at Princeton University whose main interests are social networks and computational social science. His research has been published in high-impact journals, including Science, Proceedings of the National Academy of Sciences (PNAS), Sociological Methodology, and Journal of the American Statistical Association. The award-winning Bit by Bit, which has been translated into five languages, is one of the few systematic treatments of computational sociology written by a sociologist to date. Salganik’s doctoral supervisor was Duncan Watts, who co-developed the small-world network model, an important concept in social network analysis. Influenced by Watts, Salganik became interested in the impact of digital technology and computational methods on the social sciences. Through his famous “MusicLab” research (Salganik et al., 2006), he demonstrated that social science could establish a new research paradigm in the digital age.
Salganik (2019:XV) writes that “Changes in technology—specifically the transition from the analog age to the digital age—mean that we can now collect and analyze social data in new ways.” In the past, sociologists typically acquired knowledge about society through two primary methods. The first, field observation and interviewing, gave rise to qualitative research; the second, survey methodology, gave rise to quantitative research. Digital technologies such as the Internet, smartphones, and social media have introduced new opportunities for social science research, and these opportunities demand the modernization of the classic methods, not their replacement. We can now access vast numbers of digital footprints and analyze them in depth, whereas in the past such footprints were nearly impossible to record and use. At the same time, the development of computational techniques and interdisciplinary integration has given the social sciences a wealth of computational methods and analytical tools, making it possible to discover entirely new social facts in complex data sets.
However, big data are not perfect. Whatever characteristics big data may possess, at their core they remain data: traces of human social life (邱泽奇, 2018). What ethical and methodological frameworks should we employ in collecting and using these data? How should we reconcile traditional sociological surveys with the big data of the digital age? The digital age presents new opportunities but also introduces new risks, which may call for in-depth discussion within the sociology of technology and the philosophy of technology. Salganik considers both the opportunities and the challenges that the digital age poses for sociological research, and Bit by Bit is a compilation of his reflections.
Although the title of the Chinese edition translates as “Computational Sociology”, the book does not delve into specific computational methods. Instead, its core concept is the “digital age”: it explores the current state, problems, and future direction of computational social science. It is therefore less a textbook than an excellent guide to thinking about this new field. That makes it timely and instructive at the current nascent stage of computational social science, especially for social scientists who wish to harness the potential of big data research.
Opportunities and challenges facing sociological research in the digital age
Salganik summarizes ten characteristics of big data: big, always-on, nonreactive, incomplete, inaccessible, nonrepresentative, drifting, algorithmically confounded, dirty, and sensitive. He divides them into two categories according to their bearing on social research: the first three are generally beneficial, while the remaining seven are generally problematic. These characteristics have changed how society is understood and organized, transformed ways of thinking, and posed challenges to social science research.
2.1 Opportunities
First, big data means that “we have moved from a world lacking behavioral data to a world rich in behavioral data” (Salganik, 2019: 5-8). With the popularization of the Internet and the large-scale digitization of administrative records and historical archives, an unprecedented amount of digital data has emerged in recent years. Unlike the data traditionally collected by social scientists, these new data can often record in detail how social relations among large numbers of people develop and change. Such abundant resources allow researchers to observe social phenomena from multiple perspectives and to address many theoretical problems. For example, researchers used the Google Ngram corpus to trace subtle shifts in the cultural associations of gender, class, and race over a century (Kozlowski et al., 2019). The Google Ngram corpus, the product of a massive text digitization project across thousands of the world’s libraries, distills text from 6 percent of all books ever published. Similarly, Henry S. Farber used big data from taxi meters to test theories in labor economics. The scale of the data allowed Farber to better understand heterogeneity and dynamics: he found that, over time, new drivers gradually learned to work longer hours on high-wage days. Earlier studies, which relied on paper trip sheets from a small number of drivers over a short period, could not have detected such subtle patterns.
Second, many new methods have been introduced for analyzing large and complex datasets. These include automated text analysis, online experiments, mass collaboration, and other methods inspired by machine learning. The proliferation of digital data and the emergence of new methods for analyzing it have given birth to a new interdisciplinary field: computational social science. Machine learning algorithms can automatically identify patterns and trends, helping researchers discover regularities hidden in the data. By analyzing large-scale text data, researchers can gain a deeper understanding of the evolution of public opinion and of the characteristics and behavior patterns of social groups. In addition, network analysis techniques can reveal the structure and dynamics of social networks, enabling researchers to better understand the mechanisms of information dissemination, social influence, and group behavior. For example, Di Zhou (2022) developed a diachronic word embedding method, calculated a diachronic novelty index for each text in a set of discussions about American politics on the Zhihu platform, and thereby tested the influence of novelty, emotion, status, and cultural capital on cultural power. With the new method, a given text can be compared against the corpus of earlier texts, which brings the calculation of novelty closer to its original theoretical definition.
Third, the “nonreactive” character of big data lets researchers conceal their presence, reducing the interference that arises in traditional social surveys. In traditional research, subjects may change their behavior because they know they are being observed or studied, and this reactivity can compromise the accuracy and authenticity of the results. In nonreactive big data research, by contrast, participants are usually unaware that data are being collected or have become accustomed to it, so behavior changes less in response to being studied. As a result, many big data sources can be used to study behaviors that were previously impossible to measure accurately. In traditional social movement research, for example, researchers often have to enter an ongoing movement to observe and interview, and once participants learn they are being watched, they tend to behave more cautiously or even withdraw from the movement. In participant observation of online social movements, however, other participants are typically unaware of the researcher’s digital presence, so the researcher can observe and record the movement in detail without disturbing it. Similarly, in social media research, researchers can collect large amounts of user data, such as posts, likes, and comments. Because users usually do not realize that their data are being used for research, these data are more likely to reflect their actual behavior and attitudes. For instance, analyzing users’ discussions of political events on Twitter can reveal the public’s political views and emotional shifts, whereas users who know a researcher is present may be reluctant to express their opinions.
2.2 Challenges
First, data problems. Although big data in the digital era has features that favor social research, such as sheer scale, it also has many that hinder it. Because of strict legal, commercial, and ethical restrictions, researchers find it difficult to access the data held by companies and governments. Enterprises in particular regard data as an important asset and want to avoid the risk of information leaks, so they are usually reluctant to cooperate with researchers. Even when data can be obtained, they often cannot be published or shared, so other researchers are “unable to verify and expand your research results” (Salganik, 2019:36). As a result, social scientists no longer hold the initiative in social research as they did in the analog era (邱泽奇, 2018), and the scientific norm of “communalism”, under which evidence and findings are shared openly, is largely undermined. In addition, many big data sources are not representative samples drawn from a well-defined population, which is a serious problem for research that needs to generalize from sample to population. When researchers try to combine traditional data with big data to improve representativeness, they run into problems such as data sparsity and the difficulty of assessing big data quality. A final common problem is that behavior in big data systems, especially social media systems, does not occur naturally but is shaped by the design goals of the system. This algorithmic confounding can make it difficult for researchers to draw correct conclusions.
Second, theory problems. Computational social science is an interdisciplinary field that advances theories of human behavior by applying computational techniques to large datasets from social media and other digital archives. Sociological theory matters here because the future of the field within sociology depends not only on new data sources and analytical methods, but also on whether it can generate new theories of human behavior or offer better explanations of existing social phenomena. In the digital age, however, some researchers may rely too heavily on data and neglect the guiding role of theory. Purely data-driven research of this kind may produce results that lack depth and theoretical significance.
Third, ethics problems. Researchers’ ability to observe and experiment on participants without their consent, or even without their awareness, has grown rapidly, yet there are no clear definitions or consensus on how this ability should be used. For example, in the emotional contagion project, researchers experimented on 700,000 Facebook users without obtaining informed consent and without the participants’ knowledge (Salganik, 2019:284). In a traditional social experiment this would clearly violate the principle of informed consent, but a unified standard for judging the acceptability of such practices in the digital age has yet to form. In addition, because digital-age data contain more information, the potential for informational harm has increased dramatically, and informational risks are harder to understand and manage. Data “anonymization” cannot reliably protect individuals, so researchers should assume that all data are potentially identifiable and sensitive in order to prevent the exposure of individuals in big data.
A new research paradigm: responses to opportunities and challenges
In light of these new characteristics of social science in the digital age, Salganik comprehensively updates four approaches of traditional social science research, namely observing behavior, asking questions, running experiments, and scientific collaboration, and analyzes how each can be used to address the opportunities and challenges of the digital age.
Observing Behavior through Big Data
The abundance of big data, replete with traces of social activity, has spurred social researchers to concentrate on harnessing it to observe human behavior. Salganik begins by highlighting the shift from a scarcity of behavioral data in the analog era to an abundance in the digital age, with digital traces being generated through everyday activities. He emphasizes that big data, often created by companies and governments for purposes other than research, presents both opportunities and challenges. For instance, the sheer size of data sources such as the Google Books corpus makes it possible to study rare events and small differences, yet it may also tempt researchers to overlook how the data were generated and to make conceptual errors (Salganik, 2019:18-21). The always-on nature of data collection systems such as Twitter allows events to be studied over time and monitored in real time, but long-term tracking may be affected by changes in the system (Salganik, 2019:21). Salganik also discusses reactivity: the unobtrusive nature of some big data can reduce the effect of participants’ awareness on their behavior, although it does not guarantee an accurate reflection of attitudes and may raise ethical concerns (Salganik, 2019:24).
To extract useful information from big data, Salganik proposes three main strategies. Counting, as demonstrated by Farber’s study of New York taxi drivers, can provide valuable insights when combined with an interesting question (Salganik, 2019:42). Forecasting, and especially “nowcasting”, which aims to estimate present conditions, is the second strategy: “Predicting the future is hard, but predicting the present is easier” (Salganik, 2019:46). It faces challenges such as drift and algorithmic confounding. Approximating experiments, such as the natural experiments studied by Joshua Angrist (Salganik, 2019:51) and the matching methods used by Liran Einav (Salganik, 2019:55), offer ways to make causal inferences from non-experimental data. Overall, Salganik provides a comprehensive account of the nature and implications of using big data to observe behavior in social research.
Questioning in the Digital Age
Salganik then turns to asking questions in the context of digital transformation. This part begins by emphasizing the continued importance of surveys despite the prevalence of big data. The author traces the evolution of survey research from its early days of face-to-face interviews in specific geographic areas, through the use of telephones, to today’s digital methods. He introduces the total survey error framework, which highlights two main sources of error: representation and measurement. The Literary Digest’s failed prediction of the 1936 presidential election serves as a cautionary tale, illustrating the pitfalls of improper sampling and the importance of considering all types of error, not just sampling error (Salganik, 2019:91-93).
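The lesson of the Literary Digest case, that a very large sample drawn from a biased frame can be worse than a small random one, can be illustrated with a minimal simulation. This is only a sketch under invented assumptions: the population size, the 30 percent telephone-ownership rate, and the support rates are hypothetical and are not the 1936 data, and `simulate_poll` is a name introduced here for illustration.

```python
import random


def simulate_poll(seed=0, population_size=200_000):
    """Compare a large poll from a biased frame with a small random poll.

    All numbers are hypothetical and chosen only to make the bias visible.
    """
    rng = random.Random(seed)

    # Build a synthetic electorate: phone owners (30%) support the
    # candidate far less often than everyone else.
    population = []
    for _ in range(population_size):
        owns_phone = rng.random() < 0.30
        p_support = 0.30 if owns_phone else 0.75
        supports = rng.random() < p_support
        population.append((owns_phone, supports))

    truth = sum(s for _, s in population) / population_size

    # Biased frame: only phone owners can be reached, but the poll is huge.
    frame = [s for owns, s in population if owns]
    big_biased = rng.sample(frame, 50_000)

    # Small simple random sample from the whole population.
    small_random = [s for _, s in rng.sample(population, 1_000)]

    return {
        "truth": truth,
        "biased_estimate": sum(big_biased) / len(big_biased),
        "random_estimate": sum(small_random) / len(small_random),
    }


if __name__ == "__main__":
    for name, value in simulate_poll().items():
        print(f"{name}: {value:.3f}")
```

In this sketch the 50,000-person poll has almost no sampling error but a large coverage error, while the 1,000-person random sample has more sampling error and no systematic bias. This is why the total survey error framework asks researchers to look beyond sample size.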
Salganik then explores new ways of asking questions that digital technologies make possible. Ecological momentary assessment, as exemplified by Naomi Sugie’s study of ex-offenders, collects data in real time within participants’ natural environments, affording a more precise and detailed understanding of their experiences (Salganik, 2019:109-111). Wiki surveys, on the other hand, combine the advantages of open-ended and closed-ended questions, as in the project with the New York City Mayor’s Office, where residents could contribute their own ideas and take part in a more engaging survey process (Salganik, 2019:111-114).
Overall, Salganik offers valuable insights into how digital technologies are revolutionizing the way we ask questions in social research, enabling more accurate, timely, and engaging data collection methods.
Running Social Experiments in the Digital Age
Salganik presents a comprehensive view of experiments in the digital age, beginning by highlighting the fundamental role of experiments in answering causal questions, which is crucial in social research. In the digital age, the landscape of experimentation has significantly transformed. Salganik emphasizes that digital systems have revolutionized the way experiments are conducted. They have made it easier to recruit participants on a large scale, implement treatments, and measure outcomes. For example, digital platforms like Wikipedia and Facebook have provided new avenues for conducting experiments. The ability to reach a vast number of users online has increased the potential sample size, allowing for more powerful statistical analyses and a better understanding of treatment effects.
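The link between sample size and statistical power can be made concrete with a small sketch. It assumes a simple two-arm randomized experiment with normally distributed outcomes and a hypothetical true effect of 0.2. The function name `run_experiment` and all parameter values are illustrative and do not come from the book.

```python
import random
import statistics


def run_experiment(n_per_arm, true_effect=0.2, seed=0):
    """Simulate a two-arm randomized experiment and estimate the effect.

    Returns the difference in means and its estimated standard error.
    Outcomes are drawn from normal distributions (an assumption made
    here purely for illustration).
    """
    rng = random.Random(seed)
    control = [rng.gauss(0.0, 1.0) for _ in range(n_per_arm)]
    treated = [rng.gauss(true_effect, 1.0) for _ in range(n_per_arm)]

    # Difference-in-means estimator of the average treatment effect.
    estimate = statistics.fmean(treated) - statistics.fmean(control)

    # Standard error of a difference between two independent means.
    se = (
        statistics.variance(treated) / n_per_arm
        + statistics.variance(control) / n_per_arm
    ) ** 0.5
    return estimate, se


if __name__ == "__main__":
    for n in (100, 10_000):
        est, se = run_experiment(n)
        print(f"n per arm = {n}: estimate = {est:.3f}, standard error = {se:.3f}")
```

Because the standard error shrinks roughly with the square root of the sample size, moving from a lab-scale experiment of 100 participants per arm to an online experiment of 10,000 per arm narrows the uncertainty by about a factor of ten, which is what makes small treatment effects detectable.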
Salganik states that “Lab experiments offer control, field experiments offer realism, and digital field experiments combine control and realism at scale.” (Salganik, 2019:151) Social research traditionally distinguishes between laboratory experiments and field experiments; digital field experiments can combine the rigor of the laboratory with the authenticity of natural settings, and thus yield better results. At the same time, while social scientists enjoy the convenience of digital experiments, they must pay closer attention to ethical issues. With greater power to manipulate and observe participants, researchers must be more careful to protect participants’ rights and well-being. The advisable approach is to build ethical considerations into the experimental design from the outset, following the principles of replace, refine, and reduce.
Creating Mass Collaboration
In scientific collaboration, digital technology has ushered in a new era of cooperation, enabling researchers to engage with a broader and more diverse range of participants than ever before. The new mass collaboration means not only more participants but also an unprecedented breadth and diversity of skills and perspectives, and this diversity is conducive to more comprehensive and innovative research findings.
The author classifies mass collaboration projects into three types: human computation, open calls, and distributed data collection. Social science researchers can use these complementary approaches and adapt them to different research questions. For example, human computation projects handle tasks that are simple for humans but difficult for computers, while open calls solicit novel solutions to complex problems. Distributed data collection projects, represented by eBird, gather data from a large number of contributors (Salganik, 2019:257). eBird lets birdwatchers contribute their observations, providing valuable data for ornithological research, although it faces challenges related to sampling and data quality. As with traditional forms of cooperation, these new forms still require appropriate research design. Salganik suggests that researchers consider factors such as motivating participants, leveraging the heterogeneity of participants’ skills and effort, and keeping attention focused on the research goals.
Overall, Salganik views digital-age social research collaboration as a powerful tool that can expand the boundaries of what is possible in social research. It not only enables the tackling of previously intractable problems but also has the potential to democratize the research process by involving a wider audience. However, it also requires careful planning and consideration to ensure its success and to address the various challenges that may arise.
In conclusion, new technologies can assist researchers in establishing new connections with research objects, providing new research perspectives and paradigms, thereby solving some problems that have been difficult for social sciences to answer in the past. Social science research should combine classic research methods (observing behavior, asking questions, running experiments, and mass collaboration) with digital age technologies to achieve innovation and optimization of traditional methods rather than simple replacement.
The position of sociologists in computational sociology
Throughout the entire book, Salganik delved deeply into the positioning of sociologists in the field of computational sociology. This positioning encompasses several important aspects, including the mode of collaboration with other disciplines, the required skills and capabilities, as well as the outlook for future development.
In terms of the mode of collaboration with other disciplines, as an interdisciplinary field of social science and data science, computational sociology requires sociologists to actively engage in close cooperation with data scientists, computer scientists, and others. As stated in the book, in many research projects, sociologists are responsible for providing a solid theoretical framework and in-depth research questions, which are the cornerstone of social research. On the other hand, data scientists and computer scientists play a crucial role in data processing and analysis techniques with their professional skills. This interdisciplinary cooperation model can fully leverage the advantages of each discipline and achieve effective integration of resources, thereby promoting the in-depth development of computational sociology research.
In terms of required skills and capabilities, sociologists in the digital age must, on the one hand, be proficient in the core skills of traditional social science research, including rigorous research design, in-depth theory construction, and accurate data analysis. These skills are the foundation of scientific research and ensure its rigor and reliability. On the other hand, sociologists also need to acquire a working knowledge of digital technology and data science, such as programming, big data analysis methods, and machine learning algorithms. Only with these interdisciplinary skills can sociologists adapt to the needs of computational sociology and flexibly apply its tools and methods to produce more valuable research.
It is foreseeable that sociologists will not be replaced by data scientists and computer scientists, and sociologists will still occupy an indispensable position in computational sociology. The author is confident that sociologists will play a crucial role in the development process of computational sociology. By means of organically combining social science theory with advanced digital technology, sociologists are capable of effectively resolving various complex social problems. In the future, sociologists need to maintain a keen insight, continuously learn and adapt to new technologies and methods, and actively promote the sustainable development of computational sociology. In this era of rapid digital development, the role of sociologists will become more important, and their research results will have a profound impact on the progress and development of society.