How Not to Know Ourselves
By Angela Xiao Wu, assistant professor at New York University
This blog post comes out of a paper by Angela Xiao Wu and Harsh Taneja that offers a new take on social sciences’ ongoing embrace of platform log data by questioning their measurement conditions. The distinct nature of platform datafication is foregrounded in comparison with the longer tradition of third-party audience measurement.
Surfing a wave of societal awe and excitement about “Big Data,” platforms formed a habit of releasing “data science” insights on what we search, like, express, purchase, obsess over, attempt to hide, and prefer to forget. These colorful graphics and juicy taglines (most notably from OKCupid and PornHub, whose data lay claim to the quirks and desires of our intimate lives) are always popular novelties to behold, ponder, and reference. If knowing ourselves through platform data is a practice of our age, it is certainly not confined to platforms themselves. Aspiring data scientists, curious programmers, vigilant data journalists, analysts of civic organizations and political campaigns, and (last but not least) academic social scientists such as myself make up the growing field that is figuring out who we are, what we do, and how we sway in the swathes of platform data.
Such data can be impressive due to their unprecedented granularity and volume, as well as the fact that they are seemingly “unobtrusive” recordings of our activities when no one is watching. These apparent strengths of data for social research are outweighed by a problem in what we call the “measurement conditions”: platform data are platforms’ records of their own behavioral experimentation. Trying to know ourselves through platform data tends to yield partial and contorted accounts of human behavior that conceal platform interventions. Moreover, though increasingly produced by non-corporate actors, such knowledge accounts and narratives tend to be amenable to platform money-making and image-building.
To be clear, for years many have contested the ascendance of platform data as a staple in quantitative social sciences alongside conventional data collection methods, such as surveys and experiments. These contestations focus on issues about the data’s representativeness, privacy concerns, and precarious access at the mercy of platform companies. The “measurement conditions” problem, however, is entirely different. In our newly published paper, Harsh Taneja and I call for attention to the circumstances under which these data come about: what purpose does the measurement initially serve? As historians have told us, measurement — or converting parts of the social world into quantities according to some enduring instrument — is not an end in itself, but a means for managing events and coordinating actions. Measurement is thus a product of the social and institutional context (i.e., “measurement conditions”) in which it is called upon and carried out.
A closer look at the measurement conditions of platforms allows us to rethink the nature of platform log data: they are essentially “administrative data” that platforms generate to realize their own organizational goals, which go little beyond enlarging advertising income, harvesting intermediary fees, and attracting venture capital. These companies track user engagement with their platforms to evaluate and showcase “product performance.” Such data analytics are integral to the iterative process whereby platforms tinker with their digital architectures in attempts to shape usage in ways that maximize profits.
In other words, platform log data are not “unobtrusive” recordings of human behavior out in the wild. Rather, their measurement conditions determine that they are accounts of putative user activity. The activity is “putative” in the sense that platforms are often incentivized to keep bots and other fake accounts around because, from their standpoint, it’s always a numbers game with investors, marketers, and the actual, oft-insecure users. With calculated neglect come calibrated nudges: platform user activity, in the first place, is induced, coaxed, and experimented on by the platform environment. From multilayered graphical organization to complex algorithmic recommendation, it is from all these platform arrangements that user activity arises. Conversely, it is to make decisions about these arrangements that platform companies measure usage.
Of course, when large volumes of platform log data become available for inquisitive parties to crunch, platforms keep the other part of the iterative process, the shifting platform arrangements aimed at nudging usage, in the dark. Thus, it is difficult to tell to what extent the patterns emerging from platform data are about “us,” rather than testimonies to the effects of platform nudges. When we are experimental subjects oblivious to platforms’ treatments of us, taking our induced behaviors as “natural” means regarding these platforms as benign, transparent vehicles for our inherent intentions, and thus obscuring their prevailing power.
Consider peeking into our innate preferences (by race, geography, and daily rhythms!) based on “patterns” that emerge from PornHub’s log data, when the site’s visual design, temporal pacing, and content curation are all about eliciting and extending the user’s state of pleasure and pleasure-seeking; or using Twitter data to study the insurgent online protests during Occupy Wall Street when, due to unknown algorithmic workings, the very term failed to trend; or using Uber’s rides data to study commuting habits when Uber wields its driving force with strategies such as price surging in the name of (predicted but unverifiable) high demand; or using YouTube, or more fantastically Netflix data, to discern media preferences when these platforms’ entire business rests on herding sequences of viewing. (Each of these platform strategies has been creatively uncovered by critical scholars.)
When we wind up finding human nature in platform data, we take administrative records from insulated digital experiments as expressions of humanity in our society. The data envelop a platform-shaped hole that may escape the scrutiny of even the most sophisticated computational techniques. Such a data-analytic pitfall, increasingly common in data science showcases, journalistic reporting, and academic research, effectively obscures platforms’ intervention in human behavior. And platforms’ intervention in human behavior is at once the center of platform business models and the secret that platforms strive to hide.
What are the human actions and predispositions that initially spark our curiosity? What is the kind of self-knowledge that we would cherish as a foundation for enriching our sociality, our civil and public institutions, and our democratic process? Readily resorting to platform data analytics for such knowledge risks taking platform environments as our entire world. Instead, when dealing with platform data we should aspire to “put the platforms in perspective,” foregrounding rather than obscuring their interventions in how we behave.
In this collective effort, non-corporate critical actors may find useful some of the strategies discussed in our paper.
Angela Xiao Wu is an assistant professor in Media, Culture and Communication at New York University researching information technology, knowledge production, and political cultures.
Source: https://points.datasociety.net/how-not-to-know-ourselves-5227c185569