
TriciaWang_2016X-_因大数据而失准的视野_

In ancient Greece, when anyone from slaves to soldiers, poets and politicians, needed to make a big decision on life's most important questions, like, "Should I get married?" 古希腊时期, 不论是奴隶或士兵,诗人或政治家, 当他们人生遇到重大问题时, 需要做出重要的决定, 像是「我该结婚吗?」
politicians:n.政治家;(蔑)政客;(美)政治贩;(politician的复数)
or "Should we embark on this voyage?" 或是「我该开始这次的航行吗?」
embark:vi.从事,着手;上船或飞机;vt.使从事;使上船; voyage:v.航行;远行;(尤指)远航;n.航行;(尤指)航海;
or "Should our army advance into this territory?" 或是「我的士兵该进攻这个领地吗?」
territory:n.领土,领域;范围;地域;版图;
they all consulted the oracle. 他们都会请示先知。
consulted:v.咨询;请教;商议;查阅;查询;参看(consult的过去分词和过去式) oracle:n.神谕;预言;神谕处;圣人;
So this is how it worked: you would bring her a question and you would get on your knees, and then she would go into this trance. 运行模式是这样的: 你把问题告诉她,接着屈膝跪下, 然后她就会进入出神状态。
trance:n.恍惚;出神;着迷,入迷;v.使恍惚;使发呆;
It would take a couple of days, and then eventually she would come out of it, giving you her predictions as your answer. 这会花上几天的时间, 最终她会回神, 答复你她的预知。
eventually:adv.最后,终于; predictions:n.预测,预言(prediction复数形式);
From the oracle bones of ancient China to ancient Greece to Mayan calendars, people have craved for prophecy in order to find out what's going to happen next. 从古中国的甲骨文, 到古希腊,再到马雅历, 人们都渴求着预言, 为了知道接下来会发生什么事。 而这是因为我们都想做正确的决定,
calendars:n.日历;挂历;日程表;记事本;(calendar的复数) craved:vt.渴望;恳求;vi.渴望;恳求; prophecy:n.预言;预言书;预言能力;
And that's because we all want to make the right decision. 我们不希望漏掉了什么。
We don't want to miss something. 未来令人害怕。
The future is scary, so it's much nicer knowing that we can make a decision with some assurance of the outcome. 所以能在某种程度上 保障决定的结果,是很棒的事。 我们有了新的先知, 名字叫大数据。
assurance:n.保证,担保;(人寿)保险;确信;断言;厚脸皮,无耻; outcome:n.结果,结局;成果;
Well, we have a new oracle, and its name is big data, or we call it "Watson" or "deep learning" or "neural net." 也可以称它为「华生」、 「深度学习」或「人工神经网路」。 如今我们会问先知这样的问题: 「要将这批手机从中国 运到瑞典,怎样最有效率?」
neural:adj.神经的;神经系统的;背的;神经中枢的;
And these are the kinds of questions we ask of our oracle now, like, "What's the most efficient way to ship these phones from China to Sweden?" 或是「我的小孩出生就有 遗传疾病的机率是多少?」 或是「预期这产品的销售量多少?」 我养了只狗,名叫埃莱,最讨厌下雨。
efficient:adj.有效率的;有能力的;生效的;
Or, "What are the odds of my child being born with a genetic disorder?" 我用尽方法来训练她, 让她适应下雨。 但因为我失败了,
odds:n.几率;胜算;不平等;差别; genetic:adj.基因的;遗传学的; disorder:n.混乱;骚乱;vt.使失调;扰乱;
Or, "What is the sales volume we can predict for this product?" 我还是得谘询一位叫 Dark Sky(天气预报公司)的先知,
volume:n.体积;容积;音量;响度;一册;合订本
I have a dog. Her name is Elle, and she hates the rain. 每次散步之前都会谘询,
Elle:n.世界时装之苑(时尚类期刊);依尼(服装品牌名);
And I have tried everything to untrain her. 以获得接下来十分钟的准确天气预报。
But because I have failed at this, 她真的很贴心。
I also have to consult an oracle, called Dark Sky, every time before we go on a walk, for very accurate weather predictions in the next 10 minutes. 基于这些理由,我们的「先知」 是个 1220 亿美元的产業。 先不论这个产業的规模, 令人惊讶的是它极低的报酬率。
accurate:adj.精确的;
She's so sweet. 投资大数据很简单,
So because of all of this, our oracle is a $122 billion industry. 运用大数据却很难。
Now, despite the size of this industry, the returns are surprisingly low. 73% 以上的大数据计画根本不赚钱, 有些業务主管跑来跟我说,
despite:prep.尽管,不管;n.轻视;憎恨;侮辱; surprisingly:adv.令人惊讶地;出乎意料地
Investing in big data is easy, but using it is hard. 「我们都面临了同样的问题。 我们投资了几个大数据系统,
Investing:v.投资;投入(时间、精力等);(invest的现在分词)
Over 73 percent of big data projects aren't even profitable, and I have executives coming up to me saying, "We're experiencing the same thing. 但我们的员工却还是不能 做出更优的决定。 他们当然也没有想出 更多突破性的点子。」 这些对我来说都很有趣,
profitable:adj.有利可图的;赚钱的;有益的; executives:n.经理,主管领导,管理人员;领导层;行政部门(executive的复数)
We invested in some big data system, and our employees aren't making better decisions. 因为我是个科技人类学家。 我研究并给予公司建议,
invested:v.投资;投入;(invest的过去分词和过去式)
And they're certainly not coming up with more breakthrough ideas." 告诉他们人们使用科技的形态,
breakthrough:n.突破;开始取得成功之时;adj.突破性的;
So this is all really interesting to me, because I'm a technology ethnographer. 我有兴趣的领域之一就是数据。 为什么获得更多数据 却没有幫我们做更好的决定,
technology:n.技术;工艺;术语; ethnographer:n.民族志(或民族学)研究者;
I study and I advise companies on the patterns of how people use technology, and one of my interest areas is data. 特别是那些有资源, 可以投资大数据系统的公司? 为什么他们没有更好地做决定? 我第一时间就目睹了这项困境。
advise:v.建议;通知;劝告;忠告;
So why is having more data not helping us make better decisions, especially for companies who have all these resources to invest in these big data systems? 2009 年,我开始了 在诺基亚的研究工作。 当时,诺基亚是世界上 最大的手机公司之一, 在中国、墨西哥、印度等 新兴市场中占有主要地位──
especially:adv.尤其;特别;格外;十分; resources:n.[计][环境]资源; v.向…提供资金(resource的第三人称单数);
Why isn't it getting any easier for them? 我在这些地方都做了很多研究,
So, I've witnessed the struggle firsthand. 研究低收入的人怎么使用科技产品。
witnessed:v.当场看到,目击;见证;作证;(witness的过去式和过去分词) firsthand:adj.直接的;直接采购的;直接得来的;adv.直接地;
In 2009, I started a research position with Nokia. 我在中国花了特别多时间
And at the time, 来了解地下经济。
Nokia was one of the largest cell phone companies in the world, dominating emerging markets like China, Mexico and India -- all places where I had done a lot of research on how low-income people use technology. 所以我当过街头摊贩, 卖水饺给建筑工人。 我也做过实地调查, 在网咖中日日夜夜地待着,
dominating:adj.个性强势的; v.支配; (dominate的现在分词) emerging:adj.新兴的;v.出现,浮现,露出;暴露;(emerge的现在分词)
And I spent a lot of extra time in China getting to know the informal economy. 和中国年轻人来往,这样我才知道 他们怎么玩游戏、使用手机,
extra time:n.[体]加时(赛);延长(赛); informal:adj.非正式的;不拘礼节的;通俗的;日常使用的; economy:n.经济;节约;理财;
So I did things like working as a street vendor selling dumplings to construction workers. 以及他们从农村地区 移居到城市时的使用情形。 透过我收集的定性资料,
vendor:n.卖主;小贩;[贸易]自动售货机; dumplings:n.小面团;汤团;饺子;水果布丁;(dumpling的复数) construction:n.建设;建筑物;解释;造句;
Or I did fieldwork, spending nights and days in internet cafés, hanging out with Chinese youth, so I could understand how they were using games and mobile phones and using it between moving from the rural areas to the cities. 我开始清楚看见 即将发生在低收入中国人身上的巨变。 虽然他们身边围绕着奢侈品的广告, 像是花俏的马桶──谁不想要呢── 还有公寓和车,
fieldwork:n.野外工作;现场工作;野战工事; mobile:n.手机;汽车;移动电话;adj.活跃的;可动的; rural:adj.农村的,乡下的;田园的,有乡村风味的;
Through all of this qualitative evidence that I was gathering, 从和他们的对话中,
qualitative:adj.定性的;质的,性质上的; evidence:n.证据,证明;迹象;明显;v.证明;
I was starting to see so clearly that a big change was about to happen among low-income Chinese people. 我发现最吸引他们的广告, 是 iPhone 的广告,
was about to:眼看就要;即将;正要;行将;
Even though they were surrounded by advertisements for luxury products like fancy toilets -- who wouldn't want one? -- and apartments and cars, through my conversations with them, 那些广告向他们保证了 进入高科技生活的途径。 即使我和他们一起 住在这样的城市贫民窟, 我也看到人们将半个月以上的收入 拿去买手机,
luxury:n.奢侈,奢华;奢侈品;享受;adj.奢侈的; fancy:n.幻想; adj.想象的; v.想象;
I found out that the ads that actually enticed them the most were the ones for iPhones, promising them this entry into this high-tech life. 而且越来越多都是「山寨品」, 也就是他们买得起的 iPhone 或其他品牌的仿冒品。 这些仿冒品很堪使用。
enticed:v.诱使;引诱;(entice的过去分词和过去式) high-tech:adj.高科技的,高技术的;仿真技术的;n.高科技;
And even when I was living with them in urban slums like this one, 原厂有的功能都能用。
urban:adj.城市的;都市的;城镇的;都市音乐的; slums:n.贫民,[经]贫民区(slum的复数); v.到贫民窟去;
I saw people investing over half of their monthly income into buying a phone, and increasingly, they were "shanzhai," 我和移民一起住、一起工作了数年, 真的是他们做什么,我就做什么, 我开始将所有数据拼凑在一起──
monthly:n.月刊;adv.每个月;每月一次;adj.每月的; increasingly:adv.越来越多地;渐增地;
which are affordable knock-offs of iPhones and other brands. 不论是看似不相关的事, 像是我卖水饺的事,
affordable:adj.负担得起的; brands:n.品牌;烙印(brand的复数);v.加商标于;铭刻于(brand的第三人称单数);
They're very usable. 或是较明显相关的事,
Does the job. 像是追踪他们花多少钱付手机费。
And after years of living with migrants and working with them and just really doing everything that they were doing, 所以我才有办法描绘出 这么多整体画面 来说明当时正发生什么事。
migrants:n.移民;移居者;候鸟(migrant的复数形式);
I started piecing all these data points together -- from the things that seem random, like me selling dumplings, to the things that were more obvious, like tracking how much they were spending on their cell phone bills. 这时我才开始理解到 连中国最穷的人也想要智慧型手机, 且他们几乎会不择手段拿到手。 你们要记得,
random:adj.[数]随机的;任意的;胡乱的;n.随意;adv.胡乱地; obvious:adj.明显的;显著的;平淡无奇的; tracking:n.追踪,跟踪;v.跟踪;(track的现在分词)
And I was able to create this much more holistic picture of what was happening. 当时是 2009 年,iPhone 才刚出现, 这是八年前的事,
holistic:adj.整体的;全盘的;
And that's when I started to realize that even the poorest in China would want a smartphone, and that they would do almost anything to get their hands on one. 安卓手机才刚开始像 iPhone。 很多聪明又现实的人说, 「智慧型手机只是一时的流行。
smartphone:n.智能手机;
You have to keep in mind, iPhones had just come out, it was 2009, so this was, like, eight years ago, and Androids had just started looking like iPhones. 谁会想带着这么重的东西到处走, 又很快就没电, 还会一掉地就坏?」 但我有很多数据,
keep in mind:记住; Androids:n.机器人;
And a lot of very smart and realistic people said, "Those smartphones -- that's just a fad. 我对自己的洞察观点非常有自信, 我兴奋地把数据告诉诺基亚。
realistic:adj.现实的;现实主义的;逼真的;实在论的; smartphones:智能手机(smartphone的复数); fad:n.时尚;一时的爱好;一时流行的狂热;
Who wants to carry around these heavy things where batteries drain quickly and they break every time you drop them?" 但我没能说服诺基亚, 因为那不是大数据。
batteries:n.电池;炮组;炮列;[法]殴打;(batteries是battery的复数) drain:v.排水;流干;喝光,耗尽;n.排水;下水道,排水管;消耗;
But I had a lot of data, and I was very confident about my insights, so I was very excited to share them with Nokia. 他们说:「我们有几百万则数据, 而我们没见到任何数据 指出有人想买智慧型手机, 你的 100 组数据太缺乏多样性,
confident:adj.自信的;确信的; insights:n.洞察力;眼力;深刻见解(insight的复数);
But Nokia was not convinced, because it wasn't big data. 我们完全无法重视这项数据。」 我说:「诺基亚,你说的没错。
convinced:adj.坚信; v.使确信; (convince的过去分词和过去式)
They said, "We have millions of data points, and we don't see any indicators of anyone wanting to buy a smartphone, and your data set of 100, as diverse as it is, is too weak for us to even take seriously." 你当然不会看到有人要买, 因为你所发送问卷的假设前提 是人们不知道智慧型手机是什么, 所以你的数据当然不会反映
indicators:n.指示信号;标志;指针;方向灯;(indicator的复数) diverse:adj.不同的;多种多样的;变化多的;
And I said, "Nokia, you're right. 两年内想买智慧型手机的人的想法。
Of course you wouldn't see this, because you're sending out surveys assuming that people don't know what a smartphone is, so of course you're not going to get any data back about people wanting to buy a smartphone in two years. 你问卷、研究方法的设计理念 都是想让现有的業务型态更好, 而我关注的是这些正浮现的人类动态, 那些是过去没有发生的, 我们看的是市场动态之外,
surveys:n.调查(survey的复数); assuming:conj.假设…为真; adj.傲慢的; v.假定; (assume的现在分词)
Your surveys, your methods have been designed to optimize an existing business model, and I'm looking at these emergent human dynamics that haven't happened yet. 这样我们才能先走一步。」 你们知道诺基亚怎么样了吗? 他们的产業跌落谷底。 这就是错失的代价。
optimize:vt.使最优化,使完善;vi.优化;持乐观态度; emergent:adj.紧急的;浮现的;意外的;自然发生的; dynamics:n.动力学,力学;
We're looking outside of market dynamics so that we can get ahead of it." 那代价是深不可测的。 但不是只有诺基亚这样。
get ahead of:v.胜过;
Well, you know what happened to Nokia? 我看到各机构一天到晚丢弃数据,
Their business fell off a cliff. 因为数据并非来自数量大的模型,
cliff:n.悬崖;绝壁;
This -- this is the cost of missing something. 或对不上数量大的模型数据。
It was unfathomable. 但这不是大数据的错。
unfathomable:adj.深不可测的;无底的;莫测高深的;
But Nokia's not alone. 是我们用错方法,
I see organizations throwing out data all the time because it didn't come from a quant model or it doesn't fit in one. 是我们的责任。 但一般认为大数据的成功之处 在于量化的对象非常的特定,
organizations:n.组织,构造,有机体(organization的复数);组织机构; quant:n.船桨;数量分析专家;vt.用篙撑;vi.用篙撑船;
But it's not big data's fault. 像是电网、物流运送或遗传密码,
It's the way we use big data; it's our responsibility. 也就是些基本上可操纵的系统。
Big data's reputation for success comes from quantifying very specific environments, like electricity power grids or delivery logistics or genetic code, when we're quantifying in systems that are more or less contained. 但并非所有的系统 都能被操纵得好好的。 若你在量化的系统是动态的, 特别是那些有人参与其中的系统, 会产生影响的事物复杂又难以预测,
quantifying:n.定量法;v.量化(quantify的ing形式);定量; specific:adj.特殊的,特定的;明确的;详细的;[药]具有特效的;n.特性;细节;特效药; electricity:n.电力;电流;强烈的紧张情绪; grids:n.[数]网格(grid的复数形式);栅格; delivery:n.[贸易]交付;分娩;递送; logistics:n.[军]后勤;后勤学;物流; genetic code:n.遗传密码; more or less:或多或少;
But not all systems are as neatly contained. 我们不太知道怎样建立这些模型。
neatly:adv.整洁地;熟练地;灵巧地;
When you're quantifying and systems are more dynamic, especially systems that involve human beings, forces are complex and unpredictable, and these are things that we don't know how to model so well. 即使你一时预测了人的行动, 又会出现新的要素, 因为情况持续在改变。 正因如此,这是个永无止境的回圈。
involve:v.包含;需要;牵涉;牵连;影响;(使)参加; complex:adj.复杂的;合成的;n.复合体;综合设施; unpredictable:adj.不可预知的;不定的;出乎意料的;n.不可预言的事;
Once you predict something about human behavior, new factors emerge, because conditions are constantly changing. 你以为你瞭解了一件事, 另一件未知的事物便进入了你的视野。 所以纯粹依靠大数据
factors:n.因素(factor的复数); v.做代理商; constantly:adv.不断地;时常地;
That's why it's a never-ending cycle. 便增加了我们错失的机率,
never-ending:adj.不停的;无限的;
You think you know something, and then something unknown enters the picture. 但同时让我们以为我们无所不知。 为什么我们很难发现这个矛盾,
And that's why just relying on big data alone increases the chance that we'll miss something, while giving us this illusion that we already know everything. 甚至也很难去理解它, 是因为我们有我所谓的「量化成见」, 也就是无意识地认为可量化的
relying:v.依赖;信任;指望(rely的现在分词); illusion:n.幻觉,错觉;错误的观念或信仰;
And what makes it really hard to see this paradox and even wrap our brains around it is that we have this thing that I call the quantification bias, 比不可量化的更有价值。 我们工作时常有这样的经验。 或许我们和这样想的同事一起工作,
paradox:n.悖论,反论;似非而是的论点;自相矛盾的人或事; wrap:v.缠绕;隐藏;掩护;包起来;缠绕;穿外衣;n.外套;围巾; quantification:n.[统计]定量,量化; bias:adv.使有偏见;n.偏见;偏心;偏爱;v.使有偏见;使偏向;adj.斜的;[电]偏动的;
which is the unconscious belief of valuing the measurable over the immeasurable. 或者整个公司都这样想, 人们过于迷恋数字,
unconscious:adj.无意识的;失去知觉的;未发觉的; immeasurable:adj.无限的;[数]不可计量的;不能测量的;
And we often experience this at our work. 以至于看不见除此之外的任何东西,
Maybe we work alongside colleagues who are like this, or even our whole entire company may be like this, where people become so fixated on that number, that they can't see anything outside of it, even when you present them evidence right in front of their face. 即使你将证据贴到他们脸上,给他们看。 这是个十分吸引人的讯息, 因为量化并没有错; 量化事实上很让人满意。 我看着 Excel 电子表格就觉得安心,
colleagues:n.同事;同行(colleague的复数); fixated:adj.念念不忘的;稳固关系的;v.使固定下来;注视(fixate的过去分词);
And this is a very appealing message, because there's nothing wrong with quantifying; it's actually very satisfying. 即使是很简单的也一样。 (笑声) 那种感觉就是,
appealing:adj.吸引人的; v.呼吁; (appeal的现在分词)
I get a great sense of comfort from looking at an Excel spreadsheet, even very simple ones. 「好的!方程式没问题。 一切都很好。都在掌控之中。」 问题是,
Excel:v.超过;擅长; spreadsheet:n.电子制表软件;电子数据表;试算表;
(Laughter) 量化会使人上瘾。
It's just kind of like, "Yes! The formula worked. It's all OK. Everything is under control." 我们一旦忘记这件事, 若我们没能做到时时确认是否上瘾,
formula:n.公式; adj.(赛车)方程式的(指赛车要符合规定的体积,重量及汽缸容量等);
But the problem is that quantifying is addictive. 我们很容易直接扔掉这样的资料: 仅仅因为它无法用数值量化。
addictive:adj.使人上瘾的;使人入迷的;
And when we forget that and when we don't have something to kind of keep that in check, it's very easy to just throw out data because it can't be expressed as a numerical value. 很容易认为会有完美解决一切的絶招, 就好像有某种简单的解决方法一样。 因为这对任何一间机构来说, 都是危机的重要时刻, 时常,我们要预测的未来,
throw out:v.扔掉;伸出;说出;否决;突出; expressed:v.表示;表达;显而易见;不言自明;(express的过去分词和过去式) numerical:adj.数值的;数字的;用数字表示的(等于numeric);
It's very easy just to slip into silver-bullet thinking, as if some simple solution existed. 并不是在这安稳的草堆里, 而是在它之外, 是即将袭击我们的暴风中心。
slip:v.溜;下降;滑落;n.纸条;衬裙; solution:n.解决方案;溶液;溶解;解答;
Because this is a great moment of danger for any organization, because oftentimes, the future we need to predict -- it isn't in that haystack, but it's that tornado that's bearing down on us outside of the barn. 没有什么比对未知 一无所知来得有风险, 那会使你做出错误的决定。 那可能使你错失重要的事物。 但我们不用这样做。 到头来,是古希腊的先知 握有显示道路的神秘钥匙。
oftentimes:adv.时常地; haystack:n.干草堆;比喻如大海捞针般难找; tornado:n.[气象]龙卷风;旋风;暴风;大雷雨; bearing:n.关系;影响;姿态;举止v.承受;忍受;承担责任;(bear的现在分词) barn:n.谷仓;畜棚;车库;靶(核反应截面单位);v.把…贮存入仓;
There is no greater risk than being blind to the unknown. 近年的地质研究显示, 最有名的先知所在的阿波罗神庙,
It can cause you to make the wrong decisions. 事实上座落在两个地震断层上。
It can cause you to miss something big. 这些断层会从地壳下释出石油烟气,
But we don't have to go down this path. 而那位先知就直接坐在那些断层上方,
It turns out that the oracle of ancient Greece holds the secret key that shows us the path forward. 从缝隙中吸入数不尽的乙烯气体。 (笑声)
Now, recent geological research has shown that the Temple of Apollo, where the most famous oracle sat, was actually built over two earthquake faults. 那是真的。 (笑声) 那都是真的,那就是为什么 她讲话含糊不清还看到幻觉,
geological:adj.地质的,地质学的; Apollo:n.阿波罗(太阳神);美男子;
And these faults would release these petrochemical fumes from underneath the Earth's crust, and the oracle literally sat right above these faults, inhaling enormous amounts of ethylene gas, these fissures. 并进入类似出神的状态。 她感觉自己都飞上天了! (笑声) 所以大家要怎么──
release:v.释放;发射;让与;允许发表;n.释放;发布;让与; petrochemical:adj.石化的;n.石油化学产品; fumes:n.烟气;激动;空想的事物(fume的复数);v.蒸发;冒烟;发怒(fume的三单形式); underneath:prep.在…的下面;在…的支配下;n.下面;底部;adj.下面的;底层的; crust:n.壳;表面;厚颜无耻;[美国]雪壳;v.用外皮覆盖;结成硬皮;生痂儿;形成硬壳; literally:adv.按字面;字面上;确实地; inhaling:n.吸入;吸气;v.吸入;吸气;猛喝(inhale的现在分词); enormous:adj.庞大的,巨大的;凶暴的,极恶的; ethylene:n.乙烯; fissures:n.裂纹(fissure的复数形式); v.(使)分裂(fissure的第三人称单数形式);
(Laughter) 大家要怎么在这个状态下 得到有用的建议?
It's true. 看到那些围绕先知的人们了吗?
(Laughter) 你可以看到那些人支撑着她,
It's all true, and that's what made her babble and hallucinate and go into this trance-like state. 因为她好像有点头昏眼花? 有没有发现她左边的男子
babble:v.喋喋不休;呀呀学语;作潺潺声;泄露;n.含糊不清的话;胡言乱语;潺潺声; hallucinate:v.(由于生病、吸毒)幻听,幻视,产生幻觉;
She was high as a kite! 正拿着橘色小册子?
(Laughter) 那些是神庙的引导人员,
So how did anyone -- 他们与先知密切合作。
How did anyone get any useful advice out of her in this state? 当有人来下跪询问时, 神庙的引导人员就开始工作了,
Well, you see those people surrounding the oracle? 在来者向先知询问一些问题后,
You see those people holding her up, because she's, like, a little woozy? 他们会观察来者的精神状态, 然后他们会问来者一些后续问题,
woozy:adj.虚弱的,微醉的;头昏眼花的;
And you see that guy on your left-hand side holding the orange notebook? 像是:「为什么你想知道 这个预言?你是谁? 你会怎么运用这个资讯?」
left-hand:adj.左手的;左侧的;
Well, those were the temple guides, and they worked hand in hand with the oracle. 接着神庙的引导人员会 用人类学的角度来看, 用质性资讯的角度来看,
hand in hand:adj.并进的;手拉手的;亲密的;
When inquisitors would come and get on their knees, that's when the temple guides would get to work, because after they asked her questions, they would observe their emotional state, and then they would ask them follow-up questions, like, "Why do you want to know this prophecy? Who are you? 然后翻译先知含糊不清的话。 所以先知并非自己承揽一切任务, 我们的大数据系统同样也不该如此。 我要澄清一下, 我并非在说大数据系统 在呼吸着乙烯气体, 甚至给予没用的预测。
inquisitors:n.检察官;询问者;审问者; observe:v.观察;看到;庆祝;监视; emotional:adj.情绪的;易激动的;感动人的; follow-up:adj.后续的;增补的;n.随访;跟进;后续行动;
What are you going to do with this information?" 完全相反。
And then the temple guides would take this more ethnographic, this more qualitative information, and interpret the oracle's babblings. 我想说的是, 就像先知需要神庙的引导人员那样, 大数据系统同样也需要。
ethnographic:adj.人种志的;民族志学的; interpret:v.诠释;说明;口译;把…理解为;
So the oracle didn't stand alone, and neither should our big data systems. 大数据需要人类学家以及用户研究人员 来收集我所谓的「厚数据」──
Now to be clear, 来自于人们的宝贵数据,
I'm not saying that big data systems are huffing ethylene gas, or that they're even giving invalid predictions. 像是故事、情绪和互动, 这些无法计量的事物。 就像我收集给诺基亚的那种数据,
huffing:v.生气地说;怒气冲冲(huff的现在分词) invalid:adj.无效的; n.病人; vt.使伤残; vi.变得病弱;
The total opposite. 数据样本规模非常小,
But what I am saying is that in the same way that the oracle needed her temple guides, our big data systems need them, too. 但传达的涵义却极其的深。 它如此厚重、内容丰富的原因是 那些从人们的话语中 明白更多信息的经验。
They need people like ethnographers and user researchers who can gather what I call thick data. 这才能幫助我们看到 模型里缺少了什么东西。 厚数据以人类问题为根基 来说明经济问题,
This is precious data from humans, like stories, emotions and interactions that cannot be quantified. 这就是为什么结合大数据和厚数据 能让我们得到的讯息更加完整。
precious:adj.宝贵的;珍贵的;矫揉造作的; emotions:n.强烈的感情;激情;情感;(emotion的复数) interactions:n.[计]交互,相互作用;相互交流;干扰;(interaction复数) quantified:adj.量化的,定量;v.被量化(quantify的过去分词);
It's the kind of data that I collected for Nokia that comes in in the form of a very small sample size, but delivers incredible depth of meaning. 大数据能在一定程度上洞悉问题, 并最大程度发挥机器智能, 而厚数据能幫我们找到 那缺失的背景资讯,
incredible:adj.难以置信的,惊人的;
And what makes it so thick and meaty is the experience of understanding the human narrative. 能让大数据便于使用, 并最大程度发挥人类智能。
meaty:adj.肉的;多肉的;似肉的; narrative:n.叙述;故事;讲述;adj.叙事的,叙述的;叙事体的;
And that's what helps to see what's missing in our models. 若你真的把这两个结合在一起 事情就会变得非常有趣,
Thick data grounds our business questions in human questions, and that's why integrating big and thick data forms a more complete picture. 如此一来,运用的就不只是 你早就收集的数据。 你还可以运用尚未收集的数据。 你就可以知道「为什么」:
integrating:v.(使)合并,成为一体;(使)加入,融入群体;(integrate的现在分词)
Big data is able to offer insights at scale and leverage the best of machine intelligence, whereas thick data can help us rescue the context loss that comes from making big data usable, and leverage the best of human intelligence. 为什么会变成这样? 所以说,网飞这样做 就开启了转换商業模式的全新方式。 网飞以拥有优秀的推荐演算法而闻名, 且发给任何能改善系统的人 一百万美元奖金。
scale:n.规模;比例;鳞;刻度;天平;数值范围;v.衡量;攀登;剥落;生水垢; leverage:n.影响力;杠杆作用;杠杆效力;v.举债经营;借贷收购; intelligence:n.智力;智慧;才智;(尤指关于敌国的)情报; whereas:conj.然而;鉴于;反之; rescue:n.救援;抢救;营救;获救;v.抢救;营救;援救; context:n.环境;上下文;来龙去脉;
And when you actually integrate the two, that's when things get really fun, because then you're no longer just working with data you've already collected. 有人赢了奖金。 但网飞发现效能提升还是不够明显。 为了知道发生了什么事,
integrate:v.成为一体;(使)加入;adj.完全的;
You get to also work with data that hasn't been collected. 他们雇用了人类学家, 格兰特.麦克拉肯,
You get to ask questions about why: 来收集厚数据以准确洞察理解。
Why is this happening? 他发现了网飞最初未能 从量化数据中看出来的,
Now, when Netflix did this, they unlocked a whole new way to transform their business. 他发现人们喜欢刷剧。 (注:短时间内狂看电视剧) 事实上,人们甚至不觉得有什么不对。
Netflix:n.网飞公司(出租DVD;在线观看电影的网站。); transform:v.使改变;使改观;使转换;n.[数]变换式;[化]反式;
Netflix is known for their really great recommendation algorithm, and they had this $1 million prize for anyone who could improve it. 他们非常享受这个过程。 (笑声)
recommendation:n.推荐;介绍;提议;正式建议; improve:v.改进;改善;
And there were winners. 网飞觉得:「噢,这是个新洞见。」
But Netflix discovered the improvements were only incremental. 于是叫他们的数据科学组
improvements:n.改善;改进;改善的事物;(improvement的复数) incremental:adj.增加的,增值的;
So to really find out what was going on, they hired an ethnographer, Grant McCracken, to gather thick data insights. 把这洞察放大到 量化数据的规模来衡量。 一旦他们再次确认了它的准确性, 网飞便决定做一件简单 却影响很大的事情。
And what he discovered was something that they hadn't seen initially in the quantitative data. 他们说: 「与其提供不同类型但相似的影集,
initially:adv.最初,首先;开头; quantitative:adj.定量的;量的,数量的;
He discovered that people loved to binge-watch. 或是给类似的观众 欣赏更多不同的影集,
binge-watch:vt.煲剧,刷剧;
In fact, people didn't even feel guilty about it. 只要同一影集提供更多集就好了。
guilty:adj.有罪的;内疚的;
They enjoyed it. 我们让你更容易刷剧。」
(Laughter) 而他们并没有止步于此。
So Netflix was like, "Oh. This is a new insight." 他们用一样的方式,
So they went to their data science team, and they were able to scale this big data insight in with their quantitative data. 重新设计了整个观众体验, 来真正地鼓励大家刷剧。 这就是为什么朋友会消失整个星期,
And once they verified it and validated it, 追上「无为大师」等戏剧的进度。
verified:adj.已查清的,已证实的; validated:v.证实;确认;使生效;批准;认可;(validate的过去分词和过去式)
Netflix decided to do something very simple but impactful. 结合大数据与厚数据,
impactful:adj.有效的;有力的;
They said, instead of offering the same show from different genres or more of the different shows from similar users, we'll just offer more of the same show. 不只让产業进步, 也转变了我们使用媒体的型态。 预期他们的股票 会在接下来几年内翻倍。
genres:n.流派(genre的复数);体裁;种类;
We'll make it easier for you to binge-watch. 这不只是关于看了更多影片,
And they didn't stop there. 或卖了更多智慧型手机,等等。
They did all these things to redesign their entire viewer experience, to really encourage binge-watching. 对于一些公司来说, 结合厚数据洞察和演算法, 可能让他们起死回生,
redesign:vt.重新设计;n.重新设计;新设计;
It's why people and friends disappear for whole weekends at a time, catching up on shows like "Master of None." 特别是那些已被边缘化的公司。 全国的警察局都用大数据来防止犯罪,
disappear:v.消失;失踪;不复存在;
By integrating big data and thick data, they not only improved their business, but they transformed how we consume media. 来设定保证金金额, 并用加剧偏见的方式来建议判刑。
improved:adj.改良的;v.改进;改善(improve的过去分词和过去式) transformed:v.使改变形态;使改变外观(或性质);(transform的过去分词和过去式) consume:v.消耗;吃;毁灭;烧毁; media:n.媒体;媒质(medium的复数);血管中层;浊塞音;中脉;
And now their stocks are projected to double in the next few years. 美国国家安全局的天网学习演算法
stocks:n.[金融]股票; v.采购;
But this isn't just about watching more videos or selling more smartphones. 可能致使几千名巴基斯坦平民死亡, 肇因于错误判读了行动电话的数据。
For some, integrating thick data insights into the algorithm could mean life or death, especially for the marginalized. 当我们的生活变得更加自动化, 从汽车、健康保险或者就業, 很可能我们所有人
marginalized:使边缘化;忽略;排斥(marginalize的过去式和过去分词);
All around the country, police departments are using big data for predictive policing, to set bond amounts and sentencing recommendations in ways that reinforce existing biases. 都会受量化偏见的影响。 好消息是 我们从吸入乙烯气体到做出预测 已有长足的进步。
predictive:adj.预言性的;成为前兆的; recommendations:n.推荐;推荐信;推荐规范(recommendation的复数形式); reinforce:vt.加强,加固; vi.求援; n.加强; biases:n.偏差,偏见(bias的复数形式);v.偏见(bias的三单形式);
NSA's Skynet machine learning algorithm has possibly aided in the deaths of thousands of civilians in Pakistan from misreading cellular device metadata. 我们有了更好的工具, 那麽让我们更好地利用它。 让我们将大数据与厚数据结合。 让我们使神庙的引导人员 与先知一起合作,
Skynet:n.天网(卫星);天网防火墙; civilians:n.平民;市民;文官;无军职人员;(civilian的复数) misreading:n.读错;误解;v.误解;读错;误读;(misread的现在分词) cellular:adj.细胞的;多孔的;由细胞组成的;n.移动电话;单元; device:n.装置;策略;图案; metadata:n.[计]元数据;
As all of our lives become more automated, from automobiles to health insurance or to employment, it is likely that all of us will be impacted by the quantification bias. 不论做这项工作的是 公司、非营利组织、 政府,甚至软体, 全部都有其意义,
automated:adj.自动化的;v.(使)自动化;(automate的过去式和过去分词) automobiles:n.[车辆]汽车,发动器(automobile复数);关于汽车; insurance:n.保险;保险业;保险费;保费;adj.胜券在握的; employment:n.使用;职业;雇用; impacted:adj.压紧的;结实的;嵌入的;(人口)稠密的;v.装紧;挤满(impact的过去分词);
Now, the good news is that we've come a long way from huffing ethylene gas to make predictions. 因为这代表我们全体一起努力 来得到更好的数据,
come a long way:突飞猛进;
We have better tools, so let's just use them better. 更好的演算法、更好的产品,
Let's integrate the big data with the thick data. 以及更好的决定。