Transcript
00:00:00 – Explaining model jaggedness
00:00:00 – 讲解模型的“锯齿感”
Ilya Sutskever 00:00:00
You know what’s crazy? That all of this is real.
你知道最疯狂的是什么吗?就是这一切都是真的。
Dwarkesh Patel 00:00:04
Meaning what?
什么意思?
Ilya Sutskever 00:00:05
Don’t you think so? All this AI stuff and all this Bay Area… that it’s happening. Isn’t it straight out of science fiction?
你不这么觉得吗?这些关于 AI 的东西,还有整个 Bay Area……这一切真的在发生。难道不是活脱脱从科幻小说里走出来的吗?
Dwarkesh Patel 00:00:14
Another thing that’s crazy is how normal the slow takeoff feels. The idea that we’d be investing 1% of GDP in AI, I feel like it would have felt like a bigger deal, whereas right now it just feels...
还有一件疯狂的事是,“慢启动”起来之后感觉有多正常。按理说,我们在 AI 上投入 1% 的 GDP,这件事应该会让人感觉是个超级大的事情,但现在感觉就只是……
Ilya Sutskever 00:00:26
We get used to things pretty fast, it turns out. But also it’s kind of abstract. What does it mean? It means that you see it in the news, that such and such company announced such and such dollar amount. That’s all you see. It’s not really felt in any other way so far.
结果证明我们适应得非常快。但这件事还有点抽象。什么意思呢?就是你只是在新闻里看到,某某公司宣布又投入了多少多少美元。你能看到的就只有这些。到目前为止,在其它层面上其实还没有什么真实的体感。
Dwarkesh Patel 00:00:45
Should we actually begin here? I think this is an interesting discussion.
我们要不要就从这里正式开始?我觉得这个话题挺有意思的。
Ilya Sutskever 00:00:47
Sure.
好啊。
Dwarkesh Patel 00:00:48
I think your point, about how from the average person’s point of view nothing is that different, will continue being true even into the singularity.
我觉得你刚才那个观点——从普通人的视角看起来,好像并没有什么太大不同——就算到了奇点时代,可能也会继续成立。
Ilya Sutskever 00:00:57
No, I don’t think so.
不,我不这么认为。
Dwarkesh Patel 00:00:58
Okay, interesting.
好,有意思。
Ilya Sutskever 00:01:00
The thing which I was referring to not feeling different is, okay, such and such company announced some difficult-to-comprehend dollar amount of investment. I don’t think anyone knows what to do with that.
我刚才说“感觉没什么不同”的意思是,比如,某家公司宣布了一笔很难理解规模的投资金额。我觉得没有人真正知道该如何对待这种信息。
But I think the impact of AI is going to be felt. AI is going to be diffused through the economy. There’ll be very strong economic forces for this, and I think the impact is going to be felt very strongly.
但我认为,AI 的影响是一定会被切实感受到的。AI 会在整个经济体系中扩散,会有非常强大的经济力量推动这种扩散,我觉得它带来的冲击会非常明显。
Dwarkesh Patel 00:01:30
When do you expect that impact? I think the models seem smarter than their economic impact would imply.
你觉得这种影响会在什么时候真正显现出来?在我看来,现在模型表现出来的“聪明程度”,要远远超过它们目前在经济上体现出来的影响。
Ilya Sutskever 00:01:38
Yeah. This is one of the very confusing things about the models right now. How to reconcile the fact that they are doing so well on evals? You look at the evals and you go, “Those are pretty hard evals.” They are doing so well. But the economic impact seems to be dramatically behind. It’s very difficult to make sense of, how can the model, on the one hand, do these amazing things, and then on the other hand, repeat the same mistake twice in some situation?
是的。这正是当下这些模型非常让人困惑的地方之一。我们要怎么调和这样一个事实:它们在各种评测上的表现非常好?你看那些评测,会觉得“这些测试其实相当难”,而模型在上面表现得很好。但它们在经济层面造成的影响又似乎严重滞后。很难理解,为什么一个模型一方面可以做出那些惊人的事情,另一方面却在某些场景下会把同样的错误重复两次?
An example would be, let’s say you use vibe coding to do something. You go to some place and then you get a bug. Then you tell the model, “Can you please fix the bug?” And the model says, “Oh my God, you’re so right. I have a bug. Let me go fix that.” And it introduces a second bug. Then you tell it, “You have this new second bug,” and it tells you, “Oh my God, how could I have done it? You’re so right again,” and brings back the first bug, and you can alternate between those. How is that possible? I’m not sure, but it does suggest that something strange is going on.
比如说,你用 vibe coding 去完成一项任务。你跑到某个地方时遇到一个 bug,于是你对模型说:“你能帮我修一下这个 bug 吗?”模型回答:“天哪,你说得太对了,确实有个 bug,我这就给你修。”结果它又引入了第二个 bug。然后你告诉它:“现在你有了一个新的第二个 bug。”它又说:“天哪,我怎么会犯这种错误?你再次说得完全正确。”接着又把第一个 bug 带了回来,你们就这样在两个 bug 之间来回切换。这怎么可能?我也不确定,但这确实说明有些很奇怪的事情在发生。
I have two possible explanations. The more whimsical explanation is that maybe RL training makes the models a little too single-minded and narrowly focused, a little bit too unaware, even though it also makes them aware in some other ways. Because of this, they can’t do basic things.
我有两个可能的解释。一个比较“异想天开”的解释是,也许 RL 训练让模型有点过于单一、过于聚焦,在某些方面反而有些“不自知”,尽管 RL 在另外一些方面又确实提升了它们的觉察能力。正因为这种偏狭,它们在一些非常基本的事情上反而做不好。
But there is another explanation. Back when people were doing pre-training, the question of what data to train on was answered, because that answer was everything. When you do pre-training, you need all the data. So you don’t have to think if it’s going to be this data or that data.
但还有另一种解释。早期大家做 pre-training 的时候,“该用什么数据来训练”这个问题其实已经被回答了,因为答案就是:一切能拿到的数据。做预训练时,你需要所有的数据,所以根本不需要纠结到底是用这份数据还是那份数据。
But when people do RL training, they do need to think. They say, “Okay, we want to have this kind of RL training for this thing and that kind of RL training for that thing.” From what I hear, all the companies have teams that just produce new RL environments and just add it to the training mix. The question is, well, what are those? There are so many degrees of freedom. There is such a huge variety of RL environments you could produce.
但在做 RL 训练时,人们就必须要思考取舍了。他们会说:“好,我们要为这件事设计一类 RL 训练,为那件事设计另一类 RL 训练。”据我了解,各家公司都有专门的团队,不断产出新的 RL environment 然后加进训练配方里。问题在于,这些 environment 究竟是什么?自由度太多了,你可以构造出来的 RL 环境种类极其庞杂。
One thing you could do, and I think this is something that is done inadvertently, is that people take inspiration from the evals. You say, “Hey, I would love our model to do really well when we release it. I want the evals to look great. What would be RL training that could help on this task?” I think that is something that happens, and it could explain a lot of what’s going on.
有一件事是大家可以做、而且我认为实际上也在无意间发生的,就是人们会从 evals 中汲取灵感。你会说:“我希望我们发布模型时,它在这些 evals 上表现得非常好、分数好看。那我们应该设计什么样的 RL 训练来帮助它在这些任务上拿高分呢?”我觉得这种事情确实在发生,而且它可以解释目前很多现象。
If you combine this with the models’ generalization actually being inadequate, that has the potential to explain a lot of what we are seeing: this disconnect between eval performance and actual real-world performance, which is something we don’t even fully understand today, in the sense of what exactly we mean by it.
如果再把这一点同“模型的泛化其实并不充分”结合起来看,就很有可能解释我们现在看到的很多情况——也就是评估表现和真实世界表现之间的这种脱节。更棘手的是,直到今天,我们甚至还没有完全搞清楚,当我们说这种“脱节”时,究竟在精确地指什么。
Annotator’s note: People who do nothing but rote learning are showing exactly this kind of inadequate generalization; Ilya Sutskever sees it very clearly.
Dwarkesh Patel 00:05:00
I like this idea that the real reward hacking is the human researchers who are too focused on the evals.
我挺喜欢这个说法:真正在做“reward hacking”的,其实是那些过度关注 eval 的人类研究者。
I think there are two ways to understand, or to try to think about, what you have just pointed out. One is that if it’s the case that simply by becoming superhuman at a coding competition, a model will not automatically become more tasteful and exercise better judgment about how to improve your codebase, well then you should expand the suite of environments such that you’re not just testing it on having the best performance in coding competition. It should also be able to make the best kind of application for X thing or Y thing or Z thing.
我觉得可以用两种方式来理解、或者说来思考你刚刚指出的那个现象。第一种是:如果事实是,仅仅在 coding competition 上变得超越人类,并不会自动让一个模型在整体上变得更有“品味”,也不会让它在如何改进你的 codebase 这件事上做出更好的判断,那你就应该扩展它所处的环境集合,而不是只看它在 coding competition 上是不是拿了最好的成绩。你还应该让它能为 X、Y、Z 各种不同的应用场景,做出最好的那一类真实应用。
Another, maybe this is what you’re hinting at, is to say, “Why should it be the case in the first place that becoming superhuman at coding competitions doesn’t make you a more tasteful programmer more generally?” Maybe the thing to do is not to keep stacking up the amount and diversity of environments, but to figure out an approach which lets you learn from one environment and improve your performance on something else.
另一种理解,也许这就是你在暗示的,是要问一句:“为什么一开始就应该默认,在 coding competitions 上变得超越人类,不会让你在整体上变成一个更有品味的程序员呢?”也许真正要做的事情,不是无止境地堆环境的数量和多样性,而是想办法找到一种路径,让你能从一个环境里学到东西,并把这种能力迁移到别的任务上、在别的事情上表现得更好。
Ilya Sutskever 00:06:08
I have a human analogy which might be helpful. Let’s take the case of competitive programming, since you mentioned that. Suppose you have two students. One of them decided they want to be the best competitive programmer, so they will practice 10,000 hours for that domain. They will solve all the problems, memorize all the proof techniques, and be very skilled at quickly and correctly implementing all the algorithms. By doing so, they became one of the best.
我有一个关于人的类比,可能会有帮助。既然你刚才提到 competitive programming,我们就用这个例子。假设有两个学生,其中一个决定要成为最强的 competitive programmer,于是他在这个领域上投入了一万小时的训练,把所有题目都刷完,把所有证明技巧都背下来,并且非常熟练地、又快又准地实现所有算法。通过这一切,他成了这个领域里数一数二的人。
Student number two thought, “Oh, competitive programming is cool.” Maybe they practiced for 100 hours, much less, and they also did really well. Which one do you think is going to do better in their career later on?
第二个学生只是觉得:“哦,competitive programming 挺酷的。”也许他只练了 100 小时,少得多,但他也做得不错。那你觉得,等他们走上职业道路之后,哪一个未来发展会更好?
Dwarkesh Patel 00:06:56
The second.
第二个。
Ilya Sutskever 00:06:57
Right. I think that’s basically what’s going on. The models are much more like the first student, but even more. Because then we say, the model should be good at competitive programming so let’s get every single competitive programming problem ever. And then let’s do some data augmentation so we have even more competitive programming problems, and we train on that. Now you’ve got this great competitive programmer.
对。我觉得现在发生的事情本质上就是这样。模型更像第一个学生,甚至有过之而无不及。因为我们会说:模型应该在 competitive programming 上很强,于是就把所有出现过的竞赛题统统抓来,然后再做 data augmentation,造出更多的竞赛题,再用这些东西去训练它。结果就是,你得到了一个极其强大的“竞赛程序员”。
With this analogy, I think it’s more intuitive. Yeah, okay, if it’s so well trained, all the different algorithms and all the different proof techniques are right at its fingertips. And it’s more intuitive that with this level of preparation, it would not necessarily generalize to other things.
在这个类比下,我觉得就更直观了。是的,既然它被训练得这么好,各种算法、各种证明技巧都信手拈来,那么也就更容易理解:在这样一种高度专门化的准备之下,它未必能很好地泛化到别的事情上去。
Dwarkesh Patel 00:07:39
But then what is the analogy for what the second student is doing before they do the 100 hours of fine-tuning?
但如果这样的话,那么在第二个学生进行那 100 小时“微调”之前,他在做的事情,在这个类比里对应的是什么?
Ilya Sutskever 00:07:48
I think they have “it.” The “it” factor. When I was an undergrad, I remember there was a student like this that studied with me, so I know it exists.
我觉得他们身上有“it”,就是所谓的“it factor”。我上本科的时候,身边就有这样一个同学跟我一起学习,所以我知道这种东西是存在的。
Annotator’s note: The first student relies on pure rote memorization and clearly lacks the ability to generalize; the second student is more efficient. Behind that efficiency is a simple and elegant knowledge structure; behind the simplicity and elegance is a sense of security; and a sense of security expressed through simple, elegant form is what talent (the “it factor”) looks like.
Dwarkesh Patel 00:08:01
I think it’s interesting to distinguish “it” from whatever pre-training does. One way to understand what you just said about not having to choose the data in pre-training is to say it’s actually not dissimilar to the 10,000 hours of practice. It’s just that you get that 10,000 hours of practice for free because it’s already somewhere in the pre-training distribution. But maybe you’re suggesting there’s actually not that much generalization from pre-training. There’s just so much data in pre-training, but it’s not necessarily generalizing better than RL.
我觉得把“it”和 pre-training 所做的事情区分开来这点挺有意思的。理解你刚才说“预训练阶段不用仔细挑数据”的一种方式,是把它看成跟“一万小时刻意练习”有点类似。只是说,你那“一万小时的练习”是白送的,因为它已经以某种形式存在于 pre-training 的数据分布里了。但也许你在暗示的是:预训练本身的泛化其实没有那么强,只是因为预训练里有海量的数据,但它的泛化未必就比 RL 更好。
Ilya Sutskever 00:08:31
The main strength of pre-training is that: A, there is so much of it, and B, you don’t have to think hard about what data to put into pre-training. It’s very natural data, and it does include in it a lot of what people do: people’s thoughts and a lot of the features. It’s like the whole world as projected by people onto text, and pre-training tries to capture that using a huge amount of data.
pre-training 的主要优势在于两点:第一,数据量极其庞大;第二,你不需要为“往里放什么数据”绞尽脑汁。它用的是非常“天然”的数据,而这些数据里包含了大量人类的行为、人类的想法,以及各种各样的特征。它有点像是“整个人类世界被投影到文本上的结果”,而 pre-training 就是在用海量数据去捕捉这个投影。
Pre-training is very difficult to reason about because it’s so hard to understand the manner in which the model relies on pre-training data. Whenever the model makes a mistake, could it be because something by chance is not as supported by the pre-training data? “Support by pre-training” is maybe a loose term. I don’t know if I can add anything more useful on this. I don’t think there is a human analog to pre-training.
pre-training 本身其实很难被彻底“想明白”,因为我们很难真正理解模型究竟以什么方式在依赖这些预训练数据。每当模型犯错时,你都会问:会不会只是因为恰好有些东西在预训练数据中“支持得不够”?“被 pre-training 支持”这个说法本身可能就比较松散。我也不确定自己在这方面还能补充多少有用的东西。我不认为在人类身上存在一个真正对应的、与 pre-training 等价的东西。
Annotator’s note: It may well be necessary to first instill a simple, elegant knowledge structure and then redo pre-training on top of it.
00:09:39 – Emotions and value functions
00:09:39 – 情绪与价值函数
Dwarkesh Patel 00:09:39
Here are analogies that people have proposed for what the human analogy to pre-training is. I’m curious to get your thoughts on why they’re potentially wrong. One is to think about the first 18, or 15, or 13 years of a person’s life when they aren’t necessarily economically productive, but they are doing something that is making them understand the world better and so forth. The other is to think about evolution as doing some kind of search for 3 billion years, which then results in a human lifetime instance.
这里有几种别人提出过的类比,用来解释在人类身上什么东西最接近 pre-training。我挺好奇你怎么看这些类比哪里可能不对。第一种是把一个人生命中前 18 年、15 年,或者 13 年看成类比——在那段时间里,他在经济上未必是“有产出”的,但确实在做一些让自己更好理解世界的事情。另一个类比是把进化本身看成一种持续了 30 亿年的“搜索过程”,最终在一个具体的人类一生中体现出来。
I’m curious if you think either of these are analogous to pre-training. How would you think about what lifetime human learning is like, if not pre-training?
我想问你觉得这两种说法有没有哪个真算得上是 pre-training 的类比?如果都算不上,你会怎么理解“人类一生中的学习”这件事,而不是把它简单等同于 pre-training?
Ilya Sutskever 00:10:22
I think there are some similarities between both of these and pre-training, and pre-training tries to play the role of both of these. But I think there are some big differences as well. The amount of pre-training data is very, very staggering.
我觉得这两种东西跟 pre-training 都有一些相似之处,而 pre-training 也确实是在试图扮演这两种角色。但同时,我也认为它们之间存在非常大的差别。pre-training 所用的数据量是大得惊人的。
Dwarkesh Patel 00:10:39
Yes.
是的。
Ilya Sutskever 00:10:40
Somehow a human being, even after 15 years with a tiny fraction of the pre-training data, knows much less. But whatever they do know, they know much more deeply somehow. Already at that age, you would not make the mistakes that our AIs make.
但不知为何,人类在 15 年之后,只经历了远远少于 pre-training 的那点“数据”,他们所掌握的东西在数量上少得多,可是他们对那些东西的理解深度却要强得多。在那个年龄,人已经不会犯我们今天的 AI 会犯的那些错误了。
There is another thing. You might say, could it be something like evolution? The answer is maybe. But in this case, I think evolution might actually have an edge. I remember reading about this case. One way in which neuroscientists can learn about the brain is by studying people with brain damage to different parts of the brain. Some people have the most strange symptoms you could imagine. It’s actually really, really interesting.
还有一点。你可以说,这会不会更像进化?答案是:也许有点像。但在这里,我觉得进化其实还有它占优的一面。我记得自己曾看过一个案例。神经科学家研究大脑的一个方式,是去研究那些大脑不同区域受损的人。有些人的症状离奇到超出你的想象,这些案例其实非常非常有趣。
One case that comes to mind that’s relevant. I read about this person who had some kind of brain damage, a stroke or an accident, that took out his emotional processing. So he stopped feeling any emotion. He still remained very articulate and he could solve little puzzles, and on tests he seemed to be just fine. But he felt no emotion. He didn’t feel sad, he didn’t feel anger, he didn’t feel animated. He became somehow extremely bad at making any decisions at all. It would take him hours to decide on which socks to wear. He would make very bad financial decisions.
有一个和这里很相关的案例一直让我印象深刻。我读到过一个人,他因为中风或者某种意外,导致大脑中负责情绪处理的部分受损,于是他完全失去了情绪体验。他依然能说会道,做一些小的智力题也完全没问题,各种测试看起来都挺正常。但他不再有任何情绪体验——他不会感到悲伤,不会愤怒,也不会兴奋振作。结果是,他变得极其不擅长做任何决定。甚至光是决定要穿哪一双袜子,就能耗上好几个小时;在财务决策上也会做出非常糟糕的选择。
Annotator’s note: The ability to sense safety or risk is gone, and with it the reward function that points the way. Humans are still wild animals; sensing risk or safety may be the earliest capability to have been fully shaped by evolution.
What does it say about the role of our built-in emotions in making us a viable agent, essentially? To connect to your question about pre-training, maybe if you are good enough at getting everything out of pre-training, you could get that as well. But that’s the kind of thing which seems... Well, it may or may not be possible to get that from pre-training.
这件事说明了什么?它说明我们内置的情绪在让我们变成一个“能在世界里正常行动的智能体”这件事上,扮演了多么重要的角色。把这点再拉回到你关于 pre-training 的问题,也许如果你足够擅长从 pre-training 里把所有信息都挖干净,你也能学到类似的东西。但这种能力给人的感觉是……嗯,从 pre-training 里到底能不能获得这种东西,说不定也行,说不定也不行。
Annotator’s note: How do you instill in a new kind of being an implicit preference for “beauty” or elegance? Where would that come from?
Dwarkesh Patel 00:12:56
What is “that”? Clearly not just directly emotion. It seems like some almost value function-like thing which is telling you what the end reward for any decision should be. You think that doesn’t sort of implicitly come from pre-training?
你说的“that”到底是什么?显然不只是情绪本身,它更像是一种接近 value function 的东西,在告诉你每个决策最终对应的回报应该是什么。你觉得这种东西不会在某种意义上隐含地来自 pre-training 吗?
Ilya Sutskever 00:13:15
I think it could. I’m just saying it’s not 100% obvious.
我觉得有可能会来自那里。我只是说,这点并不是百分之百显而易见的。
Dwarkesh Patel 00:13:19
But what is that? How do you think about emotions? What is the ML analogy for emotions?
但它究竟是什么?你是怎么看待情绪的?在机器学习里,情绪对应的类比会是什么?
Ilya Sutskever 00:13:26
It should be some kind of a value function thing. But I don’t think there is a great ML analogy because right now, value functions don’t play a very prominent role in the things people do.
它应该是一种类似 value function 的东西。但我不觉得在 ML 里有一个特别好的类比,因为目前在大家真正做的很多事情里,value function 并没有扮演特别核心、突出的角色。
Dwarkesh Patel 00:13:36
It might be worth defining for the audience what a value function is, if you want to do that.
也许可以先给大家解释一下什么是 value function,如果你方便的话。
Ilya Sutskever 00:13:39
Certainly, I’ll be very happy to do that. When people do reinforcement learning, the way reinforcement learning is done right now, how do people train those agents? You have your neural net and you give it a problem, and then you tell the model, “Go solve it.” The model takes maybe thousands, hundreds of thousands of actions or thoughts or something, and then it produces a solution. The solution is graded.
当然,我很乐意解释。当人们在做 reinforcement learning 的时候,以现在主流的做法来看,人们是怎么训练这些 agent 的呢?你有一个神经网络,给它一个问题,然后对模型说:“去把这个问题解决掉。”模型可能会走过成千上万、甚至几十万步的 action 或思考之类的过程,最后给出一个解,这个解会被打分。
And then the score is used to provide a training signal for every single action in your trajectory. That means that if you are doing something that goes for a long time—if you’re training on a task that takes a long time to solve—it will do no learning at all until you come up with the proposed solution. That’s how reinforcement learning is done naively. That’s how o1, R1 ostensibly are done.
然后,这个分数会被用来给整条轨迹上的每一个 action 提供训练信号。这意味着,如果你在做一件“过程很长”的事情——比如在训练一个需要很久才能得出结果的任务——在你最终给出候选解之前,模型实际上是“完全没有学习”的。这就是最朴素的 reinforcement learning 的做法,也是 o1、R1 表面上看起来的做法。
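To make the naive setup concrete, here is a minimal sketch in Python of REINFORCE-style training on a toy problem: the policy takes many actions, only the end state is graded, and that single score is broadcast back as the training signal for every action. The toy environment, reward, and single-parameter policy are hypothetical illustrations, not anything any lab actually runs.

```python
# Minimal sketch (hypothetical toy problem) of naive end-to-end RL:
# many actions, one grade at the very end, and that single score is the
# training signal for every action in the trajectory (REINFORCE-style).
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def run_episode(theta, steps=50):
    """Roll out a trajectory of +1/-1 actions; grade only the final state."""
    actions = []
    position = 0
    p_up = sigmoid(theta)
    for _ in range(steps):
        a = 1 if random.random() < p_up else -1
        actions.append(a)
        position += a
    final_reward = -abs(position - 20)  # one score, produced only at the end
    return actions, final_reward

def reinforce_update(theta, actions, final_reward, lr=0.01):
    """Apply the same end-of-episode reward to every action taken."""
    p_up = sigmoid(theta)
    grad = 0.0
    for a in actions:
        # gradient of log pi(a | theta) for a Bernoulli policy over {+1, -1}
        grad_log_p = (1.0 - p_up) if a == 1 else -p_up
        grad += final_reward * grad_log_p
    return theta + lr * grad

theta = 0.0
for _ in range(200):
    actions, reward = run_episode(theta)
    theta = reinforce_update(theta, actions, reward)
# No learning signal exists until a full rollout has been produced and graded.
```

The only point of the sketch is the shape of the credit assignment: every action, useful or not, receives the same scalar signal, which is what the value-function discussion that follows is trying to improve on.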
The value function says something like, “Maybe I could sometimes, not always, tell you if you are doing well or badly.” The notion of a value function is more useful in some domains than others. For example, when you play chess and you lose a piece, you know you messed up. You don’t need to play the whole game to know that what you just did was bad, and therefore whatever preceded it was also bad.
而 value function 想做的是类似这样的事情:“也许我有时——不是总是——可以告诉你,你现在是在往好的方向走,还是在往坏的方向走。”value function 这个概念在某些领域比在另一些领域更有用。比如下棋的时候,如果你丢了一个子,那就说明“我刚才那步下错了”。你不需要等整盘棋下完才知道刚刚那一步很糟糕,从而也能推回去说在那之前的一连串决策其实也不理想。
Annotator’s note: As I read Ilya Sutskever, even a value function as simple as “believing you will get enough to eat” can judge, early along a path of thought, which directions are right and which are wrong.
The value function lets you short-circuit the wait until the very end. Let’s suppose that you are doing some kind of a math thing or a programming thing, and you’re trying to explore a particular solution or direction. After, let’s say, a thousand steps of thinking, you concluded that this direction is unpromising. As soon as you conclude this, you could already get a reward signal a thousand timesteps previously, when you decided to pursue down this path. You say, “Next time I shouldn’t pursue this path in a similar situation,” long before you actually came up with the proposed solution.
value function 让你不必一直等到“最后结果”才开始学习。假设你在做一道数学题,或者在写程序,你在尝试探索某个特定的解法或方向。想了大概一千步之后,你得出结论:这个方向基本没戏。一旦你得出这个结论,其实就可以把奖励信号回灌到一千步之前——也就是你最初决定走这条路的那个时刻。你会说:“下次在类似情形下,我就不该再走这条路了。”而这一切发生在你真正给出最终候选解之前很久。
Annotator’s note: This goes a step further than 田渊栋’s explanation and is very close to our own thinking. Most mistakes are wrong from the very starting point, and once you get drawn in it becomes a mess; many people stay drawn in for a lifetime. An AI can easily roll back to a thousand steps earlier, whereas people keep reinforcing themselves and walking further down the wrong path.
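As a contrast to the sketch above, here is a minimal, hypothetical tabular illustration of the short-circuiting described here: a bootstrapped TD(0) update lets the step where a direction was chosen receive feedback as soon as a later state is judged unpromising, without waiting for a final graded solution. The state names, constants, and trace are made up for illustration.

```python
# Minimal sketch (hypothetical, tabular) of a value function short-circuiting
# the wait for the final answer: once a later state is judged unpromising,
# the earlier decision point gets a signal via a bootstrapped TD(0) update.
values = {}            # state -> estimated value, defaulting to 0.0
GAMMA, LR = 0.99, 0.1

def td_update(prev_state, reward, next_state):
    """Pull prev_state's estimate toward reward + GAMMA * V(next_state)."""
    target = reward + GAMMA * values.get(next_state, 0.0)
    old = values.get(prev_state, 0.0)
    values[prev_state] = old + LR * (target - old)

# Toy trace: we chose to pursue "branch_A", thought for a while, and then
# concluded the direction is a dead end. In practice the signal flows back
# through the intermediate states (or via multi-step returns); a single
# adjacent pair is shown here for brevity.
values["dead_end"] = -1.0                       # "this direction is unpromising"
td_update("branch_A", reward=0.0, next_state="dead_end")
print(values["branch_A"])                       # already negative, long before any final solution
```

The design point is only that the update to “branch_A” happens the moment the dead end is recognized, not when the episode ends.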
Dwarkesh Patel 00:15:52
This was in the DeepSeek R1 paper, that the space of trajectories is so wide that maybe it’s hard to learn a mapping from an intermediate trajectory to a value. And also given that, in coding for example you’ll have the wrong idea, then you’ll go back, then you’ll change something.
DeepSeek R1 的论文里提到过这样一个观点——轨迹空间太大了,所以也许很难从某个中间轨迹学到一条到 value 的映射。再加上,在写代码这类场景中,你一开始可能思路就不对,然后你会回退、再修改一些东西,过程本身就曲折反复。
Ilya Sutskever 00:16:12
This sounds like such a lack of faith in deep learning. Sure it might be difficult, but nothing deep learning can’t do. My expectation is that a value function should be useful, and I fully expect that they will be used in the future, if not already.
这听起来简直像是对 deep learning 缺乏信心。的确,这件事可能很难,但没什么是 deep learning 做不到的。我的预期是,value function 应该是有用的——即便现在还没被广泛用起来,我也完全相信未来一定会用到它们。
What I was alluding to with the person whose emotional center got damaged, it’s more that maybe what it suggests is that the value function of humans is modulated by emotions in some important way that’s hardcoded by evolution. And maybe that is important for people to be effective in the world.
我刚才提到那个情绪中枢受损的病人,更想暗示的是:也许在人类身上,所谓的 value function 是被情绪以某种重要方式调制的,而这种调制是被进化“硬编码”进去的。也许正是这种机制,让人类能够在真实世界中较为有效地行动。
Annotator’s note: Buffett holds that temperament accounts for 90% of what matters, and that most bad judgments are simply the result of being pulled off course by the wrong emotions. But clearly, correct intuition in situations free of emotional interference matters a great deal too.
Dwarkesh Patel 00:17:00
That’s the thing I was planning on asking you. There’s something really interesting about emotions as a value function, which is that it’s impressive that they have this much utility while still being rather simple to understand.
这正是我本来打算问你的地方。关于“情绪作为 value function 的一部分”有一点非常有意思:它们一方面极其有用,另一方面在人类看来又相对容易理解,这点非常令人惊讶。
Ilya Sutskever 00:17:15
I have two responses. I do agree that compared to the kind of things that we learn and the things we are talking about, the kind of AI we are talking about, emotions are relatively simple. They might even be so simple that maybe you could map them out in a human-understandable way. I think it would be cool to do.
我有两个回应。第一,我同意,相对于我们在讨论的这些东西、相对于我们在谈的这种 AI 能力而言,情绪要简单得多。它们甚至可能简单到,你完全可以用一种人类可理解的方式把它们系统地描出来——我觉得如果真能做到这一点会非常酷。
In terms of utility though, I think there is a thing where there is this complexity-robustness tradeoff, where complex things can be very useful, but simple things are very useful in a very broad range of situations. One way to interpret what we are seeing is that we’ve got these emotions that evolved mostly from our mammal ancestors and then fine-tuned a little bit while we were hominids, just a bit. We do have a decent amount of social emotions though which mammals may lack. But they’re not very sophisticated. And because they’re not sophisticated, they serve us so well in this very different world compared to the one that we’ve been living in.
但从“效用”角度看,我觉得这里存在一个“复杂度—鲁棒性”的权衡:复杂的东西在特定领域可以非常强大,而简单的东西在极其广泛的情境下都挺好用。我们现在看到的现象,可以这样解释:我们的情绪大部分是从哺乳动物祖先那里进化来的,然后在人类演化为类人猿、再到现代人的过程中又被稍微 fine-tune 了一点点。人类确实多了一些哺乳动物可能不具备的社会性情绪。但总体上看,这些情绪并不算特别复杂。也正因为它们没那么复杂,所以在今天这个与祖先所处环境截然不同的世界里,它们依然能够很好地为我们服务。
Annotator’s note: It is plausible that humans’ deepest sense of security is bound up with simple, elegant knowledge structures; Buffett and Jobs are both examples. “Make something wonderful, put something back.” Thinking that way starts with improving one’s own small environment, and then the larger environment one lives in.
Actually, they also make mistakes. For example, our emotions… Well actually, I don’t know. Does hunger count as an emotion? It’s debatable. But I think, for example, our intuitive feeling of hunger is not succeeding in guiding us correctly in this world with an abundance of food.
当然,它们也会犯错。比如我们的情绪——嗯,严格说来,我也不确定饥饿算不算一种情绪,这点是可以争论的。但举个例子:在一个食物极大丰富的世界里,我们对“饥饿感”的直觉显然并没有成功地把我们往正确方向引导。
00:18:49 – What are we scaling?
00:18:49 – 我们到底在放大什么?
Dwarkesh Patel 00:18:49
People have been talking about scaling data, scaling parameters, scaling compute. Is there a more general way to think about scaling? What are the other scaling axes?
大家一直在谈 scaling data、scaling parameters、scaling compute。有没有一种更一般的方式来理解“scaling”?还有哪些其他可以扩展的维度?
Ilya Sutskever 00:19:00
Here’s a perspective that I think might be true. The way ML used to work is that people would just tinker with stuff and try to get interesting results. That’s what’s been going on in the past.
我有一个也许是对的看法。过去的 ML 基本是这样做的:大家东改改、西试试,看看能不能折腾出一些有趣的结果。以前大概就是这么个状态。
Then the scaling insight arrived. Scaling laws, GPT-3, and suddenly everyone realized we should scale. This is an example of how language affects thought. “Scaling” is just one word, but it’s such a powerful word because it informs people what to do. They say, “Let’s try to scale things.” So you say, what are we scaling? Pre-training was the thing to scale. It was a particular scaling recipe.
后来,“scaling” 这个洞见出现了——Scaling laws、GPT-3 出来之后,大家一下子意识到:我们应该去 scale。这其实是“语言如何影响思维”的一个例子。“scaling” 只是一个词,但它非常有力量,因为它在暗示大家应该做什么——“我们来把一切都 scale 一下吧。”那你会问:我们究竟在 scale 什么?答案是:pre-training 是那个被拿来 scale 的对象,它是一种特定的 scaling recipe。

Annotator’s note: A whole-nation mobilization approach.
The big breakthrough of pre-training is the realization that this recipe is good. You say, “Hey, if you mix some compute with some data into a neural net of a certain size, you will get results. You will know that you’ll be better if you just scale the recipe up.” This is also great. Companies love this because it gives you a very low-risk way of investing your resources.
pre-training 的重大突破,在于大家认识到:这套 recipe 本身是“对”的。你可以说:“把一定量的算力和一定量的数据,喂进一个特定规模的神经网络里,你就会得到不错的结果;而如果你把这套 recipe scale 上去,结果就会更好。”这点太好了,公司们也很喜欢,因为这为你提供了一种风险很低的资源投入方式。
It’s much harder to invest your resources in research. Compare that. If you research, you need to be like, “Go forth researchers and research and come up with something”, versus get more data, get more compute. You know you’ll get something from pre-training.
相比之下,把资源投入“研究”要难多了。你得说:“各位研究员请出发,去研究、去想办法搞出点新东西来。”而另一边,你只需要说:去拿更多数据,去买更多算力,你就知道至少从 pre-training 里一定能榨出点什么结果来。
Indeed, based on various things some people say on Twitter, it appears that Gemini may have found a way to get more out of pre-training. At some point though, pre-training will run out of data. The data is very clearly finite. What do you do next? Either you do some kind of souped-up pre-training, a different recipe from the one you’ve done before, or you’re doing RL, or maybe something else. But now that compute is big, compute is now very big, in some sense we are back to the age of research.
的确,看起来根据一些人 X/Twitter 上的说法,Gemini 似乎找到了从 pre-training 里“榨出更多东西”的办法。但在某个时间点上,pre-training 会把数据用完的——数据显然是有限的。那接下来怎么办?要么你做某种“加强版的 pre-training”,也就是一套不同于以往的新 recipe;要么你更多依赖 RL;或者做点别的东西。但既然如今算力已经大到这个程度,从某种意义上说,我们又回到了“研究时代”。
Maybe here’s another way to put it. Up until 2020, from 2012 to 2020, it was the age of research. Now, from 2020 to 2025, it was the age of scaling—maybe plus or minus, let’s add error bars to those years—because people say, “This is amazing. You’ve got to scale more. Keep scaling.” The one word: scaling.
也可以换一种说法:从 2012 到 2020,大致是“研究时代”;而从 2020 到 2025,大致是“scaling 时代”(前后年份可以加点误差范围),因为大家都在说:“这太神奇了,你必须继续 scale,再多 scale 一点。”就一个词:scaling。
But now the scale is so big. Is the belief really, “Oh, it’s so big, but if you had 100x more, everything would be so different?” It would be different, for sure. But is the belief that if you just 100x the scale, everything would be transformed? I don’t think that’s true. So it’s back to the age of research again, just with big computers.
但现在 scale 已经大到这个程度了。我们真的还在相信这样一件事吗:“是的,现在已经很大了,可如果你再多 100 倍,一切就会彻底不一样”?它当然会有所不同,但是否真能靠“再放大 100 倍 scale”就把一切彻底改写?我不这么认为。所以我们又回到了“研究时代”,只不过这次是带着非常巨大的计算机回去的。
Annotator’s note: The room for further improvement is limited. Which business models can work really well under limited intelligence? Autonomous driving looks like it could succeed, and be among the first batch of applications to break through; driving a car does not require a teenage genius.
Dwarkesh Patel 00:22:06
That’s a very interesting way to put it. But let me ask you the question you just posed then. What are we scaling, and what would it mean to have a recipe? I guess I’m not aware of a very clean relationship, almost like a law of physics, of the kind that existed in pre-training. There was a power law between data or compute or parameters and loss. What is the kind of relationship we should be seeking, and how should we think about what this new recipe might look like?
这个说法非常有意思。但那我就直接接着问你刚才自己提出的问题:我们到底在 scale 什么?所谓一套新的 recipe 究竟意味着什么?在 pre-training 里,我们曾经有一种几乎像“物理定律”那样干净的关系——数据量、算力、参数规模和 loss 之间存在 power law。那在接下来的阶段,我们应该去寻找什么样的关系?又应该如何想象这套“新 recipe”可能会长成什么样子?
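For reference, the “law of physics”-like relationship referred to here is the empirical power-law fit reported in published pre-training scaling-law work (for example Kaplan et al. 2020, and the Chinchilla paper, Hoffmann et al. 2022). One commonly used form is sketched below; the constants are fit to experiments rather than derived, and the exact form varies by paper.

```latex
% Illustrative Chinchilla-style scaling law: pre-training loss as a function of
% parameter count N and training tokens D, with E, A, B, \alpha, \beta fit
% empirically for a given architecture and dataset.
L(N, D) \;\approx\; E \;+\; \frac{A}{N^{\alpha}} \;+\; \frac{B}{D^{\beta}}
```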
Ilya Sutskever 00:22:38
We’ve already witnessed a transition from one type of scaling to a different type of scaling, from pre-training to RL. Now people are scaling RL. Now based on what people say on Twitter, they spend more compute on RL than on pre-training at this point, because RL can actually consume quite a bit of compute. You do very long rollouts, so it takes a lot of compute to produce those rollouts. Then you get a relatively small amount of learning per rollout, so you really can spend a lot of compute.
我们已经见证了一次从一种 scaling 到另一种 scaling 的转变:从 pre-training 转向 RL。现在大家在 scale 的是 RL。而且根据 Twitter 上的一些说法,目前在 RL 上花掉的算力已经比 pre-training 更多了,因为 RL 实际上可以吃掉非常多的算力。你会做非常长的 rollouts,所以光是生成这些 rollouts 就需要大量算力,但每个 rollout 能带来的学习信号又相对有限,所以你确实可以在上面“烧掉”非常多的算力。
I wouldn’t even call it scaling. I would say, “Hey, what are you doing? Is the thing you are doing the most productive thing you could be doing? Can you find a more productive way of using your compute?” We’ve discussed the value function business earlier. Maybe once people get good at value functions, they will be using their resources more productively. If you find a whole other way of training models, you could say, “Is this scaling or is it just using your resources?” I think it becomes a little bit ambiguous.
我甚至都不太想把这叫作 scaling。我更想问的是:“你现在在干的事情,真的是你能做的最有生产力的事吗?你能不能找到一种更高效利用算力的方式?”我们刚才已经讨论过 value function 这一块。也许当大家真的把 value function 玩明白之后,就能以更高的生产率使用资源。如果你找到了一整套全新的模型训练方式,你当然可以说“这也是一种 scaling”,但也可以说“这只是另一种资源使用方式”,在这里两者的界限就变得有点模糊了。
In the sense that, when people were in the age of research back then, it was, “Let’s try this and this and this. Let’s try that and that and that. Oh, look, something interesting is happening.” I think there will be a return to that.
从这个意义上说,当年大家还处在“研究时代”的时候,基本的工作模式就是:“我们来试试这个、这个、再这个;我们再试试那个、那个、再那个。哦,你看,好像有点有意思的东西出来了。”我觉得我们会回到那种状态。
Dwarkesh Patel 00:24:10
If we’re back in the era of research, stepping back, what is the part of the recipe that we need to think most about? When you say value function, people are already trying the current recipe, but then having LLM-as-a-Judge and so forth. You could say that’s a value function, but it sounds like you have something much more fundamental in mind. Should we even rethink pre-training at all and not just add more steps to the end of that process?
如果我们真的回到了“研究时代”,退一步看,在这套 recipe 里,我们最需要重新思考的部分究竟是什么?当你提到 value function 的时候,其实大家已经在用现有的 recipe 里加入一些东西了,比如 LLM-as-a-Judge 之类,你也可以说那就是一种 value function。但听起来,你在想的东西要比这“根本性”得多。我们是不是应该连 pre-training 本身都重新思考,而不是只是往现有流程的末尾再加几步?
Ilya Sutskever 00:24:35
The discussion about value function, I think it was interesting. I want to emphasize that I think the value function is something that’s going to make RL more efficient, and I think that makes a difference. But I think anything you can do with a value function, you can do without, just more slowly. The thing which I think is the most fundamental is that these models somehow just generalize dramatically worse than people. It’s super obvious. That seems like a very fundamental thing.
关于 value function 的讨论,我觉得是挺有意思的。我想强调的是,我认为 value function 会让 RL 更高效,这一点是会带来实际差别的。但我同样认为:凡是你能靠 value function 做到的事,不用 value function 也能做到,只是会慢很多。我觉得最根本的问题是,这些模型在某种意义上,泛化能力就是远远差于人类,这一点极其明显,而这看上去才是真正“最底层的那个问题”。
Annotator’s note: Being much slower is a problem, but lacking generalization is a bigger one. What is more troubling is that lacking generalization = lacking a simple, elegant knowledge structure = being ruled by fear = bad. It looks like he has not gotten to that point yet.
00:25:13 – Why humans generalize better than models
00:25:13 – 为什么人类的泛化能力比模型强
Dwarkesh Patel 00:25:13
So this is the crux: generalization. There are two sub-questions. There’s one which is about sample efficiency: why should it take so much more data for these models to learn than humans? There’s a second question. Even separate from the amount of data it takes, why is it so hard to teach the thing we want to a model than to a human? For a human, we don’t necessarily need a verifiable reward to be able to… You’re probably mentoring a bunch of researchers right now, and you’re talking with them, you’re showing them your code, and you’re showing them how you think. From that, they’re picking up your way of thinking and how they should do research.
所以关键问题就是:泛化。这里有两个子问题。第一个跟样本效率有关:为什么这些模型要比人类多得多的数据才能学会东西?第二个问题是,即便撇开所需数据量不谈,为什么把我们想要的能力教给模型,要比教给人难这么多?对人来说,我们并不一定需要一个“可验证的奖励信号”才能……你现在大概就在带一批研究员,你跟他们交流,给他们看你的代码,展示你的思考方式,而他们从中就慢慢学会了你的思路,以及他们自己应该如何做研究。
You don’t have to set a verifiable reward for them that’s like, “Okay, this is the next part of the curriculum, and now this is the next part of your curriculum. Oh, this training was unstable.” There’s not this schleppy, bespoke process. Perhaps these two issues are actually related in some way, but I’d be curious to explore this second thing, which is more like continual learning, and this first thing, which feels just like sample efficiency.
你不需要给他们设定一个可以量化验证的奖励,说:“好,现在这是你课程的下一部分,然后这是再下一部分。哦,这次训练不稳定。”整件事情并不是一个又繁琐又定制化的流程。这两个问题也许某种程度上是相关的,但我更好奇去分别探讨:第二个更像“持续学习(continual learning)”,而第一个更像纯粹的“样本效率(sample efficiency)”。
Ilya Sutskever 00:26:19
You could actually wonder about one possible explanation for the human sample efficiency that needs to be considered: evolution. Evolution has given us a small amount of the most useful information possible. For things like vision, hearing, and locomotion, I think there’s a pretty strong case that evolution has given us a lot.
你其实可以这样想:对于人类样本效率高这一点,有一个必须纳入考虑的可能解释是“进化”。进化在先天层面给了我们一小部分“极其有用的信息”。就像视觉、听觉、运动能力这些方面,我觉得有非常充分的理由相信,进化在这些维度上给了我们很多东西。
For example, human dexterity far exceeds… I mean robots can become dexterous too if you subject them to a huge amount of training in simulation. But to train a robot in the real world to quickly pick up a new skill like a person does seems very out of reach. Here you could say, “Oh yeah, locomotion. All our ancestors needed great locomotion, squirrels. So with locomotion, maybe we’ve got some unbelievable prior.”
比如,人类的灵巧程度远远超过……我的意思是,机器人也可以在模拟环境中通过海量训练变得很灵巧。但要在真实世界里训练一个机器人,像人一样迅速掌握一项新技能,目前看上去还非常遥远。在这里,你就可以说:“对,运动能力——我们的所有祖先都需要很强的运动能力,像松鼠那样的敏捷,所以在运动这件事上,我们大概有某种非常强的、难以置信的先验。”
You could make the same case for vision. I believe Yann LeCun made the point that children learn to drive after 10 hours of practice, which is true. But our vision is so good. At least for me, I remember myself being a five-year-old. I was very excited about cars back then. I’m pretty sure my car recognition was more than adequate for driving already as a five-year-old. You don’t get to see that much data as a five-year-old. You spend most of your time in your parents’ house, so you have very low data diversity.
在视觉上你也可以做出类似论证。我记得 Yann LeCun 提过这样一点:小孩大概练 10 小时就能学会开车,这确实是事实。但那是因为我们视觉系统太强了。至少对我来说,我记得自己五岁时就对汽车非常着迷,而在那时候,我识别汽车的能力已经足够应付开车了。可你想想,五岁的小孩并没有“看过”多少数据——大部分时间都在父母家里活动,数据多样性其实很低。
But you could say maybe that’s evolution too. But in language and math and coding, probably not.
不过你同样可以说,那可能也是进化的结果。但在语言、数学和编程这些领域,情况就大概不是这样了。
Annotator’s note: This is the same thing as what we call “precise execution”; this part has very good stability and consistency, and it is one of the more fully evolved capacities.
Dwarkesh Patel 00:28:00
It still seems better than models. Obviously, models are better than the average human at language, math, and coding. But are they better than the average human at learning?
即便如此,人类在这些方面看起来还是比模型强。显然,在语言、数学和编程这些具体任务上,模型已经强过普通人了。但它们在“学习能力”这件事本身上,真的比普通人更强吗?
Ilya Sutskever 00:28:09
Oh yeah. Oh yeah, absolutely. What I meant to say is that language, math, and coding—and especially math and coding—suggests that whatever it is that makes people good at learning is probably not so much a complicated prior, but something more, some fundamental thing.
当然,当然,绝对是的。我的意思是:语言、数学和编程——尤其是数学和编程——这些事实说明,让人类善于学习的那个“关键东西”,可能并不主要是某种极其复杂的先验结构,而是别的什么,更加根本的东西。
Dwarkesh Patel 00:28:29
I’m not sure I understood. Why should that be the case?
我不太确定自己听懂了。为什么会是这样呢?
Ilya Sutskever 00:28:32
So consider a skill in which people exhibit some kind of great reliability. If the skill is one that was very useful to our ancestors for many millions of years, hundreds of millions of years, you could argue that maybe humans are good at it because of evolution, because we have a prior, an evolutionary prior that’s encoded in some very non-obvious way that somehow makes us so good at it.
那你可以想想这样一种能力:在人类身上,它表现出非常强的一致性和可靠性。如果这种能力,是对我们的祖先在数百万年、乃至上亿年的时间尺度上一直很有用的,那你就可以合理地认为,人类之所以在这方面表现这么好,是因为进化给我们植入了某种“先验”——一种被编码在极其隐蔽形式中的进化先验,让我们在这类任务上天生就占有优势。
But if people exhibit great ability, reliability, robustness, and ability to learn in a domain that really did not exist until recently, then this is more an indication that people might have just better machine learning, period.
但如果在人类历史上“直到最近才出现”的某个全新领域里,人们同样表现出很强的能力、可靠性、鲁棒性以及学习能力,那就更像是在说明:人类可能本身就拥有一种更强的“机器学习能力”,仅此而已。
Annotator’s note: The consistency of hand-brain coordination is what we ourselves define as “precise execution”. It is not the same as consistency and reliability under generalization, though both show how resilient the whole nervous system is. Once the generalization side goes wrong, however, it is very hard to repair, which is quite unlike the “precise execution” of hand-brain coordination.
Dwarkesh Patel 00:29:29
How should we think about what that is? What is the ML analogy? There are a couple of interesting things about it. It takes fewer samples. It’s more unsupervised. A child learning to drive a car… Children are not learning to drive a car. A teenager learning how to drive a car is not exactly getting some prebuilt, verifiable reward. It comes from their interaction with the machine and with the environment. It takes much fewer samples. It seems more unsupervised. It seems more robust?
我们应该怎样去理解“那个东西”究竟是什么?在机器学习里有什么类比?它有几个很有意思的特征:需要的样本更少、更接近无监督。一个孩子学会开车……好吧,小孩子其实还不会学开车,但一个青少年学开车的时候,并没有什么预先设计好的、可验证的奖励信号在那儿等着他。他获得反馈是通过自己跟车这台机器以及周围环境的互动来完成的。整个过程需要的样本少得多,看上去更像是无监督的,而且似乎更鲁棒?
Ilya Sutskever 00:30:07
Much more robust. The robustness of people is really staggering.
是的,鲁棒性要强太多了。人类在这方面的鲁棒性,说实话是非常惊人的。
Dwarkesh Patel 00:30:12
Do you have a unified way of thinking about why all these things are happening at once? What is the ML analogy that could realize something like this?
你有没有一种比较统一的方式来解释,为什么这些现象会同时出现?在 ML 里有什么样的类比,能把这种能力实现出来?
Ilya Sutskever 00:30:24
One of the things that you’ve been asking about is how can the teenage driver self-correct and learn from their experience without an external teacher? The answer is that they have their value function. They have a general sense which is also, by the way, extremely robust in people. Whatever the human value function is, with a few exceptions around addiction, it’s actually very, very robust.
你刚才反复在问的一个点,是:为什么一个刚学车的青少年,在没有外部“老师”持续给信号的情况下,仍然能够自我纠错、从经验中学习?这里的答案是:他们有自己的 value function。他们对自己行为好坏有一个总体性的直觉判断,而这种直觉顺带一提,在人类身上也是非常鲁棒的。不管人类的这个 value function 的本质到底是什么,除了在成瘾这类少数例外上会被劫持之外,它总体上其实极其稳健。
So for something like a teenager that’s learning to drive, they start to drive, and they already have a sense of how they’re driving immediately, how badly they are, how unconfident. And then they see, “Okay.” And then, of course, the learning speed of any teenager is so fast. After 10 hours, you’re good to go.
所以对一个正在学车的青少年来说,他一开始上路,就立刻能感觉到自己开得怎么样:哪里很糟、哪里没把握。然后他会慢慢看到自己的改进,“哦,好像好一点了”。再加上,任何一个青少年在这个阶段的学习速度都非常快,练个 10 小时基本就能上路了。
Dwarkesh Patel 00:31:17
It seems like humans have some solution, but I’m curious about how they are doing it and why is it so hard? How do we need to reconceptualize the way we’re training models to make something like this possible?
看起来人类在这方面显然有一套“解法”,但我很好奇他们到底是怎么做到的,以及为什么我们这么难在模型上复制这一点?为了让模型也具备这种能力,我们在训练范式上需要做怎样的“重新构思”?
Ilya Sutskever 00:31:27
That is a great question to ask, and it’s a question I have a lot of opinions about. But unfortunately, we live in a world where not all machine learning ideas are discussed freely, and this is one of them. There’s probably a way to do it. I think it can be done. The fact that people are like that, I think it’s a proof that it can be done.
这是一个非常值得问的问题,而且关于这一点我其实有很多自己的看法。但很遗憾,我们现在所处的这个世界,并不是所有的机器学习想法都可以公开讨论,而这就是其中一个不能随便说的方向。我个人几乎可以肯定,这里面是有一条路可走的,我也相信这是可以做成的——人类本身就能做到这一点,我认为这本身就是“这件事在原理上可行”的证明。
There may be another blocker though, which is that there is a possibility that the human neurons do more compute than we think. If that is true, and if that plays an important role, then things might be more difficult. But regardless, I do think it points to the existence of some machine learning principle that I have opinions on. But unfortunately, circumstances make it hard to discuss in detail.
不过,中间可能还有一个“额外的拦路虎”,那就是:有一种可能性是,人类神经元实际完成的算力比我们现在以为的要多得多。如果这个假设是真的,而且这部分算力在其中发挥了重要作用,那事情会变得更难一些。但不管怎样,我仍然认为,这一切都指向某种尚未完全成形、但确实存在的机器学习原理,而这方面我也有不少自己的想法。只是,很可惜,现实环境让我们现在很难把这些细节摊开来讲。
Dwarkesh Patel 00:32:28
Nobody listens to this podcast, Ilya.
没人听这个播客的啦,Ilya。
00:35:45 – Straight-shotting superintelligence
00:35:45 – 直线加速迈向超级智能
Dwarkesh Patel 00:35:45
I’m curious. If you say we are back in an era of research, you were there from 2012 to 2020. What is the vibe now going to be if we go back to the era of research?
我很好奇。如果像你说的那样,我们正回到一个“研究时代”,而你从 2012 年到 2020 年就一直在那个时代身处其中,那如果现在再次回到“研究时代”,整体氛围会是什么样?
For example, even after AlexNet, the amount of compute that was used to run experiments kept increasing, and the size of frontier systems kept increasing. Do you think now that this era of research will still require tremendous amounts of compute? Do you think it will require going back into the archives and reading old papers?
比如说,即便在 AlexNet 之后,用来跑实验的算力还在持续上升,前沿系统的规模也在不断变大。你觉得这一次的“研究时代”,仍然需要巨量的算力吗?会不会需要大家重新回到文献堆里,把老论文翻出来看?
You were at Google and OpenAI and Stanford, these places, when there was more of a vibe of research? What kind of things should we be expecting in the community?
当年在 Google、OpenAI、Stanford 这些地方时,整体氛围更偏“研究”。那从你的经验看,我们现在应该期待整个社区会出现些什么?
Ilya Sutskever 00:36:38
One consequence of the age of scaling is that scaling sucked out all the air in the room. Because scaling sucked out all the air in the room, everyone started to do the same thing. We got to the point where we are in a world where there are more companies than ideas by quite a bit. Actually on that, there is this Silicon Valley saying that ideas are cheap and execution is everything. People say that a lot, and there is truth to that. But then I saw someone say on Twitter something like, “If ideas are so cheap, how come no one’s having any ideas?” And I think it’s true too.
scaling 时代的一个后果是:scaling 把屋子里的空气都抽干了。因为“scaling 把空气抽干”这件事,导致所有人都开始做同一件事。我们已经来到了这样一个世界:公司数量远远多于真正的新想法。顺便说一句,硅谷有句老话:ideas are cheap, execution is everything——点子很廉价,执行才是一切。大家经常这么讲,这里面确实有道理。但我在 Twitter 上又看到有人说:“如果想法这么廉价,那为什么现在几乎没人再有真正的新想法?”我觉得这个说法也挺对。
If you think about research progress in terms of bottlenecks, there are several bottlenecks. One of them is ideas, and one of them is your ability to bring them to life, which might be compute but also engineering. If you go back to the ‘90s, let’s say, you had people who had pretty good ideas, and if they had much larger computers, maybe they could demonstrate that their ideas were viable. But they could not, so they could only have a very, very small demonstration that did not convince anyone. So the bottleneck was compute.
如果你从“瓶颈”的角度来看待科研进展,其实是有好几个不同的瓶颈的。其中一个是 idea 本身,另一个则是你把 idea 变成现实的能力,这可能包括算力,也包括工程能力。如果你回到上世纪 90 年代,会发现当时很多人其实已经有相当不错的想法,如果他们当时拥有大得多的计算机,也许早就能展示出这些想法是可行的。但他们做不到,所以只能跑一些非常非常小的 demo,根本不足以说服别人。因此,在那个阶段,真正的瓶颈是算力。
Then in the age of scaling, compute has increased a lot. Of course, there is a question of how much compute is needed, but compute is large. Compute is large enough such that it’s not obvious that you need that much more compute to prove some idea. I’ll give you an analogy. AlexNet was built on two GPUs. That was the total amount of compute used for it. The transformer was built on 8 to 64 GPUs. No single transformer paper experiment used more than 64 GPUs of 2017, which would be like, what, two GPUs of today? The ResNet, right? You could argue that the o1 reasoning was not the most compute-heavy thing in the world.
然后到了 scaling 时代,算力大幅增长。当然,“到底需要多少算力”仍然是个问题,但至少我们可以说:算力已经足够大,大到你很难再理直气壮地说,“为了证明某个想法,我们还必须把算力再放大几个数量级”。我举几个对比:AlexNet 是在两块 GPU 上跑出来的,那就是它全部的算力预算;Transformer 则是在 8 到 64 块 GPU 上训练的,没有任何一组 2017 年的 Transformer 论文实验用到超过 64 块 GPU——按今天的标准算,大概也就相当于两块现代引擎级别的 GPU?再加上 ResNet 之类。你也可以说,像 o1 这类 reasoning 工作,从绝对值上看,并不是世界上算力消耗最大的东西。
So for research, you definitely need some amount of compute, but it’s far from obvious that you need the absolutely largest amount of compute ever for research. You might argue, and I think it is true, that if you want to build the absolutely best system then it helps to have much more compute. Especially if everyone is within the same paradigm, then compute becomes one of the big differentiators.
所以,对研究来说,你当然需要一定规模的算力,但要说“做研究必须拿到全世界最大那一档算力”,这就远谈不上是显然的事实了。你可以争辩——而且我也同意——如果你想打造“当前最强、最顶配”的系统,那多一些算力肯定有帮助。尤其是在大家都待在同一个范式里的时候,算力自然就会成为一个非常重要的差异化因素。
Dwarkesh Patel 00:39:41
I’m asking you for the history, because you were actually there. I’m not sure what actually happened. It sounds like it was possible to develop these ideas using minimal amounts of compute. But the transformer didn’t immediately become famous. It became the thing everybody started doing and then started experimenting on top of and building on top of because it was validated at higher and higher levels of compute.
我之所以问你“历史细节”,是因为你当时真的在现场。我自己其实并不完全清楚当时究竟发生了什么。听你这么说,好像在当年,是有可能用相对很小的算力去提出、验证这些关键性想法的。但一开始 Transformer 并没有马上“横空出世、天下皆知”,它之所以最后变成大家都在用、都在其上做实验、做二次构建的东西,是因为后来在越来越高的算力规模上,它一次次被验证、被巩固,才慢慢定格为“范式中心”的。
Ilya Sutskever 00:40:06
Correct.
没错。
Dwarkesh Patel 00:40:07
And if you at SSI have 50 different ideas, how will you know which one is the next transformer and which one is brittle, without having the kinds of compute that other frontier labs have?
那如果在 SSI 你们手上有 50 个不同的想法,在没有其它前沿实验室那种量级算力的情况下,你们怎么分辨,哪一个会是下一个 Transformer,哪一个只是比较脆弱的点子?
Ilya Sutskever 00:40:22
I can comment on that. The short comment is that you mentioned SSI. Specifically for us, the amount of compute that SSI has for research is really not that small. I want to explain why. Simple math can explain why the amount of compute that we have for research is more comparable than one might think. I’ll explain.
这个我可以讲几句。简单说,你刚才提到 SSI。对我们来说,用于研究的算力其实一点也不小。我想解释一下为什么。用很简单的算术就能说明,我们拿来做研究的算力,其实比很多人想象的要更可比。我来解释一下。
SSI has raised $3 billion, which is a lot by any absolute sense. But you could say, “Look at the other companies raising much more.” But a lot of their compute goes for inference. These big numbers, these big loans, it’s earmarked for inference. That’s number one. Number two, if you want to have a product on which you do inference, you need to have a big staff of engineers, salespeople. A lot of the research needs to be dedicated to producing all kinds of product-related features. So then when you look at what’s actually left for research, the difference becomes a lot smaller.
SSI 融了 30 亿美元,从任何绝对数的角度看这都是一大笔钱。你当然可以说:“看看别的公司,融得多得多。”但要注意,他们很大一部分算力是用在推理上的。这些看起来很大的数字、很大的贷款,其实很多都是指定要拿去做推理服务的。这是第一点。第二点是,如果你要有一个可以跑推理的产品,你就必须养一大批工程师、销售人员,还有大量研究要被用在各种跟产品功能相关的开发上。这样一来,当你真正去看“纯研究”所剩下的算力时,双方之间的差距就小了很多。
The other thing is, if you are doing something different, do you really need the absolute maximal scale to prove it? I don’t think that’s true at all. I think that in our case, we have sufficient compute to prove, to convince ourselves and anyone else, that what we are doing is correct.
另外一点是:如果你在做的事情本身是“不同的东西”,那你真的需要用到业界最大的 scale 才能证明它吗?我一点也不这么认为。就我们来说,我们有足够的算力去验证、去说服自己,也说服其他人:我们在做的这条路,是走得通的。
Dwarkesh Patel 00:42:02
There have been public estimates that companies like OpenAI spend on the order of $5-6 billion a year so far, just on experiments. This is separate from the amount of money they’re spending on inference and so forth. So it seems like they’re spending more a year running research experiments than you guys have in total funding.
外界有一些公开的估算,像 OpenAI 这样的公司,目前每年光是做实验的花费就有 50–60 亿美元,这还不包括他们在推理等其它方面的支出。这样看起来,他们一年砸在研究实验上的钱,就已经比你们总共拿到的融资还多了。
Ilya Sutskever 00:42:22
I think it’s a question of what you do with it. It’s a question of what you do with it. In their case, in the case of others, there is a lot more demand on the training compute. There’s a lot more different work streams, there are different modalities, there is just more stuff. So it becomes fragmented.
我觉得关键在于:你拿这些算力去做什么,用法是什么。对他们来说,对很多类似的公司来说,training 这块的算力需求多得多,工作流也更多,模态更多,事情本身也更杂。结果就是,算力被分散到一大堆不同的方向上。
Annotator’s note: Competition leads to fragmentation of the overall strategy; this is exactly the position OpenAI is in right now.
Dwarkesh Patel 00:42:44
How will SSI make money?
SSI 未来怎么赚钱?
Ilya Sutskever 00:42:46
My answer to this question is something like this. Right now, we just focus on the research, and then the answer to that question will reveal itself. I think there will be lots of possible answers.
我对这个问题的回答大概是这样:眼下我们只专注在研究上,等研究走到一定程度,关于“怎么赚钱”这个问题,答案自然会浮现出来。我相信到时候会有很多种可能的路径。
Dwarkesh Patel 00:43:01
Is SSI’s plan still to straight shot superintelligence?
SSI 现在的计划还是要“straight shot superintelligence”(直接冲向超级智能)吗?
Ilya Sutskever 00:43:04
Maybe. I think that there is merit to it. I think there’s a lot of merit because it’s very nice to not be affected by the day-to-day market competition. But I think there are two reasons that may cause us to change the plan. One is pragmatic, if timelines turned out to be long, which they might. Second, I think there is a lot of value in the best and most powerful AI being out there impacting the world. I think this is a meaningfully valuable thing.
也许会。我觉得这种做法有它的道理,而且优点不少,尤其是你可以不被日常的市场竞争牵着走,这点非常好。但我觉得有两个理由可能会让我们改变这个计划。第一个是务实层面,如果时间线被证明其实很长——这是有可能的。第二个是,我认为让最强、最有能力的 AI 真正走向世界、产生影响,本身就是一件非常有价值的事情。
Dwarkesh Patel 00:43:48
So then why is your default plan to straight shot superintelligence? Because it sounds like OpenAI, Anthropic, all these other companies, their explicit thinking is, “Look, we have weaker and weaker intelligences that the public can get used to and prepare for.” Why is it potentially better to build a superintelligence directly?
那为什么你们的默认计划还是“straight shot superintelligence”呢?因为听上去 OpenAI、Anthropic 等这些公司,公开的思路更像是:“我们先放出一个又一个相对更弱的系统,让公众逐步适应、逐步做好准备。”那为什么在你看来,直接造出一个超级智能,可能反而是更好的路线?
Ilya Sutskever 00:44:08
I’ll make the case for and against. The case for is that one of the challenges that people face when they’re in the market is that they have to participate in the rat race. The rat race is quite difficult in that it exposes you to difficult trade-offs which you need to make. It is nice to say, “We’ll insulate ourselves from all this and just focus on the research and come out only when we are ready, and not before.” But the counterpoint is valid too, and those are opposing forces. The counterpoint is, “Hey, it is useful for the world to see powerful AI. It is useful for the world to see powerful AI because that’s the only way you can communicate it.”
我可以把赞成和反对的理由都摊开讲一讲。先说赞成的一面:身处市场竞争中的公司,面临的一个挑战就是不得不卷入“rat race”(老鼠赛跑式的竞争)。这种 rat race 很难熬,因为它会迫使你不断面对各种艰难的权衡。在这种背景下,说一句“我们把自己从这一切中隔离出来,只专注于研究,等我们真正准备好了再出手,而不是提前”是很有吸引力的。但反面观点同样是有道理的,这两点其实是在拉扯的两股力量。反面观点是:“让世界亲眼见到强大的 AI 是有用的。让世界见到真正强大的 AI 很重要,因为那几乎是你唯一能真正‘传达’它的方式。”
Dwarkesh Patel 00:44:57
Well, I guess not even just that you can communicate the idea—
我猜不仅仅是你可以把“这个想法”传达出去——
Ilya Sutskever 00:45:00
Communicate the AI, not the idea. Communicate the AI.
准确地说,是把 AI 本身“传达”出去,而不是把想法传达出去——是 communicate the AI。
Dwarkesh Patel 00:45:04
What do you mean, “communicate the AI”?
你说的“communicate the AI”是什么意思?
Ilya Sutskever 00:45:06
Let’s suppose you write an essay about AI, and the essay says, “AI is going to be this, and AI is going to be that, and it’s going to be this.” You read it and you say, “Okay, this is an interesting essay.” Now suppose you see an AI doing this, an AI doing that. It is incomparable. Basically I think that there is a big benefit from AI being in the public, and that would be a reason for us to not be quite straight shot.
想象一下,你写了一篇关于 AI 的长文,里面说:“AI 将来会变成这样,会变成那样,会做到这,会做到那。”读者看完可能会说:“好吧,这篇文章挺有意思的。”但如果换一种情况,你亲眼看到一个 AI 在做这些事情、在实际完成那些能力,那两者是完全不可同日而语的。从根本上讲,我觉得让 AI 真正出现在公众视野中,有非常大的好处,而这就可能成为我们不那么“straight shot”的一个理由。
Dwarkesh Patel 00:45:37
I guess it’s not even that, but I do think that is an important part of it. The other big thing is that I can’t think of another discipline in human engineering and research where the end artifact was made safer mostly through just thinking about how to make it safe, as opposed to the way it actually happens. Why are airplane crashes per mile so much lower today than they were decades ago? Why is it so much harder to find a bug in Linux than it would have been decades ago? I think it’s mostly because these systems were deployed to the world. You noticed failures, those failures were corrected and the systems became more robust.
我觉得不光是这一点,尽管这一点本身就很重要。另一个更大的问题是,我想不出有哪一个人类工程或科研领域,是主要靠“在纸面上思考如何让最终产物更安全”来实现安全性的。相反,你看为什么今天的民航,每英里飞行对应的坠机率比几十年前低这么多?为什么今天在 Linux 里发现一个 bug 比几十年前要难得多?我认为关键原因在于,这些系统被真正部署到现实世界中,人们在使用中发现故障、修复故障,系统在这个过程中变得愈发稳健。
I’m not sure why AGI and superhuman intelligence would be any different, especially given—and I hope we’re going to get to this—it seems like the harms of superintelligence are not just about having some malevolent paper clipper out there. But this is a really powerful thing and we don’t even know how to conceptualize how people interact with it, what people will do with it. Having gradual access to it seems like a better way to maybe spread out the impact of it and to help people prepare for it.
我不确定为什么在 AGI 和 superhuman intelligence 这里,情况会完全不同。尤其是考虑到——我也希望我们接下来会聊到这一点——超级智能带来的风险并不只是“出现一个邪恶的 paper clipper”这么简单。问题在于,这将是一种极其强大的东西,而我们现在甚至都还不知道该如何给“人类如何与它互动、人类会拿它做什么”建立一个像样的概念框架。让全社会以一种渐进的方式接触它,似乎会更有利于把它的冲击摊得更开一些,也更有助于让人们为它的到来做好准备。
00:46:47 – SSI’s model will learn from deployment
00:46:47 – SSI 的模型将从部署中学习
Ilya Sutskever 00:46:47
Well I think on this point, even in the straight shot scenario, you would still do a gradual release of it, that’s how I would imagine it. Gradualism would be an inherent component of any plan. It’s just a question of what is the first thing that you get out of the door. That’s number one.
我觉得在这一点上,即便是在 straight shot 的方案下,你依然会采用一个渐进式的发布过程——至少在我的设想中是这样。渐进本身会是任何计划的内在组成部分,只是问题在于:你最先推出门的那个东西到底是什么。这是第一点。
Number two, I believe you have advocated for continual learning more than other people, and I actually think that this is an important and correct thing. Here is why. I’ll give you another example of how language affects thinking. In this case, it will be two words that have shaped everyone’s thinking, I maintain. First word: AGI. Second word: pre-training. Let me explain.
第二点,我觉得你比大多数人更强调持续学习,而我也认为这是一个既重要又正确的方向。为什么这么说?我再举一个“语言如何影响思维”的例子。在这里,我认为有两个词深刻地塑造了所有人的思维。第一个词:AGI。第二个词:pre-training。让我来解释一下。
Annotator’s note: He comes back to this point again; language and attention are tightly related. What Ilya Sutskever means here is that the new words humans themselves coined have led them astray.
The term AGI, why does this term exist? It’s a very particular term. Why does it exist? There’s a reason. The reason that the term AGI exists is, in my opinion, not so much because it’s a very important, essential descriptor of some end state of intelligence, but because it is a reaction to a different term that existed, and the term is narrow AI. If you go back to the ancient history of game-playing AI, of checkers AI, chess AI, computer games AI, everyone would say, look at this narrow intelligence. Sure, the chess AI can beat Kasparov, but it can’t do anything else. It is so narrow, artificial narrow intelligence. So in response, as a reaction to this, some people said, this is not good. It is so narrow. What we need is general AI, an AI that can just do all the things. That term just got a lot of traction.
先说 AGI 这个词,为什么会存在这样一个词?这是个非常特殊的术语,它为什么会出现?这是有原因的。在我看来,AGI 这个词的存在,与其说是因为它是某种“智能最终形态”的关键描述,不如说更多是对另一个早已存在的词的“反应”,那个词就是 narrow AI。你如果回到早期游戏和 AI 的历史——比如 checkers AI、chess AI、各类电脑游戏 AI——大家当时都会说:你看,这是一个很窄的智能。没错,这个 chess AI 能打败 Kasparov,但它什么别的也干不了,它是极端狭窄的 artificial narrow intelligence。于是,作为对这种状况的反应,有些人就说:这不行,太窄了,我们需要的是 general AI,是一种可以把所有事情都做掉的 AI。然后,“AGI”这个说法就开始得到广泛响应。
The second thing that got a lot of traction is pre-training, specifically the recipe of pre-training. I think the way people do RL now is maybe undoing the conceptual imprint of pre-training. But pre-training had this property. You do more pre-training and the model gets better at everything, more or less uniformly. General AI. Pre-training gives AGI.
第二个得到巨大关注的是 pre-training,确切地说,是那套 pre-training 的 recipe。我认为,现在大家在做 RL 的方式,某种程度上是在“冲淡”当年 pre-training 留下的观念烙印。但 pre-training 的确有一个显著特征:你做更多的 pre-training,模型在几乎所有任务上的表现都会比较均匀地变好。general AI,pre-training 似乎“通往 AGI”。
But the thing that happened with AGI and pre-training is that in some sense they overshot the target. If you think about the term “AGI”, especially in the context of pre-training, you will realize that a human being is not an AGI. Yes, there is definitely a foundation of skills, but a human being lacks a huge amount of knowledge. Instead, we rely on continual learning.
但 AGI 和 pre-training 一起带来的一个结果,是在某种意义上“瞄得太高了”。如果你在 pre-training 的语境下去想“AGI”这个词,你会发现:一个人类其实并不是 AGI。没错,人类确实有一套基础技能,但在人类身上,绝大多数具体知识是缺失的,我们真正依赖的是 continual learning。
So when you think about, “Okay, so let’s suppose that we achieve success and we produce some kind of safe superintelligence.” The question is, how do you define it? Where on the curve of continual learning is it going to be?
所以,当你在想:“好,假设我们成功了,造出了一种 safe superintelligence。”接下来真正的问题是:你怎么定义它?在持续学习的整条曲线上,它到底处于哪个位置?
I produce a superintelligent 15-year-old that’s very eager to go. They don’t know very much at all, a great student, very eager. You go and be a programmer, you go and be a doctor, go and learn. So you could imagine that the deployment itself will involve some kind of a learning trial-and-error period. It’s a process, as opposed to you dropping the finished thing.
比如说,我“造出”一个超级聪明的 15 岁少年,他非常有干劲,但实际上懂得并不多,是一个很好的学生,非常渴望学习。你对他说:你去当程序员,你去当医生,去学吧。于是你可以想象,这样的系统在部署到现实世界的过程中,本身就会伴随一个带有学习和试错的阶段。这是一个过程,而不是你把一个完工的成品直接丢到世界上。
跟原来的定义有重要区别,这在AI很可能已经是一种共识,田渊栋还在研究,Ilya已经干上了,但大部分投资者还不知道。
Dwarkesh Patel 00:50:45
I see. You’re suggesting that the thing you’re pointing out with superintelligence is not some finished mind which knows how to do every single job in the economy. Because the way, say, the original OpenAI charter or whatever defines AGI is like, it can do every single job, every single thing a human can do. You’re proposing instead a mind which can learn to do every single job, and that is superintelligence.
我明白了。你的意思是,你在谈论的 superintelligence,并不是那种“已经完工的心智”,一出生就会做经济体系中的所有工作。因为按照最初的 OpenAI charter 一类的定义,AGI 几乎被描述成:它能做每一种工作,能做人类能做的每一件事。而你现在提出的,更像是一种“可以学会做所有工作”的心智,而正是这种能够学会一切的能力,才是你所谓的 superintelligence。
Ilya Sutskever 00:51:15
Yes.
是的。
Dwarkesh Patel 00:51:16
But once you have the learning algorithm, it gets deployed into the world the same way a human laborer might join an organization.
但一旦你有了这种学习算法,它被部署到现实世界中的方式,就很像一个人类劳动者加入一家组织。
Ilya Sutskever 00:51:25
Exactly.
没错,正是这样。
Dwarkesh Patel 00:51:26
It seems like one of these two things might happen, maybe neither of these happens. One, this super-efficient learning algorithm becomes superhuman, becomes as good as you and potentially even better, at the task of ML research. As a result the algorithm itself becomes more and more superhuman.
看起来接下来可能会发生两种情况之一,也可能两种都不发生。第一种是,这个高效得惊人的学习算法,在 ML research 这个任务上会变得超越人类,至少和你一样好,甚至比你更好。结果就是,这个算法本身会变得越来越“超人化”。
The other is, even if that doesn’t happen, if you have a single model—this is explicitly your vision—where instances of a model which are deployed through the economy doing different jobs, learning how to do those jobs, continually learning on the job, picking up all the skills that any human could pick up, but picking them all up at the same time, and then amalgamating their learnings, you basically have a model which functionally becomes superintelligent even without any sort of recursive self-improvement in software. Because you now have one model that can do every single job in the economy and humans can’t merge our minds in the same way. So do you expect some sort of intelligence explosion from broad deployment?
第二种是,即便前一种没有发生,如果你有的是同一个 model——这是你明确提出的那个愿景——它的无数实例被部署到整个经济体系中,去做不同的工作,在岗位上持续学习如何把这些工作做好,把任何人类能学会的技能都学会,而且是**同时**学会,然后再把这些学习成果汇总起来,那么即使没有任何“在软件层面上的递归自我提升”,你实际上也已经得到了一个在功能上接近超级智能的模型。因为你现在拥有的是:一个可以胜任经济中所有岗位的单一模型,而人类做不到把我们的心智以这种方式合并。那在你看来,这种广泛部署会不会引发某种“智能爆炸”?
Ilya Sutskever 00:52:30
I think that it is likely that we will have rapid economic growth. I think with broad deployment, there are two arguments you could make which are conflicting. One is that once indeed you get to a point where you have an AI that can learn to do things quickly and you have many of them, then there will be a strong force to deploy them in the economy unless there will be some kind of a regulation that stops it, which by the way there might be.
我觉得我们**很有可能**会经历一个高速经济增长阶段。在广泛部署的情形下,你可以提出两种相互冲突的判断。其一是:一旦你真的拥有了一种能够快速学会做各种事情的 AI,而且数量还很多,那除非有某种监管专门出来按下暂停键(顺带一提,这种监管也不是不可能),否则会有非常强的力量把它们推向整个经济体系。
But the idea of very rapid economic growth for some time, I think it’s very possible from broad deployment. The question is how rapid it’s going to be. I think this is hard to know because on the one hand you have this very efficient worker. On the other hand, the world is just really big and there’s a lot of stuff, and that stuff moves at a different speed. But then on the other hand, now the AI could… So I think very rapid economic growth is possible. We will see all kinds of things like different countries with different rules and the ones which have the friendlier rules, the economic growth will be faster. Hard to predict.
但“在一段时间内出现非常快速的经济增长”这件事,我认为在广泛部署的前提下是相当有可能的。真正的问题是:到底会有多快?这一点很难判断,因为一方面你有一个极其高效的“劳动者”;另一方面,这个世界本身非常庞大,充满各种各样的东西,而这些东西各自运转的速度并不相同。可与此同时,AI 又能够……所以总体上,我倾向于认为非常快速的经济增长是可能的。我们大概率会看到各种情形:不同国家采用不同的规则,那些规则更友好的国家,经济增长会更快。但具体会如何,确实很难预测。
1亿个巴菲特再加1亿个乔布斯同时出现,经济快速增长,消费者获得更大的好处,但资本不见得有什么好处。除经济增长以外,人和AI的相处方式可能会出现戏剧性的场景:相当于阿甘进到清华大学的姚班读书,周围都是高智商的同学,同吃同住,阿甘肯定不想跟这些人讨论学习上的问题。
00:55:07 – Alignment
00:55:07 – 对齐(Alignment)
Dwarkesh Patel 00:55:07
It seems to me that this is a very precarious situation to be in. In the limit, we know that this should be possible. If you have something that is as good as a human at learning, but which can merge its brains—merge different instances in a way that humans can’t merge—already, this seems like a thing that should physically be possible. Humans are possible, digital computers are possible. You just need both of those combined to produce this thing.
在我看来,这其实是一个非常危险、非常不稳的状态。从极限上讲,我们知道这样的东西在原理上应该是可行的:如果你有一个在“学习能力”上不输人类、但又能“合并大脑”的系统——也就是可以把不同实例的脑子合并在一起,而人类做不到这一点——那么光从物理上看,这种东西就应该是可能存在的。人类是可能存在的,数字计算机也是可能存在的,只要把这两者结合起来,就能造出这种东西。
It also seems this kind of thing is extremely powerful. Economic growth is one way to put it. A Dyson sphere is a lot of economic growth. But another way to put it is that you will have, in potentially a very short period of time... You hire people at SSI, and in six months, they’re net productive, probably. A human learns really fast, and this thing is becoming smarter and smarter very fast. How do you think about making that go well? Why is SSI positioned to do that well? What is SSI’s plan there, is basically what I’m trying to ask.
而且这种东西看起来会极其强大。你可以用“经济增长”来描述它——比如 Dyson sphere 这种级别就是极端的经济增长。但你也可以换一种说法:在可能非常短的时间里……你在 SSI 招人,大概六个月之后,这些人就已经产生净产出了。人类学东西本来就很快,而这个系统本身还在以极快的速度变得越来越聪明。你是怎么考虑“让这一切变成一个好结局”的?为什么 SSI 适合、或者说有能力把这件事做好?本质上,我是在问:SSI 在这方面的计划是什么?
Ilya Sutskever 00:56:10
One of the ways in which my thinking has been changing is that I now place more importance on AI being deployed incrementally and in advance. One very difficult thing about AI is that we are talking about systems that don’t yet exist and it’s hard to imagine them.
我这段时间在思考上的一个变化,是我现在更加重视“让 AI 提前、而且以渐进的方式部署出去”。AI 的一个很大的难点在于,我们讨论的是那些“尚不存在的系统”,而这类系统很难被真正想象出来。
I think that one of the things that’s happening is that in practice, it’s very hard to feel the AGI. It’s very hard to feel the AGI. We can talk about it, but imagine having a conversation about what it is like to be old when you’re old and frail. You can have a conversation, you can try to imagine it, but it’s just hard, and you come back to reality where that’s not the case. I think that a lot of the issues around AGI and its future power stem from the fact that it’s very difficult to imagine. Future AI is going to be different. It’s going to be powerful. Indeed, the whole problem, what is the problem of AI and AGI? The whole problem is the power. The whole problem is the power.
我觉得现在正在发生的一件事,是在现实生活中,人们其实很难 **真正“感受到”** AGI。AGI 是很难被“感觉到”的。我们可以谈它,但这有点像,你现在年轻健康,却在谈“老年虚弱时是种什么感觉”。你可以聊,可以试着去想象,但它就是很难真正被感同身受,你最终还是会回到当下那个并非如此的现实状态。我认为,围绕 AGI 以及其未来力量的很多问题,都源于:这种东西太难想象。未来的 AI 会非常不一样,会非常强大。严格说来,AI / AGI 的根本问题是什么?整个问题就是 **power**,就是“力量”。问题的核心就是力量本身。
When the power is really big, what’s going to happen? One of the ways in which I’ve changed my mind over the past year—and that change of mind, I’ll hedge a little bit, may back-propagate into the plans of our company—is that if it’s hard to imagine, what do you do? You’ve got to be showing the thing. You’ve got to be showing the thing. I maintain that most people who work on AI also can’t imagine it because it’s too different from what people see on a day-to-day basis.
当这种力量真的变得巨大时,会发生什么?在过去一年里,我改变看法的一个地方——而这种改变,我稍微保留一点说,很可能会反向影响我们公司的计划——是:如果一件事很难被想象,那你该怎么办?你就得 **把这个东西拿出来给人看**。你必须把它展示出来。我认为,甚至绝大多数做 AI 的人自己,其实也很难真正想象未来的样子,因为它和人们日常所见的东西差距太大了。
I do maintain, here’s something which I predict will happen. This is a prediction. I maintain that as AI becomes more powerful, people will change their behaviors. We will see all kinds of unprecedented things which are not happening right now. I’ll give some examples. I think for better or worse, the frontier companies will play a very important role in what happens, as will the government. The kind of things that I think you’ll see, which you see the beginnings of, are companies that are fierce competitors starting to collaborate on AI safety. You may have seen OpenAI and Anthropic doing a first small step, but that did not exist. That’s something which I predicted in one of my talks about three years ago, that such a thing will happen. I also maintain that as AI continues to become more powerful, more visibly powerful, there will also be a desire from governments and the public to do something. I think this is a very important force, of showing the AI.
不过有一点,我至今坚持认为是会发生的,我把它当成一个明确的预测:随着 AI 变得越来越强,人们的行为也会随之改变,我们会看到各种前所未有、现在还没有发生的事情。举几个例子:不管是好是坏,我认为前沿公司会在整个进程中扮演非常重要的角色,政府也是。你会看到、而现在已经有迹象的一件事,是那些原本激烈竞争的公司开始在 AI safety 上合作。你可能已经看到 OpenAI 和 Anthropic 做过一个很小的第一步,以前这是没有的。这件事是我大约三年前在一次演讲中就预测过的:会出现这种“竞争公司在安全上合作”的情况。我同样认为,随着 AI 继续变得更强、而且这种强大变得越来越“可见”,政府和公众会越来越有“必须做点什么”的冲动。我觉得这是一股非常关键的力量——通过“把 AI 展示出来”所触发的那股力量。
That’s number one. Number two, okay, so the AI is being built. What needs to be done? One thing that I maintain that will happen is that right now, people who are working on AI, I maintain that the AI doesn’t feel powerful because of its mistakes. I do think that at some point the AI will start to feel powerful actually. I think when that happens, we will see a big change in the way all AI companies approach safety. They’ll become much more paranoid. I say this as a prediction that we will see happen. We’ll see if I’m right. But I think this is something that will happen because they will see the AI becoming more powerful. Everything that’s happening right now, I maintain, is because people look at today’s AI and it’s hard to imagine the future AI.
这是第一点。第二点是,好,AI 在被建造出来,那我们还需要做什么?我坚持认为,将来一定会出现这样的情况:现在在做 AI 的人,之所以觉得 AI “没那么可怕”,是因为它现在还会犯很多错误,让人觉得它不那么强大。但我相信,会有一个时间点,AI 真的会开始“让人感到强大”。一旦到了那一步,我们会看到所有 AI 公司在安全策略上的态度发生巨大变化——它们会变得偏执得多(paranoid 得多)。我把这当作一个预测,看未来会不会应验。但我认为这件事会发生,因为他们会亲眼看到 AI 变得更强。就像我刚才说的,我认为现在发生的一切,很大程度上是因为大家看到的是“今天的 AI”,从而很难真正想象“明天的 AI”。
There is a third thing which needs to happen. I’m talking about it in broader terms, not just from the perspective of SSI because you asked me about our company. The question is, what should the companies aspire to build? What should they aspire to build? There has been one big idea that everyone has been locked into, which is the self-improving AI. Why did it happen? Because there are fewer ideas than companies. But I maintain that there is something that’s better to build, and I think that everyone will want that.
还有第三件必须发生的事。我接下来讲的会更偏“大框架”,不只是从 SSI 的角度,因为你刚才问的是我们公司。问题是:这些公司真正应该立志要造的东西是什么?他们应该追求建造什么?过去有一个“巨大想法”几乎锁住了所有人,那就是 self-improving AI(自我改进的 AI)。为什么会这样?在我看来,是因为“想法的数量少于公司的数量”。但我一直认为,有一种东西比“自我改进 AI”更值得去造,而且我觉得最终所有人都会想要它。
It’s the AI that’s robustly aligned to care about sentient life specifically. I think in particular, there’s a case to be made that it will be easier to build an AI that cares about sentient life than an AI that cares about human life alone, because the AI itself will be sentient. And if you think about things like mirror neurons and human empathy for animals, which you might argue is not big enough, but it exists. I think it’s an emergent property from the fact that we model others with the same circuit that we use to model ourselves, because that’s the most efficient thing to do.
我要说的是:**真正值得去造的,是一种在稳健意义上(robustly)被对齐到“关心有感知生命(sentient life)”的 AI。**特别是,我认为有一个很强的论据:从技术上讲,“造一个关心 sentient life 的 AI”,可能比“只关心 human life 的 AI”更容易做到,因为这个 AI 自身也会是有感知的。如果你去想一想 mirror neurons(镜像神经元)这些现象,以及人类对动物的共情能力:你可以争论它还不够强,但它是真实存在的。我认为,这是一种从底层机制中自然涌现出来的性质——我们用同一套神经回路来建模自己和建模他者,因为那是最有效率的做法。
Dwarkesh Patel 01:02:06
So even if you got an AI to care about sentient beings—and it’s not actually clear to me that that’s what you should try to do if you solved alignment—it would still be the case that most sentient beings will be AIs. There will be trillions, eventually quadrillions, of AIs. Humans will be a very small fraction of sentient beings. So it’s not clear to me if the goal is some kind of human control over this future civilization, that this is the best criterion.
所以即便你真的让一个 AI 去关心所有有感知的生命——而且在我看来,就算你把 alignment 解出来,这是不是“正确目标”本身都还未必清楚——现实仍然会是:绝大多数的有感知生命都会是 AI。它们的数量会达到万亿级,最终甚至可能是千万亿级,而人类只会是在所有有感知生命中占比极小的一部分。于是如果我们的目标是要让人类在这种未来文明中保有某种控制权,我就不确定“关心所有有感知生命”是不是一个最合适的准则。
Ilya Sutskever 01:02:37
It’s true. It’s possible it’s not the best criterion. I’ll say two things. Number one, care for sentient life, I think there is merit to it. It should be considered. I think it would be helpful if there was some kind of short list of ideas that the companies, when they are in this situation, could use. That’s number two.
你说得对,它确实有可能不是那个最好的准则。我先讲两点。第一,“关心有感知生命”这个目标,我觉得本身是有价值的,值得放进讨论范围里。第二,我认为如果能有一份“简短的、可选的思路清单”,在公司真正走到那一步时可以拿来参考,会是很有帮助的。
Number three, I think it would be really materially helpful if the power of the most powerful superintelligence was somehow capped because it would address a lot of these concerns. The question of how to do it, I’m not sure, but I think that would be materially helpful when you’re talking about really, really powerful systems.
第三,我认为如果能在某种程度上“给最强的那层超级智能设一个上限”,在实质上会非常有帮助,因为这能缓解很多你刚才说的担忧。至于具体怎么做,我现在也说不清,但一旦我们真的面对的是极其强大的系统,这样的“封顶机制”在实质上会非常重要。
Dwarkesh Patel 01:03:35
Before we continue the alignment discussion, I want to double-click on that. How much room is there at the top? How do you think about superintelligence? Do you think, using this learning efficiency idea, maybe it is just extremely fast at learning new skills or new knowledge? Does it just have a bigger pool of strategies? Is there a single cohesive “it” in the center that’s more powerful or bigger? If so, do you imagine that this will be sort of godlike in comparison to the rest of human civilization, or does it just feel like another agent, or another cluster of agents?
在继续聊 alignment 之前,我想在这个问题上再深挖一下。你觉得“顶端”到底还有多大的空间?你自己是怎么理解 superintelligence 的?按照你之前讲的“学习效率”的框架,它是不是主要体现在:对新技能、新知识的学习速度极快?还是说,它只是拥有一个大得多的策略库?抑或是,在中间有一个更强大、更庞大的统一“自我(it)”?如果是那样,在你想象中,它相对于人类文明的其余部分,会更像一种“类似神”的存在,还是更像一个额外的 agent,或者一簇 agent?
Ilya Sutskever 01:04:10
This is an area where different people have different intuitions. I think it will be very powerful, for sure. What I think is most likely to happen is that there will be multiple such AIs being created roughly at the same time. I think that if the cluster is big enough—like if the cluster is literally continent-sized—that thing could be really powerful, indeed. If you literally have a continent-sized cluster, those AIs can be very powerful. All I can tell you is that if you’re talking about extremely powerful AIs, truly dramatically powerful, it would be nice if they could be restrained in some ways or if there were some kind of agreement or something.
在这一点上,不同人直觉会差别很大。我自己的看法是:它肯定会非常强大。而我觉得最有可能发生的情形,是会有多个这样的 AI 在大致相近的时间被造出来。如果一个集群足够大——比如说它真的达到了“洲级规模”的集群——那样的东西就会强大到惊人。如果你真的有一个“洲级数据中心规模”的集群,这些 AI 的能力会非常可怕。我能肯定的一点是:只要我们在谈的是这种极端强大的 AI,真正“戏剧性强大”的那种,那最好能在某种程度上对它们加以约束,或者至少有某种协议、某种共同约定存在。
What is the concern of superintelligence? What is one way to explain the concern? If you imagine a system that is sufficiently powerful, really sufficiently powerful—and you could say you need to do something sensible like care for sentient life in a very single-minded way—we might not like the results. That’s really what it is.
那 superintelligence 的核心担忧到底是什么?可以用一种方式来描述:如果你想象有一个系统,它强大到某个临界点,真的足够强大——哪怕你给它设定了一个听上去很“合理”的目标,比如“单一地、专一地关心所有有感知的生命”——我们最终未必会喜欢这个系统带来的结果。担忧的本质大概就是这个。
Maybe, by the way, the answer is that you do not build an RL agent in the usual sense. I’ll point several things out. I think human beings are semi-RL agents. We pursue a reward, and then the emotions or whatever make us tire out of the reward and we pursue a different reward. The market is a very short-sighted kind of agent. Evolution is the same. Evolution is very intelligent in some ways, but very dumb in other ways. The government has been designed to be a never-ending fight between three parts, which has an effect. So I think things like this.
顺便说一句,也许答案之一是:你根本就不去构建那种“传统意义上的 RL agent”。我举几个例子说明我的意思。在我看来,人类其实是某种“半 RL agent”:我们追求某种 reward,但情绪等等这些东西又会让我们对当前这个 reward 产生厌倦,转而去追求另一个 reward。市场也是一种 agent,但它是非常短视的 agent。进化也是类似的,在某些方面极其聪明,在另一些方面却异常愚蠢。再比如政府,它被设计成三权之间永无止境的博弈,这种结构本身就在起作用。所以我会往这些方向去想。
Another thing that makes this discussion difficult is that we are talking about systems that don’t exist, that we don’t know how to build. That’s the other thing and that’s actually my belief. I think what people are doing right now will go some distance and then peter out. It will continue to improve, but it will also not be “it”. The “It” we don’t know how to build, and a lot hinges on understanding reliable generalization.
让这类讨论变得格外困难的另一点在于:我们讨论的是那些目前并不存在的系统,而且我们其实还不知道要怎么把它们造出来。这也是我自己的信念之一:我认为大家现在在做的这些事情,确实还能再往前推一段路,但最终会渐渐“力不从心”,会继续改进,但那仍然不会是“那个 It”。真正的“那一个东西(the It)”,我们现在完全还不知道该怎样去构建,而这当中有很大一部分取决于:我们能不能真正搞懂“可靠的泛化(reliable generalization)”。
I’ll say another thing. One of the things that you could say about what causes alignment to be difficult is that your ability to learn human values is fragile. Then your ability to optimize them is fragile. You actually learn to optimize them. And can’t you say, “Are these not all instances of unreliable generalization?” Why is it that human beings appear to generalize so much better? What if generalization was much better? What would happen in this case? What would be the effect? But those questions are right now still unanswerable.
我再补充一点。你可以这样理解:alignment 之所以难,很大程度上是因为“学习人类价值观”的能力本身是脆弱的,而你“优化这些价值”的能力同样是脆弱的——你本质上是在学着如何去优化它们。你完全可以问一句:“这些不都算是各种形式的‘不可靠泛化’的表现吗?”为什么在人类身上,泛化看上去要好得多?如果模型的泛化能力真的大幅提升,会发生什么?这种情况下会有什么效果?但这些问题,目前都还是无法被真正回答的。
如果泛化的底层是安全感,那么想获得更好的智能,几乎必须有更强的“内在安全背书”,否则大脑每一步都在防御,就没空做结构化、简化、抽象。换句话说,坏人可以有支离破碎的小聪明,但不可能掌握简洁优雅的知识结构;坏人很难在“人生整体策略 / 道德与现实一体化”上做到简洁优雅,他们的世界观几乎注定是 patchwork。
Dwarkesh Patel 01:07:21
How does one think about what AI going well looks like? You’ve scoped out how AI might evolve. We’ll have these sort of continual learning agents. AI will be very powerful. Maybe there will be many different AIs. How do you think about lots of continent-sized compute intelligences going around? How dangerous is that? How do we make that less dangerous? And how do we do that in a way that protects an equilibrium where there might be misaligned AIs out there and bad actors out there?
我们应该如何理解“AI 发展得很好”到底长什么样?你刚才已经勾勒了 AI 可能的演化路径:我们会拥有这种持续学习的智能体,AI 会变得非常强大,而且可能会有很多个不同的 AI。同一时间到处游走着大量“洲级算力规模”的智能,这种场景在你看来会有多危险?我们又该如何降低这种危险?同时,在可能存在失控 AI 和恶意行为者的前提下,怎样做才能保护某种长期可维持的“均衡状态”?
Ilya Sutskever 01:07:58
Here’s one reason why I liked “AI that cares for sentient life”. We can debate on whether it’s good or bad. But if the first N of these dramatic systems do care for, love, humanity or something, care for sentient life, obviously this also needs to be achieved. This needs to be achieved. So if this is achieved by the first N of those systems, then I can see it go well, at least for quite some time.
这也是我为什么会喜欢“关心有感知生命的 AI”这个设想的一个原因。我们可以争论这是不是最好的目标。但如果最早出现的前 N 个这类“戏剧性强大”的系统,确实是关心、爱护人类,或者更一般地,关心所有有感知生命——而这一点显然需要被实现、必须被实现——那么只要前 N 个系统做到这一点,我就能想象出一个“至少在相当长一段时间内都还不错”的发展轨迹。
Then there is the question of what happens in the long run. How do you achieve a long-run equilibrium? I think that there, there is an answer as well. I don’t like this answer, but it needs to be considered.
接下来就是一个“长期会怎样”的问题:我们要怎样才能获得一个长期的均衡?在这里,我觉得也有一种“答案”。我不喜欢这个答案,但它必须被认真讨论。
In the long run, you might say, “Okay, if you have a world where powerful AIs exist, in the short term, you could say you have universal high income. You have universal high income and we’re all doing well.” But what do the Buddhists say? “Change is the only constant.” Things change. There is some kind of government, political structure thing, and it changes because these things have a shelf life. Some new government thing comes up and it functions, and then after some time it stops functioning. That’s something that we see happening all the time.
从长远来看,你可以想象这样一个世界:强大的 AI 已经存在,在短期内,你可以说这带来了类似“全民高收入”的状态——大家收入都很高,生活都不错。但佛教怎么说的?“诸行无常”,唯一不变的是变化本身。事情总会发生变化。某种政府、政治结构会存在一段时间,然后因为“保质期”到了就发生改变;新的政治结构出现,一段时间内运转得很好,然后又逐渐失灵。这种事情我们在现实中一再看到。
So I think for the long-run equilibrium, one approach is that you could say maybe every person will have an AI that will do their bidding, and that’s good. If that could be maintained indefinitely, that’s true. But the downside with that is then the AI goes and earns money for the person and advocates for their needs in the political sphere, and maybe then writes a little report saying, “Okay, here’s what I’ve done, here’s the situation,” and the person says, “Great, keep it up.” But the person is no longer a participant. Then you can say that’s a precarious place to be in.
所以,从长期均衡的角度看,一种思路是:也许每个人都会拥有一个为他“效劳”的 AI,从表面看这很好。如果这种状态可以无限期维持下去,听上去似乎没问题。但它的负面之处在于:AI 会替这个人去赚钱,会在政治领域替他表达诉求,最后也许再写一份小报告说:“我帮你做了这些,这就是目前的情况。”而那个人只需要说:“太好了,继续保持。”但此时,这个人本身已经不再是真正的参与者了。你就可以说,这其实是一个非常危险、非常脆弱的位置。
I’m going to preface by saying I don’t like this solution, but it is a solution. The solution is if people become part-AI with some kind of Neuralink++. Because what will happen as a result is that now the AI understands something, and we understand it too, because now the understanding is transmitted wholesale. So now if the AI is in some situation, you are involved in that situation yourself fully. I think this is the answer to the equilibrium.
我要事先强调,我并不喜欢这个“解”,但它确实是一种解法。这个解法是:如果人类通过某种 Neuralink++ 之类的技术,变成“部分 AI、部分人类”的存在。这样带来的结果是:当 AI 理解了某件事情,你也能同步理解,因为这种理解可以被“整体传输”到你这里。于是,当 AI 身处某个情境中时,你自己也是以一种完全参与的方式置身其中。我认为,这可能是实现长期均衡的一种答案。
可能想多了,这些研究员研究“人的行为”,但还没有开始系统地研究人类当中最优秀的个体,比如巴菲特、乔布斯。
Dwarkesh Patel 01:10:47
I wonder if the fact that emotions which were developed millions—or in many cases, billions—of years ago in a totally different environment are still guiding our actions so strongly is an example of alignment success.
我在想,情绪这种东西,是在几百万年——很多情况下甚至是几十亿年——以前、在完全不同的环境下被进化出来的,而它们如今仍然如此强烈地支配着我们的行为,这会不会本身就是一个“alignment 成功案例”的例子?
To spell out what I mean—I don’t know whether it’s more accurate to call it a value function or reward function—but the brainstem has a directive where it’s saying, “Mate with somebody who’s more successful.” The cortex is the part that understands what success means in the modern context. But the brainstem is able to align the cortex and say, “However you recognize success to be—and I’m not smart enough to understand what that is— you’re still going to pursue this directive.”
具体一点说——我不太确定该把它叫作 value function 还是 reward function——大脑的脑干里有一条很古老的指令,大概就是:“去和更成功的个体交配。”而大脑皮层负责在现代语境里理解“什么叫成功”。但脑干却能够把皮层“对齐”过来,相当于在说:“不管你怎么在当下世界里定义‘成功’——那部分我没那么聪明,自己也搞不清——你最终都还是会朝着这条指令去行动。”
Ilya Sutskever 01:11:36
I think there’s a more general point. I think it’s actually really mysterious how evolution encodes high-level desires. It’s pretty easy to understand how evolution would endow us with the desire for food that smells good because smell is a chemical, so just pursue that chemical. It’s very easy to imagine evolution doing that thing.
我觉得这里有一个更一般性的问题:进化是如何把“高层次的欲望”编码进我们的?这其实非常神秘。很容易理解的是,进化怎么会让我们渴望“闻起来好吃的食物”:气味本质上是化学信号,那就追逐这种化学物质就行了——很容易想象进化怎么把这一套做出来。
But evolution also has endowed us with all these social desires. We really care about being seen positively by society. We care about being in good standing. All these social intuitions that we have, I feel strongly that they’re baked in. I don’t know how evolution did it because it’s a high-level concept that’s represented in the brain.
但进化还赋予了我们各种社会性的欲望。我们会极度在意“在社会眼中被正面看待”,在意自己“社会地位是否稳当”。这些关于社会的直觉,我强烈怀疑都是被“预烘焙”进去的。我完全不知道进化是怎么做到的,因为这些都是以“大脑中的高层概念”这种形式存在的。
Let’s say you care about some social thing, it’s not a low-level signal like smell. It’s not something for which there is a sensor. The brain needs to do a lot of processing to piece together lots of bits of information to understand what’s going on socially. Somehow evolution said, “That’s what you should care about.” How did it do it?
比如说,你在乎某件“社会评价”上的事情,它不是类似气味那种低层次信号,不是有个简单传感器就能测到的。大脑需要做大量的信息加工,拼接许多细节,才能搞清楚“社会上到底发生了什么”。可不知怎么地,进化却“指定”说:“你就该在乎这个。”那它到底是怎么指定的?
It did it quickly, too. All these sophisticated social things that we care about, I think they evolved pretty recently. Evolution had an easy time hard-coding this high-level desire. I’m unaware of a good hypothesis for how it’s done. I had some ideas I was kicking around, but none of them are satisfying.
而且,这件事似乎进展得很快。我们今天在意的这些复杂社会性东西,我认为它们其实是在相对很短的进化时间里才出现的。进化似乎相当容易地就把这种“高层欲望”硬编码了进去。关于它是怎么做到的,我现在并不知道有哪种特别好的假说。我自己想过一些可能性,但没有一个让我真正满意。
Dwarkesh Patel 01:13:26
What’s especially impressive is that if it were a desire that you learned in your lifetime, it would make sense, because your brain is intelligent. It makes sense why you would be able to learn intelligent desires. Maybe this is not your point, but one way to understand it is that the desire is built into the genome, and the genome is not intelligent. But the genome is somehow able to describe this feature. It’s not even clear how you would define that feature, and yet it can be built into the genes.
更让人惊讶的一点是:如果这种欲望是你在一生中学会的,那还说得过去,因为你的大脑本身是智能的,大脑能学会“智能化的欲望”是可以理解的。但也许你的意思并不是这个,另一种理解方式是:这些欲望是写在基因组里的,而基因组本身并不是“智能体”。但它却能以某种方式刻画出这种特征——而且这个特征到底该如何精确定义都还说不清,居然还能把它写进基因里。
Ilya Sutskever 01:13:55
Essentially, or maybe I’ll put it differently. If you think about the tools that are available to the genome, it says, “Okay, here’s a recipe for building a brain.” You could say, “Here is a recipe for connecting the dopamine neurons to the smell sensor.” And if the smell is a certain kind of good smell, you want to eat that.
差不多是这个意思,或者我换一种说法。如果你去想想基因组手上到底有哪些“工具”,它能做的是类似于:“好,这是一个构建大脑的配方。”你可以再具体点说:“这是一个把多巴胺神经元接到嗅觉传感器上的配方。”如果闻到的气味属于某种“好闻的味道”,那你就会想吃它。
I could imagine the genome doing that. I’m claiming that it is harder to imagine. It’s harder to imagine the genome saying you should care about some complicated computation that your entire brain, a big chunk of your brain, does. That’s all I’m claiming. I can tell you a speculation of how it could be done. Let me offer a speculation, and I’ll explain why the speculation is probably false.
我可以想象基因组做到这一点。但我想强调的是:要想象“基因组如何指定你应该关心的是某种由你的整个大脑、或者大脑的一大块区域,才算出来的复杂运算结果”,就难多了——我只是在说这个。我可以给出一个猜测,看它可能是怎么做到的;同时我也会解释为什么这个猜测大概率是不对的。
So the brain has brain regions. We have our cortex. It has all those brain regions. The cortex is uniform, but the brain regions and the neurons in the cortex kind of speak to their neighbors mostly. That explains why you get brain regions. Because if you want to do some kind of speech processing, all the neurons that do speech need to talk to each other. And because neurons can only speak to their nearby neighbors, for the most part, it has to be a region.
大脑是有不同脑区的,我们有 cortex(皮层),里面又分布着各种脑区。皮层在结构上相对是“均匀”的,但不同脑区里的神经元大多只跟自己附近的神经元“说话”。这就是为什么会形成一个个功能区:如果你想做语音处理,那负责语音的神经元就必须彼此持续交流;而既然神经元大体上只能跟周边邻居沟通,那它们就必须聚成一个“区域”。
All the regions are mostly located in the same place from person to person. So maybe evolution hard-coded literally a location on the brain. So it says, “Oh, when the GPS coordinates of the brain such and such, when that fires, that’s what you should care about.” Maybe that’s what evolution did because that would be within the toolkit of evolution.
而且,不同人的这些脑区,大体上都分布在相近的位置。所以或许进化做的事情就是:在大脑上“硬编码一个具体的物理坐标”。等于是说:“当你大脑里某个类似 GPS 坐标的区域被激活时,那就是你应该在乎的东西。”也许进化就是这样干的,因为这看起来还在它“能做到的工具箱”范围之内。
Dwarkesh Patel 01:15:35
Yeah, although there are examples where, for example, people who are born blind have that area of their cortex adopted by another sense. I have no idea, but I’d be surprised if the desires or the reward functions which require a visual signal no longer worked for people who have their different areas of their cortex co-opted.
是的,尽管也有一些例子,比如先天失明的人,他们皮层中原本用于视觉的区域会被其他感官“接管”。我不太清楚细节,但如果那些原本需要视觉信号才能运作的欲望或 reward functions,在这些皮层区域被重新占用之后就完全失效了,那我会很惊讶。
For example, if you no longer have vision, can you still feel the sense of “I want people around me to like me” and so forth, for which there are usually also visual cues?
举个例子,如果你不再有视觉,你是否仍然会有“我希望周围的人喜欢我”这种感觉?而在正常情况下,这种感受其实也依赖很多视觉线索。
Ilya Sutskever 01:16:12
I fully agree with that. I think there’s an even stronger counterargument to this theory. There are people who get half of their brains removed in childhood, and they still have all their brain regions. But they all somehow move to just one hemisphere, which suggests that the brain regions, their location is not fixed and so that theory is not true.
我完全同意你的说法。我觉得还有一个更有力的反例可以反驳刚才那套理论。有些人在童年时被切除了半个大脑,但他们仍然拥有所有的脑区——只是这些脑区都以某种方式迁移到了另一侧半球。这说明脑区的位置其实不是固定的,所以我刚才猜的那套“位置硬编码”理论并不成立。
It would have been cool if it was true, but it’s not. So I think that’s a mystery. But it’s an interesting mystery. The fact is that somehow evolution was able to endow us to care about social stuff very, very reliably. Even people who have all kinds of strange mental conditions and deficiencies and emotional problems tend to care about this also.
如果那套说法成立,其实会很酷,但事实并非如此。所以我认为这仍然是一个谜,不过是个很有意思的谜。事实上,不知出于什么机制,进化极其稳定、极其可靠地让我们去在意“社会相关的东西”。就算是那些有各种奇怪的精神状况、缺陷或者情绪问题的人,通常也同样会在意这些社会性的东西。
01:18:13 – “We are squarely an age of research company”
01:18:13 – “我们就是一家典型的‘研究时代’公司”
Dwarkesh Patel 01:18:13
What is SSI planning on doing differently? Presumably your plan is to be one of the frontier companies when this time arrives. Presumably you started SSI because you’re like, “I think I have a way of approaching how to do this safely in a way that the other companies don’t.” What is that difference?
SSI 打算在哪些方面走一条不一样的路?大概你的计划是:当那个时刻真正到来时,SSI 会成为前沿公司之一。你之所以创办 SSI,大概也是因为你在想:“我觉得自己有一套与众不同的路径,能更安全地把这件事做好,而其他公司没有。”那这种“不同”到底在哪里?
Ilya Sutskever 01:18:36
The way I would describe it is that there are some ideas that I think are promising and I want to investigate them and see if they are indeed promising or not. It’s really that simple. It’s an attempt. If the ideas turn out to be correct—these ideas that we discussed around understanding generalization—then I think we will have something worthy.
我会这样描述:我手上有一些在我看来很有前景的想法,我想要认真去验证它们,看它们是否真的有前景。事情其实就这么简单,这是一次尝试。如果这些想法最终被证明是对的——包括我们刚才在谈的那些和“理解泛化”有关的东西——那我觉得我们手里就会有一套真正值得的东西。
Will they turn out to be correct? We are doing research. We are squarely an “age of research” company. We are making progress. We’ve actually made quite good progress over the past year, but we need to keep making more progress, more research. That’s how I see it. I see it as an attempt to be a voice and a participant.
它们最终会不会被证明是对的?我们现在在做的就是研究。我们就是一家标准意义上的“研究时代”公司。我们确实在不断取得进展,过去一年里其实已经有了相当不错的进步,但我们还需要继续推进,做更多研究。我就是这样看待这件事的——把 SSI 视作一次尝试,尝试成为这个时代中的一个声音、一个真正的参与者。
Dwarkesh Patel 01:19:29
Your cofounder and previous CEO left to go to Meta recently, and people have asked, “Well, if there were a lot of breakthroughs being made, that seems like a thing that should have been unlikely.” I wonder how you respond.
你的联合创始人、前任 CEO 最近离职去了 Meta,很多人会问:“如果 SSI 这边真在不断取得重大突破,这种事看起来不太可能发生吧?”你会怎么回应这种说法?
Ilya Sutskever 01:19:45
For this, I will simply remind a few facts that may have been forgotten. I think these facts which provide the context explain the situation. The context was that we were fundraising at a $32 billion valuation, and then Meta came in and offered to acquire us, and I said no. But my former cofounder in some sense said yes. As a result, he also was able to enjoy a lot of near-term liquidity, and he was the only person from SSI to join Meta.
关于这件事,我只想简单提醒几个可能被人忽略的事实。我认为这些事实提供了必要的背景,也足以解释当时的情况。当时的背景是,我们正在以 320 亿美元的估值进行融资,然后 Meta 进来提出要收购我们,我拒绝了。但在某种意义上,我的前联合创始人选择了“同意”那条路。结果就是,他可以享受到相当可观的短期套现,而在整个过程中,他也是 SSI 里唯一一个去了 Meta 的人。
Dwarkesh Patel 01:20:27
It sounds like SSI’s plan is to be a company that is at the frontier when you get to this very important period in human history where you have superhuman intelligence. You have these ideas about how to make superhuman intelligence go well. But other companies will be trying their own ideas. What distinguishes SSI’s approach to making superintelligence go well?
听上去,SSI 的规划是:在那个人类历史上极其关键、会真正出现超人智能的时期,成为处在最前沿的一批公司之一。你有一整套关于“如何让超人智能走向一个好结果”的想法,但其他公司也会尝试他们自己的路线。那在“如何让 superintelligence 走向好结局”这件事上,SSI 的路径到底有什么不同?
Ilya Sutskever 01:20:49
The main thing that distinguishes SSI is its technical approach. We have a different technical approach that I think is worthy and we are pursuing it.
区分 SSI 的首要一点,是我们的技术路径。我们走的是一条不同的技术路线,而在我看来,这条路线是值得投入的,我们现在正在沿着它往前推进。
I maintain that in the end there will be a convergence of strategies. I think there will be a convergence of strategies where at some point, as AI becomes more powerful, it’s going to become more or less clearer to everyone what the strategy should be. It should be something like, you need to find some way to talk to each other and you want your first actual real superintelligent AI to be aligned and somehow care for sentient life, care for people, democratic, one of those, some combination thereof.
我一直认为,最后各家的策略会在某种意义上“收敛”。当 AI 变得足够强之后,什么样的策略才是正确方向,这一点会渐渐对所有人变得更清晰。大致会是这样一套东西:你需要找到一种方式彼此沟通;你希望自己第一个真正意义上的 superintelligent AI 是被对齐的,并且在某种程度上会关心有感知生命、关心人、尊重某种民主原则,或者这些要素的某种组合。
I think this is the condition that everyone should strive for. That’s what SSI is striving for. I think that this time, if not already, all the other companies will realize that they’re striving towards the same thing. We’ll see. I think that the world will truly change as AI becomes more powerful. I think things will be really different and people will be acting really differently.
我觉得这是所有人都应该去努力满足的前提条件——SSI 现在做的,就是朝这个方向努力。我认为这一次,或者说最晚在不久之后,其他公司也会意识到自己其实在向同一个目标靠拢。我们拭目以待。我相信,随着 AI 继续变强,这个世界会发生真正的变化,很多事情会完全不同,而人们的行为模式也会被彻底改变。
Dwarkesh Patel 01:22:14
Speaking of forecasts, what are your forecasts to this system you’re describing, which can learn as well as a human and subsequently, as a result, become superhuman?
既然说到“预测”,那你对你刚才描述的那种系统怎么看时间表?就是那种“学习能力不逊于人类,并且因此最终会变得超人化”的系统。
Ilya Sutskever 01:22:26
I think like 5 to 20.
我觉得大概是 5 到 20。
Dwarkesh Patel 01:22:28
5 to 20 years?
5 到 20 年?
Ilya Sutskever 01:22:29
Mhm.
嗯。
Dwarkesh Patel 01:22:30
I just want to unroll how you might see the world coming. It’s like, we have a couple more years where these other companies are continuing the current approach and it stalls out. “Stalls out” here meaning they earn no more than low hundreds of billions in revenue? How do you think about what stalling out means?
我想把你心里的那幅“未来时间线”摊开一点聊。大概是:未来还有几年时间,现在这些公司会沿着现有路线继续做下去,然后这条路线会“stall out(遇到瓶颈)”。那“stall out”在你这里具体是什么意思?是指它们的营收最多也就停在一两千亿、两三千亿美元(low hundreds of billions)这个量级,不再有质的飞跃?或者你怎么给“stall out”这个状态下定义?
Ilya Sutskever 01:22:49
I think stalling out will look like…it will all look very similar among all the different companies. It could be something like this. I’m not sure because I think even with stalling out, I think these companies could make a stupendous revenue. Maybe not profits because they will need to work hard to differentiate themselves from each other, but revenue definitely.
我觉得所谓“熄火卡住”(stalling out)会呈现出一种状态:所有公司的情况看上去都差不多。大致可能是这样一种样子——当然我也不敢说得太死——即便在“增长停滞”的阶段,这些公司依然可能创造出惊人的营收。只不过未必会有同等规模的利润,因为它们需要非常用力地彼此差异化竞争,但从营收角度看,肯定还是会非常高。
Dwarkesh Patel 01:23:20
But something in your model implies that when the correct solution does emerge, there will be convergence between all the companies. I’m curious why you think that’s the case.
不过在你的叙述里,似乎隐含着这样一点:一旦真正“正确的解决方案”出现,各家公司之间在某种程度上会走向收敛。我很好奇你为什么这么认为。
Ilya Sutskever 01:23:32
I was talking more about convergence on their alignment strategies. I think eventual convergence on the technical approach is probably going to happen as well, but I was alluding to convergence to the alignment strategies. What exactly is the thing that should be done?
我刚才说的收敛,其实更多是指在 **alignment 策略** 上的收敛。技术路径上,最后大概率也会出现某种收敛,但当时我主要暗示的是:大家在“对齐策略”上会慢慢走到一块——也就是,到底应该做什么,什么样的做法才算“正确的那一套”。
Dwarkesh Patel 01:23:46
I just want to better understand how you see the future unrolling. Currently, we have these different companies, and you expect their approach to continue generating revenue but not get to this human-like learner. So now we have these different forks of companies. We have you, we have Thinking Machines, there’s a bunch of other labs. Maybe one of them figures out the correct approach. But then the release of their product makes it clear to other people how to do this thing.
我想再多理解一点你心里那条未来展开的路径。现在我们有这些不同的公司,而在你的设想中,它们沿着现有路线会继续创造营收,但到不了“类人学习者”那一步。与此同时,出现了一些“分叉”:比如你们 SSI,比如 Thinking Machines,还有一堆其他实验室。也许其中某一家找到正确路线,但一旦他们把产品发布出来,其他人就大致能看明白“这事可以这样做”。
Ilya Sutskever 01:24:09
I think it won’t be clear how to do it, but it will be clear that something different is possible, and that is information. People will then be trying to figure out how that works. I do think though that one of the things not addressed here, not discussed, is that with each increase in the AI’s capabilities, I think there will be some kind of changes, but I don’t know exactly which ones, in how things are being done. I think it’s going to be important, yet I can’t spell out what that is exactly.
我觉得未必会“清楚到”大家都知道具体该怎么做,但至少会很清楚一件事:**有一种“完全不同的东西”是可能存在的**,而这一点本身就是信息。接下来人人都会想办法搞清楚它是怎么运作的。不过,还有一个我们刚才没怎么展开、但我觉得很关键的点:每当 AI 能力又往前迈进一步,整个世界“做事的方式”都会发生一些变化——至于具体会变在哪儿、怎么变,我现在也说不清,但我相信那里会有很重要的结构性变化,只是目前还无法把它精确地描述出来。
Dwarkesh Patel 01:24:49
By default, you would expect the company that has that model to be getting all these gains because they have the model that has the skills and knowledge that it’s building up in the world. What is the reason to think that the benefits of that would be widely distributed and not just end up at whatever model company gets this continuous learning loop going first?
按常理推断,最先搞出这种模型的那家公司,应该会把绝大部分收益都攥在自己手里,因为它拥有那个正在全世界范围内不断积累技能与知识的模型。那你有什么理由认为,这种收益会被比较广泛地分散,而不是几乎全部集中到那个最先跑通“持续学习闭环”的模型公司头上?
Ilya Sutskever 01:25:13
Here is what I think is going to happen. Number one, let’s look at how things have gone so far with the AIs of the past. One company produced an advance and the other company scrambled and produced some similar things after some amount of time and they started to compete in the market and push the prices down. So I think from the market perspective, something similar will happen there as well.
我觉得会发生的事情大致是这样的。第一,我们先看看过去这几轮 AI 的演化是怎么走的:通常是某家公司率先做出一个重大突破,随后其他公司迅速跟进,在一段时间后搞出类似的东西,然后大家一起在市场上竞争,把价格往下打。我认为,从市场机制的角度看,这一轮也会出现类似的过程。
We are talking about the good world, by the way. What’s the good world? It’s where we have these powerful human-like learners that are also… By the way, maybe there’s another thing we haven’t discussed on the spec of the superintelligent AI that I think is worth considering. It’s that you make it narrow, it can be useful and narrow at the same time. You can have lots of narrow superintelligent AIs.
顺带一提,我们现在讨论的是“比较好的那条世界线”。什么叫“比较好的世界”?就是我们拥有这种强大的、类人学习者的 AI,同时还……再顺着说一个我们没怎么展开、但在设计 superintelligent AI 规格时很值得考虑的点:**你可以把它做窄一点,让它既超级智能,又在职能上保持窄域实用**。你完全可以有一堆“窄领域的超级智能 AI”。
But suppose you have many of them and you have some company that’s producing a lot of profits from it. Then you have another company that comes in and starts to compete. The way the competition is going to work is through specialization. Competition loves specialization. You see it in the market, you see it in evolution as well. You’re going to have lots of different niches and you’re going to have lots of different companies who are occupying different niches. In this world we might say one AI company is really quite a bit better at some area of really complicated economic activity and a different company is better at another area. And the third company is really good at litigation.
在这种设定下,假设有很多这样的系统,其中一家公司靠此赚取了巨额利润,然后另一家公司也杀进来开始竞争。竞争会怎么展开?答案是:**通过专业化来展开**。竞争最“偏爱”的就是分工与专业化——你在市场里能看到这一点,在进化里也能看到。最终你会出现大量细分“生态位”,对应一批占据不同生态位的公司。在那样的世界里,你可能会看到这样的格局:某一家 AI 公司在某种非常复杂的经济活动领域上明显更强;另一家公司则在另一个领域更胜一筹;还有第三家公司,也许在诉讼与法律业务上特别厉害。
Dwarkesh Patel 01:27:18
Isn’t this contradicted by what human-like learning implies? It’s that it can learn…
这难道不是和“类人学习能力”的含义相矛盾吗?因为它可以去学习……
Ilya Sutskever 01:27:21
It can, but you have accumulated learning. You have a big investment. You spent a lot of compute to become really, really good, really phenomenal at this thing. Someone else spent a huge amount of compute and a huge amount of experience to get really good at some other thing. You apply a lot of human learning to get there, but now you are at this high point where someone else would say, “Look, I don’t want to start learning what you’ve learned.”
它当然可以学,但这里还有“沉淀的学习成果”这件事。你已经在某个方向上投入了巨大的资源,你花了非常多的算力,才在这件事情上变得极其擅长、好到惊人。与此同时,另一个人(或系统)则在另一个方向上投入了同样巨量的算力和经验,在那件事上变得非常强。你们都用上了大量“类人学习”的能力才爬到各自的高点,而当你已经站在这个高度时,其他人就会说:“算了,我可不想从头开始学你已经学完的这些东西。”
Dwarkesh Patel 01:27:48
I guess that would require many different companies to begin at the human-like continual learning agent at the same time so that they can start their different tree search in different branches. But if one company gets that agent first, or gets that learner first, it does then seem like… Well, if you just think about every single job in the economy, having an instance learning each one seems tractable for a company.
那这大概就要求很多不同的公司在差不多同一时间,都拿到这种“类人持续学习的智能体”,这样它们才能各自在不同分支上展开自己的那棵“搜索树”。但如果只有一家公司最先拿到了这种智能体,或者说最先拿到了这种学习者,那看起来就会变成这样……毕竟如果你只考虑经济体系里的每一种工作,让模型的不同实例分别去学习每一种工作,对一家公司来说似乎是可行的。
Ilya Sutskever 01:28:19
That’s a valid argument. My strong intuition is that it’s not how it’s going to go. The argument says it will go this way, but my strong intuition is that it will not go this way. In theory, there is no difference between theory and practice. In practice, there is. I think that’s going to be one of those.
这是一个有道理的论点。不过我的强烈直觉是:事情最后不会那样发展。你的推理指向那条路径,但我的直觉非常强烈地觉得,现实不会按那条路径走。从理论上讲,理论与实践之间没有差别;但在实践中,差别是存在的。我觉得这会是“理论和实践不一样”的又一个例子。
Dwarkesh Patel 01:28:41
A lot of people’s models of recursive self-improvement literally, explicitly state we will have a million Ilyas in a server that are coming up with different ideas, and this will lead to a superintelligence emerging very fast.
很多人对“递归自我提升”的模型,几乎是字面上、明确地写着:我们会在一台服务器上跑一百万个 Ilya,让他们各自想不同的点子,这样超级智能就会很快涌现出来。
Do you have some intuition about how parallelizable the thing you are doing is? What are the gains from making copies of Ilya?
你对自己在做的事情“可并行化”的程度,有什么直觉吗?复制很多个 Ilya,实际能带来多少增益?
Ilya Sutskever 01:29:02
I don’t know. I think there’ll definitely be diminishing returns because you want people who think differently rather than the same. If there were literal copies of me, I’m not sure how much more incremental value you’d get. People who think differently, that’s what you want.
我也不太确定。但我觉得肯定会有“边际收益递减”,因为你真正需要的是思维方式不同的人,而不是一模一样的人。如果真的是复制了很多个“完全相同的我”,我不太确定那样还能额外带来多少真正的增量价值。你真正想要的,是那些思考方式彼此有差异的人。
巴菲特和乔布斯的思维方式其实是一样的,只是做了不同的工作。
01:29:23 – Self-play and multi-agent
01:29:23 – Self-play 和多智能体
Dwarkesh Patel 01:29:23
Why is it that if you look at different models, even released by totally different companies trained on potentially non-overlapping datasets, it’s actually crazy how similar LLMs are to each other?
为什么会出现这样一种情况:你去看不同的模型,即便是由完全不同的公司发布、在理论上可能并不重叠的数据集上训练出来的 LLM,它们之间相似得离谱?
Ilya Sutskever 01:29:38
Maybe the datasets are not as non-overlapping as it seems.
也许这些数据集之间,并没有看上去那么“互不重叠”。
Dwarkesh Patel 01:29:41
But there’s some sense in which even if an individual human might be less productive than the future AI, maybe there’s something to the fact that human teams have more diversity than teams of AIs might have. How do we elicit meaningful diversity among AIs? I think just raising the temperature just results in gibberish. You want something more like different scientists have different prejudices or different ideas. How do you get that kind of diversity among AI agents?
不过在某种意义上,即便单个人类个体在生产力上不如未来的 AI,人类“团队”的多样性,大概仍然会明显强于一群 AI 所组成的团队。这就引出一个问题:我们如何在 AI 之间激发出**有意义的多样性**?我觉得单纯把 temperature 调高,只会换来一堆胡言乱语。我们真正想要的是那种类似“不同的科学家各自有不同的成见、不同的想法”的差异。那么,对 AI agent 来说,我们要怎么才能得到这种层面的多样性?
Ilya Sutskever 01:30:06
So the reason there has been no diversity, I believe, is because of pre-training. All the pre-trained models are pretty much the same because they pre-train on the same data. Now RL and post-training is where some differentiation starts to emerge because different people come up with different RL training.
在我看来,之所以现在看不到什么多样性,原因在于 pre-training。所有这些预训练模型本质上都差不多,因为大家在 pre-training 阶段用的都是类似的数据。现在开始出现差异化的地方,主要是在 RL 和 post-training 上,因为不同团队会设计出不同的 RL 训练方案。
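A quick sketch of the “temperature” knob mentioned in the question above. Temperature rescales the logits before the softmax, so a high value flattens the next-token distribution: it adds randomness rather than genuinely different viewpoints, which is why cranking it up tends toward gibberish. The logit values below are made-up numbers, purely for illustration.

```python
import numpy as np

def softmax(logits: np.ndarray, temperature: float) -> np.ndarray:
    z = logits / temperature
    z = z - z.max()                # subtract the max for numerical stability
    p = np.exp(z)
    return p / p.sum()

logits = np.array([4.0, 3.5, 1.0, -2.0])   # hypothetical scores for 4 candidate tokens
for t in (0.5, 1.0, 2.0, 10.0):
    print(f"T={t:>4}: {np.round(softmax(logits, t), 3)}")
# As the temperature grows, the distribution approaches uniform:
# every token, sensible or not, becomes nearly equally likely.
```

Running it shows the probabilities spreading toward uniform as T grows, which is noise rather than the scientist-style diversity of priors the question is after.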
Dwarkesh Patel 01:30:26
I’ve heard you hint in the past about self-play as a way to either get data or match agents to other agents of equivalent intelligence to kick off learning. How should we think about why there are no public proposals of this kind of thing working with LLMs?
我之前听你暗示过,用 self-play 来获取数据,或者用它来把智能体和“同等智能水平的其他智能体”配对,从而启动某种学习过程。那我们该怎么理解这样一个现象:目前几乎没有公开的方案,能清楚展示这类东西在 LLM 上跑得非常好?
Ilya Sutskever 01:30:49
I would say there are two things to say. The reason why I thought self-play was interesting is because it offered a way to create models using compute only, without data. If you think that data is the ultimate bottleneck, then using compute only is very interesting. So that’s what makes it interesting.
我觉得可以从两点来说。首先,我之所以觉得 self-play 有意思,是因为它提供了一种只用算力、不用额外数据就能继续训练模型的途径。如果你认为“数据才是终极瓶颈”,那能只靠算力继续往前推,就变得非常有吸引力——self-play 的有趣之处在这里。
The thing is that self-play, at least the way it was done in the past—when you have agents which somehow compete with each other—it’s only good for developing a certain set of skills. It is too narrow. It’s only good for negotiation, conflict, certain social skills, strategizing, that kind of stuff. If you care about those skills, then self-play will be useful.
但问题在于,至少按过去那种做法——也就是让多个 agent 以某种方式彼此对抗——self-play 只对某一小撮技能特别有用,范围其实很窄。它适合用来发展谈判、博弈、冲突应对、部分社会性技能、以及制定策略这类东西。如果你在意的是这些能力,那 self-play 就会派上用场。
Actually, I think that self-play did find a home, but just in a different form. So things like debate, prover-verifier, you have some kind of an LLM-as-a-Judge which is also incentivized to find mistakes in your work. You could say this is not exactly self-play, but this is a related adversarial setup that people are doing, I believe.
实际上,我觉得 self-play 已经“找到归宿”了,只是换了一种形式。比如各种 debate、prover–verifier 结构,还有 LLM-as-a-Judge 这种设置——模型被激励去发现你推理或回答中的错误。严格说,这些不算传统意义上的 self-play,但它们属于同一个家族,是类似的对抗式架构,而我相信大家已经在这样用它们了。
Really self-play is a special case of more general competition between agents. The natural response to competition is to try to be different. So if you were to put multiple agents together and you tell them, “You all need to work on some problem and you are an agent and you’re inspecting what everyone else is working,” they’re going to say, “Well, if they’re already taking this approach, it’s not clear I should pursue it. I should pursue something differentiated.” So I think something like this could also create an incentive for a diversity of approaches.
本质上,self-play 只是“多智能体竞争”这一更一般框架中的一个特例。对竞争最自然的反应是:试着变得不一样。所以如果你把多个 agent 丢在一起,对它们说:“你们都要一起解决这个问题,同时你可以观察其他人正在做什么。”那它们自然会产生一种反应:“如果别人已经在走这条路,那我就不一定要再走同一条,我应该找一个差异化的方向。”所以我觉得,类似这样的设定,本身也可以在结构上为“方法上的多样性”创造激励。
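A minimal sketch, under assumed prompts and toy stand-in models, of the adversarial prover/verifier or LLM-as-a-Judge setup described above: one model proposes a solution, a second model is prompted to find mistakes in it, and the critique is fed back for revision. The function names, prompts, and stop condition are illustrative assumptions, not a recipe from the conversation; any real LLM call could be wrapped to fit the `Model` type.

```python
from typing import Callable

# "Prompt in, text out": any real LLM call could be wrapped to fit this shape.
Model = Callable[[str], str]

def prover_verifier_round(prover: Model, verifier: Model, task: str, max_rounds: int = 3) -> str:
    """Alternate between proposing a solution and adversarially critiquing it."""
    solution = prover(f"Solve the following task and show your reasoning.\n\nTask: {task}")
    for _ in range(max_rounds):
        critique = verifier(
            "You are a skeptical judge. Find a concrete mistake in the solution below. "
            "If you find none, reply exactly 'NO MISTAKE FOUND'.\n\n"
            f"Task: {task}\n\nSolution:\n{solution}"
        )
        if "NO MISTAKE FOUND" in critique:
            break  # the verifier could not attack the solution any further
        # Feed the critique back so the prover revises its answer.
        solution = prover(
            f"Task: {task}\n\nYour previous solution:\n{solution}\n\n"
            f"A reviewer found this problem:\n{critique}\n\nWrite a corrected solution."
        )
    return solution

if __name__ == "__main__":
    # Toy stand-ins so the sketch runs end to end without any real LLM.
    def toy_prover(prompt: str) -> str:
        return "2 + 2 = 4" if "corrected" in prompt else "2 + 2 = 5"

    def toy_verifier(prompt: str) -> str:
        return "NO MISTAKE FOUND" if "2 + 2 = 4" in prompt else "The arithmetic in the last line is wrong."

    print(prover_verifier_round(toy_prover, toy_verifier, "Compute 2 + 2."))
```

The toy prover and verifier at the bottom exist only so the loop runs; in practice both roles would be played by LLM calls, possibly by the same underlying model given different instructions.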
01:32:42 – Research taste
01:32:42 – 什么是“研究品味”
Dwarkesh Patel 01:32:42
Final question: What is research taste? You’re obviously the person in the world who is considered to have the best taste in doing research in AI. You were the co-author on the biggest things that have happened in the history of deep learning, from AlexNet to GPT-3 and so on. What is it, how do you characterize how you come up with these ideas?
最后一个问题:什么是“研究品味”(research taste)?在全世界范围内,大家显然普遍认为,你是做 AI 研究时“品味最好”的那个人之一。深度学习历史上那些最重要的成果——从 AlexNet 到 GPT-3 等等——你都是共同作者。那么,这种“研究品味”到底是什么?你会怎样刻画、描述自己是如何想出这些点子的?
Ilya Sutskever 01:33:14
I can comment on this for myself. I think different people do it differently. One thing that guides me personally is an aesthetic of how AI should be, by thinking about how people are, but thinking correctly. It’s very easy to think about how people are incorrectly, but what does it mean to think about people correctly?
我可以从自己的角度谈谈这个问题。我相信不同的人有不同的做法。对我个人来说,指导我的其中一件事,是一种关于“AI 应该是什么样子”的审美——这种审美来自于对“人类是怎样的”的思考,而且是**尽可能正确地**思考人类。很容易以错误的方式去想象人类是怎样的,但“正确地思考人类到底意味着什么”才是关键。
I’ll give you some examples. The idea of the artificial neuron is directly inspired by the brain, and it’s a great idea. Why? Because you say the brain has all these different organs, it has the folds, but the folds probably don’t matter. Why do we think that the neurons matter? Because there are many of them. It kind of feels right, so you want the neuron. You want some local learning rule that will change the connections between the neurons. It feels plausible that the brain does it.
我举几个例子。人工神经元(artificial neuron)这个概念,就是直接从大脑得到的灵感,而且是一个非常好的点子。为什么?因为你会观察到:大脑有各种不同的结构器官,有很多褶皱,但那些褶皱本身大概并不是关键。那我们为什么会觉得“神经元本身才是重要的”?因为它们数量极其庞大,这种“感觉上对了”的直觉会把你带到那里——你会想要用“神经元”这种基本单元;你还会希望有一种局部学习法则,能够改变神经元之间的连接权重,而这种设定也让人觉得“很像大脑就是这样干的”。
The idea of the distributed representation. The idea that the brain responds to experience therefore our neural net should learn from experience. The brain learns from experience, the neural net should learn from experience. You kind of ask yourself, is something fundamental or not fundamental? How things should be.
再比如 distributed representation(分布式表征)的思想;再比如,大脑会对“经验”产生响应,因此我们的神经网络也应该“从经验中学习”。大脑是通过经验在学习,那神经网络也应该通过经验来学习。你会不断地问自己:什么东西是“基本的”、是“底层”的,什么不是?事物“应该是怎样的”?
I think that’s been guiding me a fair bit, thinking from multiple angles and looking for almost beauty, beauty and simplicity. Ugliness, there’s no room for ugliness. It’s beauty, simplicity, elegance, correct inspiration from the brain. All of those things need to be present at the same time. The more they are present, the more confident you can be in a top-down belief.
我觉得这些一直在很大程度上指导着我:从多个角度思考,同时去寻找某种“接近美感的东西”——美感、简洁性。丑陋的东西,是不该留下空间的。你要追求的是:美、简洁、优雅,以及来自大脑的“正确灵感”。这些要素需要同时出现,而且出现得越充分,你就越能在“自上而下的信念”(top-down belief)上有信心。
The top-down belief is the thing that sustains you when the experiments contradict you. Because if you trust the data all the time, well sometimes you can be doing the correct thing but there’s a bug. But you don’t know that there is a bug. How can you tell that there is a bug? How do you know if you should keep debugging or you conclude it’s the wrong direction? It’s the top-down. You can say things have to be this way. Something like this has to work, therefore we’ve got to keep going. That’s the top-down, and it’s based on this multifaceted beauty and inspiration by the brain.
这种“自上而下的信念”,就是当实验结果暂时和你唱反调时,支撑你继续往前走的东西。因为如果你永远只信任数据,那有时会出现这样一种情况:你做的是对的,但实验里有 bug,而你不知道那里有 bug。那你要如何判断,到底还要不要继续调试?是该说“这方向错了”,还是该说“系统里还有没找到的问题”?靠的就是这种 top-down 信念。你会对自己说:“事物必须是这样的,这种结构总得有一种方式是能工作的,所以我们得继续干下去。”这种 top-down 信念,正是建立在多维度的“美感”和“来自大脑的正确灵感”之上的。
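A minimal sketch of the two brain-inspired ideas mentioned a few paragraphs above: an artificial neuron (a weighted sum of inputs pushed through a nonlinearity) and a purely local learning rule that changes each connection using only the activity at its two ends, in the Hebbian spirit. The input statistics and constants are made up for illustration; this is a sketch of the concept, not how the brain or any of the models discussed here actually learns.

```python
import numpy as np

rng = np.random.default_rng(0)

def neuron(x: np.ndarray, w: np.ndarray) -> float:
    """One artificial neuron: a weighted sum of inputs pushed through a nonlinearity."""
    return float(np.tanh(x @ w))

def local_update(w: np.ndarray, x: np.ndarray, y: float, lr: float = 0.01) -> np.ndarray:
    """Hebbian-style local rule: each weight changes using only its own input and the output."""
    w = w + lr * y * x                 # correlated activity strengthens the connection
    return w / np.linalg.norm(w)       # keep the weights bounded

w = rng.normal(size=3)
for _ in range(500):
    x = rng.normal(size=3) + np.array([1.0, 0.0, -1.0])  # inputs with a consistent direction
    y = neuron(x, w)
    w = local_update(w, x, y)

print("weights after local learning:", np.round(w, 2))
# The weights settle into a direction driven purely by the input statistics;
# no global error signal or backpropagation is involved.
```

The point of the sketch is the locality: every update uses only information available at that one connection, which is the property described above as feeling plausible for the brain.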
头脑清晰是在大脑发展的早期就已经有一些简洁优雅的知识结构,在后面的人生中泛化到其他领域;必须是非常早的时期。两个方向(有安全感或者受恐惧困扰)都是自我强化的;简洁优雅如果是胜出的一方,只可能出现在非常早的时期,可能是1岁以前,甚至是娘胎里都有可能。
Dwarkesh Patel 01:35:31
Alright, we’ll leave it there.
好,那我们今天就先到这里。
Ilya Sutskever 01:35:33
Thank you so much.
非常感谢。
Dwarkesh Patel 01:35:34
Ilya, thank you so much.
Ilya,非常感谢你。
Ilya Sutskever 01:35:36
Alright. Appreciate it.
好的,谢谢,感激不尽。
Dwarkesh Patel 01:35:37
That was great.
这次对话太棒了。
Ilya Sutskever 01:35:38
Yeah, I enjoyed it.
是的,我很享受这次对话。
Dwarkesh Patel 01:35:39
Yes, me too.
我也是。