Speaker 1:
Hey everyone. Thanks for joining us. We've got an I/O special with Sergey Brin. We're talking all things Google. Thanks for taking the time to chat.
Sergey Brin:
Thank you, Logan. You and I are in chat spaces all the time on all kinds of products, but it's nice to hang out in real life.
Speaker 1:
Yeah, it is. My California experience is always incredibly fun. I spent a bunch of time with Cora yesterday and today. You feel the warmth and the humanity of AI progress when you spend time in person with everyone.
So it's been a ton of fun. But we're sitting here at I/O, and I think the general sentiment from the world, and also from the team internally, is that it's been an incredibly great day for Google, with a ton of progress across models and across all of our products. What's your take? What's your reaction? Obviously, there's a lot of stuff we still need to do, but where's your head at?
Sergey Brin:
Yeah, I think it was definitely a phenomenal set of announcements. Honestly, I probably didn't even know about 30% of them or so. You know, there's only so much time, and I'm kind of deep in the weeds on Gemini. I didn't even know about the virtual fit, for example, for products in Google Search; I didn't realize we were shipping that.
So there were many things that even surprised me. That was great. I think the reception's been great. There are so many things, though; I think it takes a while for people to explore them and wrap their heads around them.
And obviously we're very busy delivering all of them right now, and there's a lot of energy across the board, just making sure things are actually shipping, shipping smoothly, that people are able to sign up for Ultra, have all those new features, and so forth.
Speaker 1:
Yeah, I feel like I/O is the start of lots of work for lots of other people. It's like the finish line for some teams and then the starting line for a bunch of other teams. How do you think about... Obviously, we launched more.
There are a bunch of Gemini announcements: Gemini Diffusion, which we'll talk more about later, and Deep Think, which is continuing to push the frontier on reasoning models. I see you in the reasoning strike chat all the time,
sort of pushing on people to keep pushing the frontier, which I love to see. How do you think about, maybe, your own focus, but also just generally the DeepMind team's focus around Veo and Imagen? We have a whole suite of generative media models; we just announced Lyria, our music model, and then also, at the same time, there's the core Gemini main model.
Do you work on the GenMedia stuff at all or is it mostly like laser focus on Gemini at the moment?
Sergey Brin:
Mostly I'm on Gemini, the core text model, primarily because I think that's what will lead to, for example, self-improvement; it will help us code and develop the science behind AI even better. So that's the number one focus that I have.
At the same time, the generative media is so amazing.
Speaker 1:
Yeah.
Sergey Brin:
I mean, that is just so superhuman, you know. With a text model, there are some math problems or whatever that I might be able to solve that it gets wrong, or it stumbles on a piece of code, although that's less and less frequent, and I actually end up now relying on Gemini to do some coding, math, and so forth. But nevertheless, it's sort of in the range of human.
Given my artistic talent, there'd be zero chance of me ever coming up with either an image or a video. I mean, just the amount of work that I imagine it would take if I was an expert videographer, 3D renderer, or special effects person, that's got to be, you know, a month of solid work to get the thing I can get in a few minutes.
And obviously, it's so visually compelling that, you know, it just kind of sucks you in. You can't escape it.
Speaker 1:
The audio piece with Veo just makes it feel different. Historically, you know, I personally think generating videos is awesome and stuff, but it's always kind of felt slightly gimmicky to me.
And when I saw audio in Veo 3 on stage yesterday, that was the moment for me where I was like, OK, this is actually different. So many people are going to be able to use this, because, practically, historically, you could generate a video, but then you'd have to go figure out where the audio comes from and how to sync everything up. And now you can make humans talking and having a conversation, and it just does it all well. It blew my mind. Yeah, you're right.
Sergey Brin:
I mean, I've been a huge fan of that. I'm a pretty visual person personally; I'm not, I guess, much of an audio person,
but over the years, especially, you know, with Google Glass, the moment we added some sound, it just added so much richness.
I mean, you're better off adding audio than adding 3D, for example, although some of the 3D stuff is cool if you've played with the big wearable thing. But anyhow, it's just an incredible change in perception when you get audio working. I saw the model training over the last month or two, from checkpoint to checkpoint, and I knew: wow, this is just going to feel different.
Speaker 1:
Yeah, it'll be interesting to see how those capabilities fuse, because it does seem like there are a lot of similarities to the mainline Gemini model.
And obviously, we landed native audio support in the mainline Gemini model at I/O, and in Veo as well. I was having a conversation with Tulsi this morning about just that: are those similar breakthroughs? Are they different?
It sounds like it's actually very different from a technology standpoint, but it is cool that we have other rails to do this innovation, and ideally it all upstreams back to Gemini in some way.
Sergey Brin:
Yeah. I mean, honestly, I think we've just taken a long time to ship the native audio in Gemini. It's been in there for...
Speaker 3:
I mean, over a year.
Speaker 1:
Oh, really?
Sergey Brin:
No, no, the base model has had audio that it's trained on for at least a year. And, um, I don't know, I think honestly it's just...
There's just so much to do, so much to ship, that nobody, for whatever reason, has gotten it out there.
And I mean native audio in, native audio out; I think native audio in has been in there even longer. But getting through all the little hoops to make it really work well and so forth, for whatever reason, just took a long time.
But yeah, that's finally out there. I don't think it's done, as you say, the same way Veo does it. I believe Veo does the audio also through diffusion, just like it does the video.
Speaker 1:
Yeah.
Sergey Brin:
In fact, if you watch during the training run, you can actually see it generating videos when it's, you know, a couple percent in, where the shapes aren't quite right and the words are kind of warbled and stuff like that, but then it takes shape and develops until, at the end of the run, you have what you see today.
So yeah, I'm pretty sure that's diffusion-based audio. I mean, diffusion is a really powerful technique. As you know, we shipped text diffusion as a small early test run.
I mean, I think that's one of the things I'm grateful for: we have the bench of machine learning researchers to pursue different base techniques simultaneously across modalities.
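As a rough picture of what "text diffusion" means, in contrast to generating tokens left to right, here is a toy sketch of iterative unmasking; it is only in the spirit of models like Gemini Diffusion, and the denoiser stand-in below is hypothetical.

```python
# Toy sketch of discrete text diffusion by iterative unmasking; only in the
# spirit of diffusion language models, with a fake denoiser standing in for
# a real network.
import random

TARGET = list("diffusion generates text in parallel")
MASK = "_"

def denoiser(seq):
    """Hypothetical stand-in: a real model would predict tokens from context."""
    return TARGET

def sample(steps=5):
    seq = [MASK] * len(TARGET)
    masked = list(range(len(TARGET)))
    random.shuffle(masked)
    for step in range(steps):
        # Each step commits a fraction of the still-masked positions to the
        # denoiser's predictions, so the text sharpens coarse-to-fine.
        k = len(masked) // (steps - step)
        preds = denoiser(seq)
        for idx in masked[:k]:
            seq[idx] = preds[idx]
        masked = masked[k:]
        print(f"step {step + 1}: {''.join(seq)}")
    return "".join(seq)

sample()
```

Early steps print mostly masks with a few committed tokens, which loosely mirrors Sergey's description of early checkpoints where "the shapes aren't quite right" before the output takes shape.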
Speaker 1:
Yeah, the results of Gemini Diffusion look super, super promising so far. I'm hopeful the model progress is there and it all fully works, because the demo works. We were talking off camera; the demo looks really good.
So hopefully the capability translates well and everything works from that perspective. You mentioned watching the training run before; I actually haven't seen what this looks like.
So what does it actually mean to watch the training run?
Sergey Brin:
Oh, okay. Well, maybe you've seen it for our text models, but we're able to test out the intermediate checkpoints at, you know, 10% of training, 20% of training, and so forth.
The model is weak at those points in time, but you can kind of get a sense for the trajectory. And so usually, especially if you have a big training run, where you're using a lot of compute and you have high hopes for it, you're going to test it out in various ways many times throughout the run. So you're going to have a pretty good sense of what it's on trend for.
So that's true for the text models, and that's true for the diffusion video model for Veo. All of these models have intermediate results that you can take a peek at, and if you're really deep in there, you're for sure checking them, because you're nervous and excited about what exactly it will produce.
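For a concrete, if simplified, picture of what probing intermediate checkpoints can look like, here is a minimal sketch: sample from snapshots at increasing fractions of training on a fixed probe set and watch the trajectory. DummyModel, load_checkpoint, and the prompts are hypothetical stand-ins, not Google's actual tooling.

```python
# Toy sketch of "watching the training run": sample from intermediate
# checkpoints on a fixed probe set and eyeball the trajectory.

class DummyModel:
    """Stand-in for a restored snapshot; real code would load actual weights."""

    def __init__(self, progress: float):
        self.progress = progress  # fraction of planned training completed

    def generate(self, prompt: str) -> str:
        # Placeholder output; a real checkpoint would produce real samples,
        # weak and garbled early on, increasingly coherent later.
        return f"<sample at {self.progress:.0%} of training for: {prompt[:40]}>"

def load_checkpoint(progress: float) -> DummyModel:
    return DummyModel(progress)

PROBE_PROMPTS = [
    "Prove that the sum of two even numbers is even.",
    "Write a function that reverses a linked list.",
]

# Probe at, say, 10%, 20%, 50%, and 100% of planned training steps; what
# matters is the trend from snapshot to snapshot, not any single output.
for progress in (0.1, 0.2, 0.5, 1.0):
    model = load_checkpoint(progress)
    for prompt in PROBE_PROMPTS:
        print(model.generate(prompt))
```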
Speaker 1:
Yeah, I need to come hang out at the MK more and watch; I'll come over later and watch some of them. I was listening to Sundar's conversation with Dave Friedberg, and Sundar made the comment that even 15 years ago, you and Larry and him, and the team at Google, were having conversations about what this future-facing AI moment would look like, and that it's eerily close to what you all were talking about 10 or 15 years ago.
I'm curious: what are the things that have been most surprising to you about this moment? We can ground it in products if you want, like search, or just the technology. What's been surprising, and what's been almost exactly what you would have expected to happen?
Sergey Brin:
Yeah, you know, I think from an intellectual standpoint, you can kind of reason your way through the singularity. Famously, Ray Kurzweil did this, I don't know, however many decades ago. I don't remember what date he said, was it 2037? I can't remember. He put some date on it based on his extrapolation. Today it looks like maybe that was kind of conservative. I don't know.
But you can intellectually sort of reason through it; I think to see it happening is altogether different. And when you were talking about it, whatever, 15 years ago, I won't say you were joking around, you were truly talking about it, but you were kind of imagining the science fiction future. It was almost like a game; you were just chatting with the other folks who were interested in it, and it was fun.
But yeah, as I said, seeing it actually start to happen, you know, feels very different.
Speaker 1:
Yeah.
Sergey Brin:
And of course, the way in which it happens is pretty surprising, and I can give you an example: the fact that language models seem to be, right now, the way that AI is developing.
I don't think you necessarily would have known that 15 years ago. In fact, DeepMind, in the past, and even now to a certain extent, has bet a lot on physical grounding, the idea that it's important to have a physical world to ground on, and we're obviously doing experiments in that vein, like Genie and whatnot.
But the fact that these language models have come as far as they have wasn't obvious. And an interesting side effect of that, especially with the thinking models, is that they are also surprisingly interpretable.
You can look through the thoughts of one of these thinking models and see how it's coming to the conclusions it's coming to. You can't necessarily inspect the weights of the model and try to infer stuff from that without a huge number of tools, but you can understand a lot of its reasoning in very understandable terms. You wouldn't necessarily have guessed that 15 years ago.
So that's been an interesting surprise, which I think gives a lot of comfort. Not infinite comfort, I'm not saying we should ignore it, but from a safety standpoint, the fact that these things do, to some extent, say what they're thinking is a big plus. And yeah, there are papers out there about how they sort of lie and stuff, but I think that's a relatively smaller effect.
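As a concrete aside on inspecting a thinking model's reasoning: the sketch below assumes the google-genai Python SDK and its thinking config; the exact model name and field names are assumptions based on the mid-2025 SDK, so check the current docs before relying on them.

```python
# Sketch: surfacing a thinking model's summarized reasoning next to its
# answer. Assumes the google-genai Python SDK's ThinkingConfig; verify the
# field names against the current SDK documentation.
from google import genai
from google.genai import types

client = genai.Client()  # picks up the API key from the environment

response = client.models.generate_content(
    model="gemini-2.5-pro",
    contents="A bat and a ball cost $1.10, and the bat costs $1.00 more "
             "than the ball. How much does the ball cost?",
    config=types.GenerateContentConfig(
        thinking_config=types.ThinkingConfig(include_thoughts=True)
    ),
)

# Thought parts carry the human-readable reasoning trace; the remaining
# parts are the final answer.
for part in response.candidates[0].content.parts:
    label = "THOUGHT" if part.thought else "ANSWER"
    print(f"[{label}] {part.text}")
```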
Speaker 1:
What's your sense, being close to the model training process today, of how different or how similar it's going to look as the models move from being text in, text out, token in, token out, to being actual systems? We've already taken steps there: in Gemini 2.0, search is natively in there, code execution is natively in there, and the model is learning that in the process.
Do you think the training infrastructure, or just the way we think about models, is going to be fundamentally different once they're not models anymore but really full systems that we're baking for people?
Sergey Brin:
I think it's the confluence of a few things. For one, it's kind of remarkable how architecturally similar all the different models are, including, for example, Veo; you would think video diffusion is very different from some text language model, but architecturally there's a huge amount shared. It's kind of astonishing how much is shared.
And a lot of it uses transformers at the core, which, thanks to Noam's crew, we've had now for approaching a decade. Now we're adding things like tool use and so forth; mostly those things come about during what we call post-training. And post-training is an increasing fraction of the overall training. It used to be that everything was like 99% pre-training, and it's sort of shifting: now maybe it's 90%, then it'll be 80%, and so forth.
And this post-training, which some people call fine-tuning, includes the RL kind of work that we do. It used to be just this tiny bit of shaping you do at the end, but now it's more and more material, and the kinds of things you're mentioning, tool use and so forth, come in during that now much larger phase. And yeah, I mean, look, that makes the model vastly more powerful.
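To illustrate the kind of behavior tool-use post-training teaches, here is a toy example in a made-up trace format: the model emits a structured tool call, reads the result, and answers, with a reward scoring the outcome. The schema, tool name, and reward function are hypothetical illustrations, not Gemini's actual setup.

```python
# Toy illustration of a tool-use trace of the kind post-training teaches.
# The schema, the "python" tool, and the reward are hypothetical, not
# Gemini's actual training format.

tool_use_example = {
    "prompt": "What is 37! divided by 35!?",
    "target_trace": [
        # The model is rewarded for emitting a structured call rather than
        # guessing the arithmetic in its head.
        {"role": "model",
         "tool_call": {"name": "python",
                       "code": "import math; "
                               "print(math.factorial(37) // math.factorial(35))"}},
        {"role": "tool", "result": "1332"},
        {"role": "model", "text": "37!/35! = 37 * 36 = 1332."},
    ],
}

def reward(example: dict) -> float:
    """Score a trace: here, simply whether the final answer is correct."""
    final_text = example["target_trace"][-1]["text"]
    return 1.0 if "1332" in final_text else 0.0

# During the RL phase, rewards like this shape the policy toward calling
# tools when they help.
print(reward(tool_use_example))  # 1.0
```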
Speaker 1:
Yeah. I've got two more questions, because I want to get you back in the office working so that we can keep making model progress. The first one is just around reasoning scaling. We showed the results of Deep Think, which sort of continues to scale up 2.5 Pro, letting it reason for longer with parallel thought processes. What's your general reaction to that?
I think it feels like we're so early in that scaling paradigm that there's going to be a huge amount of additional unlock, but obviously you're in the weeds on this one, so I'm curious what your thoughts are.
Sergey Brin:
We had about five different approaches to doing that kind of thing, and they all converged on this Deep Think. It was great to see all those people and those teams come together. Sometimes we fragment, and it takes a long time.
But in this case, we took the best ideas from all of them and combined them in one go, and it's definitely yielding stronger results, obviously. I think the more that continues to happen, the better. That is like a superpower.
If you can have these models, and I know lots of the top AI labs talk about this, instead of just thinking for a minute and spinning out an answer, if you can leave them to go for an hour, for a day, or maybe for a month, and they actually get you a significantly better answer to a really important question, that can be incredibly valuable. And that's kind of new, and it's non-trivial.
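One crude way to picture "parallel thought processes", and emphatically not Deep Think's actual method, is self-consistency sampling: run several independent reasoning passes and vote on the final answer, trading extra compute for reliability. The generate_trace function below is a hypothetical stand-in for a long model reasoning call.

```python
# Crude sketch of parallel thinking via self-consistency voting; not Deep
# Think's actual algorithm.
import random
from collections import Counter

def generate_trace(question: str, seed: int) -> str:
    """Hypothetical stand-in for one independent, long reasoning pass."""
    random.seed(seed)
    # Simulates noisy traces that mostly agree on the right answer, "42".
    return random.choice(["42", "42", "42", "41"])

def parallel_think(question: str, n_samples: int = 16) -> str:
    answers = [generate_trace(question, seed) for seed in range(n_samples)]
    # Majority vote across the parallel traces; more samples (or a longer
    # thinking budget per trace) buys a better answer at higher cost.
    answer, _count = Counter(answers).most_common(1)[0]
    return answer

print(parallel_think("What is the answer to the ultimate question?"))
```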
It's a little bit like how we cracked long context for input. We did that a while ago; we've had million-plus-token context for input for a year and a half or something like that.
Speaker 1:
Now we need infinite context. So you got to keep pushing.
Sergey Brin:
Yeah, I'm not saying a million is enough, but that generalization is sort of non-trivial. It's as if you were to go through Groundhog Day: over and over, you experience one sampled day as a person, you try this, you try that, and now all of a sudden you're meant to live your life, where stuff happens from day to day and week to week and month to month.
It's kind of a non-trivial generalization, but we figured out how to do that. On the output side, it's also non-trivial if all you do is, say, short little math problems.
Going from that, well, it's kind of like how we interview people: we ask them 10 interview questions or whatever, and then we expect them to build these big systems over months.
And it's not clear if that's actually the right way to test a person. But with the AI models, we do that a million times over: we're only training them to do short, little, clever math questions, coding things, whatever. And then the expectation that, from there, they can actually spend a long time developing something new that requires thinking over days, let's say, that's very non-trivial.
But that is a gap that we're starting to overcome, and that's a huge leap forward.
Speaker 1:
This example you gave around how we test and eval models is my consistent reminder that, as this AI moment has taught me, so much of life is actually an eval problem. Even the challenge of interviewing people and trying to build a great team is, at its heart, an eval problem, and humans haven't solved it.
So it's not surprising to me that we haven't solved the AI eval problem either; it is non-trivial to do. My final question for you is just in reaction to everything we're seeing at I/O and the pace of innovation.
Sundar had a slide on the screen, actually, Demis did, showing everything we shipped in 2024 and then everything we shipped in 2025 so far. And I'm pretty sure the 2025 section was bigger than the 2024 section.
So there's clear acceleration happening. At least on a personal level, having joined Google a little more than a year ago, Google truly feels like a startup experience to me.
I'm curious to get your reaction to that, but also just your reaction in general, having seen Google grow and expand and everything that's happened in the last 20 years. How do you think about that?
Sergey Brin:
Great question. I think companies need to periodically reinvent themselves. And there are different important technology shifts, I guess. We started as a web company. We had to make mobile work.
We were never particularly good at social, to be honest, as an example. But now we're in the AI domain, and it's exciting, because in some ways Google has always been an AI company. We've always been about large-scale data and analysis, and we're also the place that gave birth to a lot of modern large-scale machine learning, from Google Brain to the Transformer and so forth. It's in the company's DNA.
So this is a shift we should be really well-equipped to make. You know, any shift is probably hard on any company, but I'm feeling really good about it.
I think going from '24, where we were honestly catching up on many levels, to '25, particularly with the launch of 2.5 Pro...
Speaker 1:
Yeah.
Sergey Brin:
I mean, that was just a clear leap forward. And I know, whatever, on different benchmarks, maybe we were number one for a bit before, or not, but 2.5 Pro was a big step forward, kind of across the board.
And even to date, it remains number one on most leaderboards, with style control, without style control, however you measure it. So that's been just a really exciting leap forward.
And I think it's both sort of cause and effect of the science engine we have going behind it. It's going to help propel us forward, and it's thanks to all the science we were doing over the past year that we were able to finally produce that model. And in quick succession after that, a lot of other things have followed.
We've already gone through a few different iterations of the 2.5 Pro model. And I don't know if everybody noticed, but yesterday we launched the new 2.5 Flash.
Yeah, I mean, did you notice that on many measurements it's actually number two, behind 2.5 Pro?
Speaker 1:
Yeah.
Sergey Brin:
So we're one and two on many different leaderboards now with that Flash model, which, with all the other announcements, I just think a lot of people might have glossed over.
Speaker 1:
It got buried.
Sergey Brin:
But it's a super fast model, and it's really powerful; I think it's going to appeal to a lot of use cases. But really, with that cornerstone of 2.5 Pro this year, I think for us to be able to build off of that and continue the momentum is really exciting.
Speaker 1:
It's going to be a great year. Sergey, I appreciate you for taking the time. I appreciate you for pushing everyone hard. It's a ton of fun to see. And we have a special gift for you, which I'd love to see you unbox.
And somebody will bring it over to us in one sec.
Sergey Brin:
Well, thank you, Logan. And while they bring it over, I'll just say thank you. I see you working hard all the time, making all of our customers and partners happy and tracking the millions of issues that might arise. It's not that easy, you know, having these models that so many people and businesses want, getting them deployed, not having the TPUs melt down, every little nuance from function calling to caching to all the million things. And I see you being really good about putting the customers first, communicating the needs back to the team, being just really on top of the ball.
So, thank you.
Speaker 1:
The team's crushing it. No, thank you. The team's crushing it. Special gift for you.
Sergey Brin:
All right. Thank you. Shall I unbox it?
Speaker 1:
Yeah, yeah, you gotta unbox it right now; we gotta capture it. The thread of this was that one of the pieces, other than all the people inside of Google, that makes all this possible is our TPU.
Sergey Brin:
This is the TPU v4, which internally, by the way, we call Pufferfish. It's probably not too secret. I think Pufferfish is v4, right? I never know the external translation; we just call them that.
I mean, these were the hottest thing to come by like a year or two ago. We've moved on to newer generations now. But we still do a lot of our work on these, so this is great.
Speaker 1:
Yeah, hopefully we'll get a bunch of these in the MK for the team.
Sergey Brin:
This is really cool.
Speaker 1:
It's a real one. They had to take it out of a data center somewhere. It was not being used. We were not taking compute out.
Sergey Brin:
Really? Okay.
Speaker 1:
Be careful.
Sergey Brin:
We definitely need these. Sometimes some of the early samples are a little bit faulty and maybe it's one of those. But I appreciate it. That's super nice.
Speaker 1:
Of course.
Sergey Brin:
I'll give you a little zoom in on there. Thank you.
Speaker 1:
Thank you. Thank you. For folks who are listening, thanks for tuning in. This is Release Notes. Thanks for watching.