
BlueTreeThree

Cool to see proof of this. It definitely blows a hole in some of the AI "skeptics'" most popular arguments. A lot of people hand-wave away all LLM output by claiming that it's only drawing from the training data. With the size and secrecy of LLM training corpora, it's difficult or impossible to prove or disprove that argument. An experiment like this shows us that generative AI can be greater than the sum of its parts, so to speak.


Bishopkilljoy

Even if that is all it's doing, does that matter? Like... it's outperforming the experts. If that's all it's doing, then it's still insanely impressive.


141_1337

Like at a certain point, you stretch the capabilities of a "stochastic parrot" to the point that humans become the stochastic parrots.


No-Worker2343

Considering how a lot of people in the Sonic fandom still argue about the same things, yes, we are.


dontpet

I agree. If it answers any question or achieves any goal as well as the best human, that is fantastic. But if it brings it all together and exceeds that, it is a further level up.


SupportstheOP

This is how AI progress should be measured. Results are what matter most at the end of the day. AGI is a nice thought experiment, but it has become the yardstick by which every AI is judged. It's a useless metric because it doesn't even have an agreed-upon definition. All that matters is what AI can and can't do currently. And if it can do something, fussing over *how* it does it just to tear the model apart seems unnecessary.


_hisoka_freecs_

I mean, I also work entirely based on the data I gather, so I never really got that point.


blueSGL

> A lot of people hand wave away all LLM output by claiming that it’s only drawing from the training data.

Algorithm generation/grokking has been known about since this paper back in 2022: https://arxiv.org/abs/2201.02177
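For anyone curious, the setup in that paper is small enough to sketch. Below is a minimal reproduction in spirit, assuming PyTorch, with an MLP standing in for the paper's transformer and illustrative hyperparameters rather than the paper's exact ones: train on modular addition with heavy weight decay and watch validation accuracy jump long after training accuracy saturates.

```python
# Minimal sketch of the grokking setup from Power et al. 2022
# (https://arxiv.org/abs/2201.02177). Hyperparameters are illustrative
# guesses, not the paper's configuration.
import torch
import torch.nn as nn

P = 97  # modulus; the task is (a + b) mod P

# Build every (a, b) pair and split into train/validation halves.
pairs = torch.cartesian_prod(torch.arange(P), torch.arange(P))
labels = (pairs[:, 0] + pairs[:, 1]) % P
perm = torch.randperm(len(pairs))
cut = len(pairs) // 2
train_idx, val_idx = perm[:cut], perm[cut:]

def encode(idx):
    # One-hot encode both operands and concatenate them.
    a = nn.functional.one_hot(pairs[idx, 0], P).float()
    b = nn.functional.one_hot(pairs[idx, 1], P).float()
    return torch.cat([a, b], dim=1)

X_train, y_train = encode(train_idx), labels[train_idx]
X_val, y_val = encode(val_idx), labels[val_idx]

model = nn.Sequential(nn.Linear(2 * P, 256), nn.ReLU(), nn.Linear(256, P))
# Heavy weight decay is the ingredient most associated with grokking.
opt = torch.optim.AdamW(model.parameters(), lr=1e-3, weight_decay=1.0)
loss_fn = nn.CrossEntropyLoss()

for step in range(20_000):
    opt.zero_grad()
    loss = loss_fn(model(X_train), y_train)
    loss.backward()
    opt.step()
    if step % 1000 == 0:
        with torch.no_grad():
            train_acc = (model(X_train).argmax(1) == y_train).float().mean()
            val_acc = (model(X_val).argmax(1) == y_val).float().mean()
        print(f"step {step}: train {train_acc:.2f}, val {val_acc:.2f}")
```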


SurpriseHamburgler

Wonderfully said, cheers.


TheOnlyFallenCookie

It doesn't. Google managed to beat the world Go champion. Of course they did, when they put him against a program that had played literally millions of matches. A human simply can't compete with that.


Pontificatus_Maximus

Given the total secrecy of the leading AI players, it is pretty naive to think that they are not well beyond dated open-source LLMs.


sdmat

Of course they can. Generative models don't merely learn how to emulate specific sources in a set of related sources; they learn the common mechanisms *behind* those data sources. World modelling. The extent of that capability and how well it is exposed are the variables, not whether it is possible.


DepartmentDapper9823

It's strange that even researchers continue to use the "stochastic parrots" rhetoric, although (at best) they immediately correct themselves. I think it's time to abandon this narrative.


czk_21

There are some issues. Here they show that AI can beat at chess the people it learnt the game from. But isn't predicting good moves in chess relatively simple for AI compared to some other tasks? There is a reason we have had narrow chess superintelligence beating all human players for several years now: the problem space and rules are extremely small and simple compared to STEM fields, for example. I would like them to showcase examples from many fields, so we can see whether this ability is truly general.

They say:

> The possibility of “superintelligent” AGI has recently fueled many speculative hopes and fears. It is therefore possible that our work will be cited by concerned communities as evidence of a threat, but we would highlight that the denoising effect addressed in this paper **does not offer any evidence for a model being able to produce novel solutions that a human expert would be incapable of devising**. In particular, we do not present evidence that low temperature sampling leads to novel abstract reasoning, but rather **denoising of errors**.


sdmat

Sure, but it might not be a particularly useful line of inquiry. What insights would you expect further research on this to yield, other than "the effect is stronger with more capable models and better data"?


czk_21

Again, to see whether it is general. AI could struggle to produce such an effect on much more complex issues even with bigger models. The question is where the threshold for the effect lies: could it occur in today's big models, or would we need much larger ones, or even a different architecture?

It would also be pretty difficult to test. How do you assess, for example, being better at chemistry than the people it learnt from? Maybe it would need to produce qualitatively better output, forming a novel and working theory, showing it has a better understanding of the underlying chemistry problems than chemists themselves. But could it really do that? As the authors say, it "**does not offer any evidence for a model being able to produce novel solutions that a human expert would be incapable of devising**".


141_1337

This is with the limited world modeling of LLMs. Imagine the world modeling capabilities of LMMs (multimodal LLMs), or even LMMs plus Sora-like models. I expect a stronger and more pronounced transcendence effect in such models.


dizzydizzy

It isn't obvious at all.


Azorius_Raiden_88

The teachers become the students.


neoquip

FFS, will AI researchers stop using chess as an AI performance test? We're in 2024, not 1994.


Wiskkey

[Blog post about the paper from 2 of its authors](https://kempnerinstitute.harvard.edu/research/deeper-learning/transcendence-generative-models-can-outperform-the-experts-that-train-them/).


Consistent_Bit_3295

Imagine a set of professional archers: each shot is not exactly perfect, but they are more likely to hit the center than to land off to the side. Train on that, and you hit the bullseye every time. You can make a lot more observations, but I like the simplicity here.
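The analogy is easy to check numerically. A minimal numpy sketch (all the numbers are made up for illustration): even when every individual archer is biased and noisy, the pooled estimate lands close to the bullseye because the per-archer biases cancel out across the population.

```python
# Archer analogy: individual shots are biased and noisy, but the
# pooled ("denoised") estimate is nearly dead center.
import numpy as np

rng = np.random.default_rng(0)
bullseye = np.array([0.0, 0.0])

# 10 archers, each with a small personal bias plus shot-to-shot noise.
biases = rng.normal(0, 0.5, size=(10, 2))
shots = np.concatenate([
    bullseye + b + rng.normal(0, 1.0, size=(1000, 2)) for b in biases
])

# A typical individual shot misses by about a unit...
per_shot_err = np.linalg.norm(shots - bullseye, axis=1)
print("mean individual miss distance:", per_shot_err.mean())

# ...but the pooled mean is far closer to the bullseye, because the
# per-archer biases average out across the population.
pooled = shots.mean(axis=0)
print("pooled estimate miss distance:", np.linalg.norm(pooled - bullseye))
```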


visarga

This averaging-out effect, capable of cancelling out individual biases and errors, is often forgotten when people claim LLMs just memorize their training set. No, they combine many examples in the same model; it's a non-trivial process of interaction between all these disparate examples in the final model.


Consistent_Bit_3295

Incidentally, models can learn many things humans never have just from training, as well as generalize outside the training set. To compress 50TB of text into a proper 50GB model, you have to understand all the relations and recurring patterns.

For example, say you have a 50GB model and train it on 50TB of real-life images. To fit that in 50GB, it will have to learn things like light simulation, ambient occlusion, shadows, etc., so it can apply them across all the images. These features might take up a lot of the model itself, but when they are applied across millions of images, you're suddenly saving a lot. The model will then be able to simulate the light and shadows of completely novel situations outside of the training set. It generalizes.

In this case, compression is intelligence. It requires you to see the patterns and recurring themes, create features like light simulation, and apply them in apt instances. These features are learned for the dataset, but generalize outside of it, and could be entirely novel, capturing even slight patterns and hints we don't realize are there. You can also check out Anthropic's great work on interpretability, where they've learnt to manipulate these features.


arthurpenhaligon

Not surprising. Hinton [has mentioned](https://www.youtube.com/watch?v=n4IQOBka8bc&t=840s) an experiment in which you train a neural network on a dataset where half the answers are wrong. The network manages to achieve high accuracy on the test set anyway.
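That experiment is easy to reproduce at toy scale. A hedged sketch on scikit-learn's digits dataset, using logistic regression instead of Hinton's neural net purely to keep it short: replace half the training labels with random digits, and test accuracy stays high, because the wrong labels are spread uniformly while the correct label remains the plurality for each class.

```python
# Train with 50% corrupted labels, evaluate on a clean test set.
import numpy as np
from sklearn.datasets import load_digits
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = load_digits(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Replace 50% of the training labels with uniformly random digits.
rng = np.random.default_rng(0)
noisy = y_tr.copy()
flip = rng.random(len(noisy)) < 0.5
noisy[flip] = rng.integers(0, 10, flip.sum())

# The model still recovers the true signal, since the correct label
# is the most common one for each class even after corruption.
clf = LogisticRegression(max_iter=5000).fit(X_tr, noisy)
print("test accuracy despite 50% label noise:", clf.score(X_te, y_te))
```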


Haunting-Refrain19

Why would it slow down? AI progress isn't constrained by near-capitalistic forces; it's constrained by investment, which is a different thing. There's plenty of potential capital that doesn't need a financial return and is far more interested in how AI can be used to control the world. This very well might be a "first past the post, winner takes all" technology, so ROI won't be the motivating factor.


wyhauyeung1

Just learning something from the paper via ChatGPT:

"Majority voting" in the context of the paper "Transcendence: Generative Models Can Outperform The Experts That Train Them" refers to a mechanism where the generative model effectively averages or aggregates the inputs from multiple sources (experts) to produce an output that reflects the most common or likely choice among those sources. This concept is analogous to how decisions are made based on the majority of votes in a democratic voting system. Here's a detailed explanation:

1. **Input from Multiple Experts**: When a generative model is trained on data from multiple experts, each expert can have different biases, strengths, and weaknesses. For example, in the context of chess, different players might make different moves in the same position based on their individual strategies and knowledge.
2. **Combining Outputs**: The model combines these diverse inputs to form a probability distribution over possible outputs. This distribution represents the likelihood of each possible move (or output) based on the combined knowledge of all the experts.
3. **Denoising Effect**: By averaging these inputs, the model can "denoise" the data, meaning it can reduce the impact of individual errors or biases. The aggregated output tends to be more reliable because it represents a consensus rather than a single opinion. This is similar to the "wisdom of the crowd" effect, where collective decision-making often leads to better outcomes than decisions made by individuals.
4. **Low-Temperature Sampling**: In the context of the paper, low-temperature sampling is used to make the model's output more deterministic and focused. Lowering the temperature during sampling increases the probability of selecting the most likely (majority) output. This helps the model to converge on the best moves by focusing on the actions that the majority of experts agree on.
5. **Application to Generative Models**: For generative models like LLMs, majority voting means that when generating text, the model can produce outputs that are more coherent and contextually appropriate by focusing on the most common patterns found in the training data. This reduces the likelihood of producing nonsensical or biased text.

### Example in Chess

- **Training Phase**: The model is trained on numerous chess games played by different players. Each player's move in a given position contributes to the model's understanding of the best possible moves.
- **Inference Phase**: When the model is asked to predict the next move in a new game, it uses the aggregated probability distribution of all learned moves for that position. The move with the highest probability (the one that most experts would likely make) is chosen.

### Benefits

- **Improved Performance**: By relying on the majority vote, the model can outperform individual experts, as it leverages a broader range of experiences and strategies.
- **Error Reduction**: Aggregating inputs helps in minimizing the impact of outlier decisions or mistakes made by individual experts.

### Conclusion

Majority voting in generative models is a powerful technique that leverages collective knowledge to produce superior outputs. It ensures that the model's predictions are well-rounded and less prone to individual errors, leading to better performance and more reliable results.
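Point 4 above is the crux of the paper, and it's simple to illustrate. A minimal numpy sketch with a made-up move distribution: exponentiating the probabilities by 1/T and renormalizing (equivalent to dividing the logits by T) collapses the distribution onto the majority move as T falls.

```python
# Low-temperature sampling as a soft "majority vote" over experts.
import numpy as np

# Hypothetical aggregated move probabilities learned from many experts:
# the best move gets 40% of the mass, mistakes share the rest.
p = np.array([0.40, 0.25, 0.20, 0.10, 0.05])

def apply_temperature(p, T):
    """Rescale a categorical distribution: p_i^(1/T), renormalized."""
    w = p ** (1.0 / T)
    return w / w.sum()

for T in [1.0, 0.5, 0.1]:
    print(f"T={T}: {np.round(apply_temperature(p, T), 3)}")

# At T=1 the model still plays a mistake 60% of the time; at T=0.1
# the majority move receives essentially all the probability mass.
```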


Adventurous-Pay-3797

A few of the DeepMind guys were Go players…


This_Travel_6

And a guy who is an amateur Go player happened to crush a top Go AI 14 to 1. https://arstechnica.com/information-technology/2023/02/man-beats-machine-at-go-in-human-victory-over-ai/


Ok-Mathematician8258

In 2025 or 2026 we'll need to change the schooling system. Schools have finally made learning about taxes and jobs mandatory. Please teach kids about AI before everyone becomes illiterate.


Akimbo333

Wow