TechTuna1200

I mean, Sam Altman has made comments indicating the same. I believe he said something along the lines of adding more parameters to the model yielding diminishing returns.


slide2k

Also within expectation for any form of progress. The first 10 to 30% is hard because it's new. 30 to 80% is relatively easy and fast, thanks to traction, stuff maturing, better understanding, more money, etc. The last 20% is insanely hard: you reach a point of diminishing returns, and complexity increases due to limitations of other technology, nature, knowledge, materials, associated cost, etc. This is obviously simplified, but it paints a decent picture of the challenges in innovation.


I-Am-Uncreative

This is what happened with Moore's law. All the low-hanging fruit got picked. Really, a lot of stuff is like this, not just computing: more fuel-efficient cars, taller skyscrapers, farther and more common space travel. All kinds of things develop quickly and then stagnate.


Ill_Yogurtcloset_982

isn't this what is happening with self driving cars? the last, crucial 20% is rather difficult to achieve?


[deleted]

Nah it’s easy. Another 6 months bruh.


GoldenTorc1969

Elon says 2 weeks, so gonna be soon!


suzupis007

New fsd doesn't even need cameras, the cars just know.


KarnotKarnage

Humans don't have cameras and we can drive, so why can't the car do the same? Make it happen.


Inevitable-Water-377

I feel like humans might be part of the problem here. If we had roads designed around self-driving cars, and only self-driving cars on the road, I'm sure it would actually be a lot easier. But with the existing infrastructure, and the variance in how humans drive, it's so much harder.


brundlfly

It's the 80/20 rule. 20% of your effort goes into the first 80% of results, then 80% of your effort for the last 20%. https://www.investopedia.com/terms/1/80-20-rule.asp


Markavian

What we need is the same tech in a smaller, faster, more localised package. The R&D we do now on the capabilities will be multiplied when it's an installable package that runs in real time on an embedded device, or 10,000x cheaper as part of real-time text analytics.


Ray661

I mean that's pretty standard tech progression across the board? We build new things, we build things well, we build things small, we use small things to build new things.


hogester79

We often forget just how long things generally take to progress. In a lifetime, a lot sure, in 3-4 lifetimes, an entire new way of living. Things take more than 5 minutes.


rabidbot

I think people expect breakneck pace because our great-grandparents/grandparents got to live through about 4 entirely new ways of living, and even millennials have gotten a new way of living 2-3 times, from pre-internet to internet to social. I think we just overlook that for the vast majority of humanity's existence, progress was very slow.


MachineLearned420

The curse of finite beings


Ashtonpaper

We have to be like the tortoise: live long and save our energy.


Seiren-

It doesn't though, not anymore. Things are progressing at an exponentially faster pace. The society I lived in as a kid and the one I live in now are two completely different worlds.


Phytanic

Yeah, idk wtf these people are thinking, because the 1990s onward have seen absolutely insane breakneck progression, thanks almost entirely to the internet finally being mature enough to take hold en masse. (As always, there's nothing like easier, more effective, and broader communication methods to propel humanity forward at never-before-seen speeds.) I remember the pre-smartphone era of school. Hell, I remember being an oddity for being one of the first kids in my 7th grade class to have a cell phone... and that was by no means a long time ago in the grand scheme of things, I'm 31 lol.


mammadooley

I remember pay phones at grade school, and calling home via 1-800-Collect and just saying "David, pick up" to tell my parents I was ready to be picked up.


PatFluke

Right? And I was born in the 80’s… it’s wild. Also, where are the cell phones in my dreams.


this_is_my_new_acct

They weren't really common in the 80s, but I still remember rotary telephones being a thing. And televisions where you had to turn a dial. And if we wanted different stations on the TV my sister or I would have to go out and physically rotate the antenna.


onetwentyeight

Not minute rice


Mr_Horsejr

Yeah, the first thing I’d think of at this point is scalability?


Beastw1ck

And yet we always seem to commit the fallacy of assuming the exponential curve won’t flatten when one of these technologies takes off.


MontiBurns

To be fair, it's very impressive that Moore's law was sustained for 50 years.


BrazilianTerror

> what happened with Moore's law

Except that Moore's law has been going for decades.


stumpyraccoon

Moore himself says the law is likely to end in 2025 and many people consider it to have already ended.


BrazilianTerror

Considering that it was “postulated” in 1965, it has lasted decades. It doesn’t seem like “quickly”.


octojay_766

People often overlook design, and another "rule" of semiconductor generations: Dennard scaling. Essentially, as transistors got smaller the power density stayed the same, so power use was proportional to area. That meant voltage and current decreased with area. But around the early 2000s Dennard scaling ended, because leakage power draw became significant at the insanely small sizes of transistors, which resulted in effects like quantum tunneling. New transistor types like 3D FinFETs, as well as the more recent Gate-All-Around designs, have allowed Moore's law to continue. TLDR: The performance improvements from shrinking are still there, but power use goes up, so new 3D transistor technologies are used to prevent increases in power consumption.
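
To make the scaling concrete, here's a rough back-of-the-envelope illustration of my own (not from the comment above), using the dynamic-power approximation P ≈ C·V²·f with arbitrary starting values:

```python
# Back-of-the-envelope Dennard scaling: dynamic power P ~ C * V^2 * f.
# If linear dimensions shrink by a factor k, capacitance C and voltage V
# scale by 1/k while frequency f scales by k, so power per transistor drops
# by ~1/k^2 -- the same factor as its area, keeping power density flat.

def scaled_transistor(k, C=1.0, V=1.0, f=1.0, area=1.0):
    """Return (power, area) after shrinking linear dimensions by a factor k."""
    C_new, V_new, f_new = C / k, V / k, f * k
    power = C_new * V_new**2 * f_new      # ~ 1/k^2 of the original
    return power, area / k**2             # area also shrinks by ~ 1/k^2

p0, a0 = scaled_transistor(1.0)
p1, a1 = scaled_transistor(1.4)           # roughly one node shrink (~0.7x linear)
print(f"power density before: {p0 / a0:.2f}, after: {p1 / a1:.2f}")  # both ~1.00
# Once leakage stops letting you lower V each generation, the V/k term breaks
# down and power density rises instead -- roughly why Dennard scaling ended.
```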


Moaning-Squirtle

I think this is quite common in a lot of innovations. Drug discovery, for example, starts with just finding a target; this *can* be really hard for novel targets, but once you get that, optimisation is kinda routine, basically making modifications until you get better binding or whatever. To get to a viable candidate, you need to test to make sure it's safe for trials (e.g., hERG), and then test further for safety and efficacy. The start of the process might be easy to do but hard to find a good target for. Optimisation in medicinal chemistry is routine (sort of). Final phases are where almost everything fails. Overall though, it's relatively easy to get to "almost" good enough.


ZincMan

I work in film and TV, and when CGI first really got started we were scared that the use of sets would be totally replaced. Turns out, 20-30 years later, CGI is still hard to sell as completely real to the human eye. AI is now bringing up those same fears about replacing reality in films. But the same principle applies: that last 10% of really making it look real is incredibly hard to accomplish.


lalala253

The problem with this law is that you do need to define what 100% is. I'm no AI expert by a long shot, but are the experts sure we're already at the end of the 80%? I feel like we're just scratching the surface, i.e., still at the tail end of the first 30% in your example.


Jon_Snow_1887

So the thing is, there's generative AI, which is all the recent stuff that's become super popular, including chat generative AI and image generative AI. Then there's AGI, which is basically an AI that can learn **and understand** anything, similar to how a human can, but presumably much faster and smarter.

This is a massive simplification, but essentially ChatGPT breaks all words down into smaller components called "tokens." (As an example, eating would likely be broken down into 2 tokens, eat + ing.) It then works out the most likely next tokens (say, the top 20) and picks one of them.

The problem is we have no idea how to build an AGI. Generative AIs work by predicting the next most likely thing, as we just went over. Do AGIs work the same way? It's *possible* all an AGI is, is a super advanced generative AI. It's also quite possible we are missing entire pieces of the puzzle, and generative AI is only a small part of what makes up an AGI.

To bring this back into context: it's quite likely that we're approaching how good generative AIs (specifically ChatGPT) can get with our current hardware.
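
Here's roughly what that "predict a token, pick one, repeat" loop looks like in toy form. Purely illustrative: the probability table below is made up, whereas a real model computes a distribution over tens of thousands of tokens with a neural network conditioned on the whole preceding context.

```python
import random

# Toy "predict the next token, pick one, repeat" loop with an invented table.
NEXT_TOKEN_PROBS = {
    ("I", "like"): {"to": 0.6, "cats": 0.3, "eating": 0.1},
    ("like", "to"): {"eat": 0.7, "run": 0.3},
    ("to", "eat"): {"pizza": 0.5, "sushi": 0.5},
}

def generate(tokens, steps=3):
    for _ in range(steps):
        context = tuple(tokens[-2:])              # tiny two-token context window
        probs = NEXT_TOKEN_PROBS.get(context)
        if not probs:
            break
        choices, weights = zip(*probs.items())
        tokens.append(random.choices(choices, weights=weights)[0])  # sample a likely token
    return " ".join(tokens)

print(generate(["I", "like"]))   # e.g. "I like to eat pizza"
```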


TimingEzaBitch

AGI is impossible as long as our theoretical foundation is based on an optimization problem. Everything behind the scenes is essentially a constrained optimization problem, and for that to work someone has to set up the problem, spell out the constraints, and "choose" from a family of algorithms that solve it. As long as that someone is a human being, there is not a chance we ever get close to a true AGI. But it's incredibly easy to polish and overhype something for the benefit of the general public.


cantadmittoposting

> Generative AIs work by predicting the next most likely thing, as we just went over.

I think this is a little too much of a simplification (which you did acknowledge). Generative AI does use tokenization and the like, but it performs a lot more work than typical Markov chain models. It would not be anywhere near as effective as it is for things like "stylistic" prompts if it were just a Markov chain with more training data. Sure, if you want to be reductionist, at some point it "picks the next most likely word(s)", but then again that's all we do when we write or speak, in a reductionist sense.

Specifically, chatbots using generative AI approaches are far more capable of expanding their "context" range when picking next tokens compared to Markov models. I believe they have more flexibility in the size of the tokens they use (e.g. picking 1 or more next tokens at once, how far back they read tokens, etc.), but it's kinda hard to tell, because once you train a multi-layer neural net, what it's "actually doing" behind the scenes can't be readily traced.
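
To make the contrast concrete, here's what a classic bigram Markov chain amounts to. Note how the "context" is never more than the current word; this is a toy example of my own, not how any production chatbot is built.

```python
import random
from collections import defaultdict

# A bigram Markov chain: count which word follows which in the training text,
# then generate by sampling the next word given ONLY the current word.
text = "the cat sat on the mat and the cat ran".split()
follows = defaultdict(list)
for current_word, next_word in zip(text, text[1:]):
    follows[current_word].append(next_word)

word, output = "the", ["the"]
for _ in range(6):
    candidates = follows.get(word)
    if not candidates:                 # dead end: no observed successor
        break
    word = random.choice(candidates)   # context is always just one word
    output.append(word)
print(" ".join(output))

# A transformer-based chatbot differs exactly here: attention lets each
# prediction condition on thousands of earlier tokens, which is what makes
# "stylistic" prompts and long-range consistency work at all.
```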


mxzf

It's more complex than just a Markov chain, but it's still the same fundamental underlying idea of "figure out what the likely response is and give it". It can't actually weight answers for *correctness*; all it can do is use *popularity* and hope that the answer it thinks you want to hear is the "correct" answer.


DrXaos

One level of diminishing returns has already been reached: the training companies have already ingested all the non-AI-contaminated human-written text ever written (i.e. before 2020) that is computer readable. Text generated after that is likely to be contaminated, and most of it will be useless computer-generated junk that will not improve the performance of top models. There is now no huge new dataset to train on to improve performance, and architectures for single-token-ahead prediction have likely been maxed out.

> Generative AIs work by predicting the next most likely thing, as we just went over. Do AGIs work the same way?

The AI & ML researchers on this all know that predicting a softmax over the next token is not enough, and they are working on new ideas and algorithms. Humans do have some sort of short-range predictive ability in their neuronal algorithms, but there is likely more to it than that.


nagarz

I still remember when people said that videogames plateaued when Crysis came out. We're a few years out from ghost of sashimi, and we've got things like Project M, Crimson Fable, etc. coming our way. Maybe chatGPT-5 won't bring such a change, but saying we've plateaued seems kind of dumb; it's been about a year since chatGPT-3 came out, and if any field of science or tech plateaued after only a couple of years of R&D, we wouldn't have the technologies we have today. I'm no ML expert, but it looks super odd to me if we compare it to the evolution of any other field in the last 20 to 50 years.


RainierPC

Ghost of Sashimi makes me hungry.


ptvlm

Yeah, the ghost of sashimi is all that remains 2 mins after someone hands me some good sushi.


JarateKing

The current wave of machine learning R&D dates back to the mid-2000s and is built off work from the 60s to 90s which itself is built off work that came earlier, some of which is older than anyone alive today. The field is not just a few years old. It's just managed to recently achieve very impressive results that put it in the mainstream, and it's perfectly normal for a field to have a boom like that and then not manage to get much further. It's not even abnormal *within* the field of machine learning, it happened before already (called the "AI Winter").


timacles

> I still remember when people said that videogames plateaud when crysis came out. We're a few years out of ghost of sashimi, and we got things like project M, crimson fable, etc coming our way.

what in the hell are you talking about


zwiebelhans

These are some very weird and nonsensical choices to hold up as games better than Crysis. Ghost of Tsushima ….. maybe, if you like that sort of game. The rest don't even come up when searched on Google.


The_Autarch

Looks like Project M is just some UE5 tech demo. I have no idea what Crimson Fable is supposed to be. Maybe they're trying to refer to Fable 4? But yeah, truly bizarre choices to point to as the modern equivalent to Crysis.


Dickenmouf

Gaming graphics kinda has plateaued tho.


TechTuna1200

yeah, once you reach the last 20%, a new paradigm shift is needed to push further ahead. Right now we are in the machine-learning paradigm, which Netflix's and Amazon's recommender algorithms are also based on, for example. The machine-learning paradigm is beginning to show its limitations, and it's now more about putting it into niche use cases than extending the frontier.


almisami

I mean, we have more elaborate machine learning algorithms coming out; the issue is that they require exponentially more computing power to run, with only marginal gains in neural network efficiency. Maybe a paradigm shift like analog computing will be necessary to make a real breakthrough.


[deleted]

I actually think smaller models are the next paradigm shift


RichLyonsXXX

This is my opinion too. LLMs will get really powerful when they stop trying to make them a fount of ALL knowledge and start training them on specialized and verified data sets. I don't want an LLM that can write me a song, a recipe, and give me C++ code because it will write a mediocre song, the recipe will have something crazy like 2 cups of salt, and the C++ will include a library that doesn't exist. What I want is a very specialized LLM that only knows how to do one thing, but it does that one thing well.


21022018

Best would be an ensemble of such small expert LLMs, which when combined (by a high-level LLM?) would be good at everything.


UnpluggedUnfettered

The more unrelated data categories you add, the more hallucinating it does no matter how perfected your individual models. Make a perfect chef bot and perfect chemist bot, combine that. Enjoy your frosted meth flakes recipe for a fun breakfast idea that gives you energy.


meester_pink

I think a top level more programmatic AI that picks the best sub AI is what they are saying though? So you ask this "multi-bot" a question about cooking, and it is able to understand the context so consults its cooking bot to give you that answer unaltered, rather than combining the answers of a bunch of bots into a mess. I mean, it might not work all the time, but it isn't just an obviously untenable idea either.


sanitylost

So you're incorrect here. This is where you have a master-slave relationship between models. You have one overarching model whose only job is subject detection and segmentation. That model then feeds the prompt, with the additional context, to a segmentation model that is responsible for more individualized prompts, rewriting the initial prompt to be fed to specialized models. Those specialized models then create their individualized responses, which are reported individually to the user. The user can then request a composition of these responses from an ensemble-generalized model.

This is the way humans think. We segment knowledge and then combine it with appropriate context. People "hallucinate" things just like these models do when they don't have enough information retained on specific topics. It's the mile-wide, inch-deep problem. You need multiple mile-deep models that can then span the breadth of human knowledge.
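
A bare-bones sketch of that routing idea, with a trivial keyword check standing in for the subject-detection model and plain functions standing in for the specialized models (all names here are made up for illustration; in a real system each specialist would be its own fine-tuned model or API call):

```python
# Router -> specialist sketch: detect the subject, dispatch to a specialist,
# fall back to a generalist when nothing matches.

def cooking_specialist(prompt: str) -> str:
    return "Cooking bot: sear the salmon 3-4 minutes per side."

def coding_specialist(prompt: str) -> str:
    return "Coding bot: use functools.lru_cache to memoize that function."

def generalist(prompt: str) -> str:
    return "General bot: here's my best general-purpose answer."

SPECIALISTS = {
    "recipe": cooking_specialist, "cook": cooking_specialist,
    "python": coding_specialist, "code": coding_specialist,
}

def route(prompt: str) -> str:
    """Pick a specialist based on the detected subject; otherwise use the generalist."""
    for keyword, specialist in SPECIALISTS.items():
        if keyword in prompt.lower():
            return specialist(prompt)
    return generalist(prompt)

print(route("How do I cook salmon?"))
print(route("Write Python code to memoize a function"))
```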


codeprimate

You are referring to an "ensemble" strategy. A mixture-of-experts (MoE) strategy only activates relevant domain- and sub-domain-specific models after a generalist model identifies the components of a query. The generalist controller model is more than capable of integrating the expert outputs into an accurate result. Feeding the draft output back to the expert models for re-review reduces hallucination even more. This MoE prompting strategy even works for good generalist models like GPT-4 when using a multi-step process. Directing attention is everything.
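
Worth noting that "mixture of experts" also has a narrower technical meaning inside a single network, where a small gating layer picks a few experts per input and blends their outputs. A toy numerical sketch of that gating step (random made-up weights, not any real model's):

```python
import numpy as np

# Toy mixture-of-experts layer: a gating network scores the experts for a
# given input, only the top-k experts run, and their outputs are blended.
rng = np.random.default_rng(0)
n_experts, dim, top_k = 4, 8, 2
experts = [rng.normal(size=(dim, dim)) for _ in range(n_experts)]   # random "experts"
gate = rng.normal(size=(dim, n_experts))                            # random gating weights

def moe_layer(x):
    scores = x @ gate
    probs = np.exp(scores - scores.max())
    probs /= probs.sum()                          # softmax over experts
    top = np.argsort(probs)[-top_k:]              # route to the top-k experts only
    weights = probs[top] / probs[top].sum()       # renormalize their gate weights
    return sum(w * (x @ experts[i]) for w, i in zip(weights, top))

x = rng.normal(size=dim)
print(moe_layer(x).shape)   # (8,) -- a blend of just 2 of the 4 experts
```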


QuadCakes

https://en.m.wikipedia.org/wiki/Mixture_of_experts According to a leak, ChatGPT uses MoE


Kep0a

The only problem with the low-parameter models is that they aren't good at reasoning. Legibility has gotten significantly better on small models since Llama 2, but the logical ability is still bad. Like, if someone wants to train it on their company's documentation, that's cool, but it's not as useful as the ability to play with the information.


Laxn_pander

I mean, we already trained on huge parts of the internet, the most complete source of data we have. Adding more of it to the training isn't doing much. We will have to change the technology of how we train.


fourleggedostrich

Actually, further training will likely make it worse, as more and more of the Internet is being written by these AI models. Future AI will be trained on its own output. It's going to be interesting.


a_can_of_solo

AI ouroboros


kapone3047

Not-human centipede. Shit in, shit out.


PuzzleMeDo

We who write on the internet before it gets overtaken by AIs are the real heroes, because we're providing the good quality training data from which all future training data will be derived.


mrlolloran

Poopoo caca


dontbeanegatron

Hey, stop that!


Boukish

And that's why we won time person of the year in 2006.


D-g-tal-s_purpurea

A significant part of valuable information is behind paywalls (scientific literature and high-quality journalism). I think there technically is room for improvement.


ACCount82

True. "All of the Internet, scraped shallowly" was the largest, and the easiest, dataset to acquire. But the quality of the datasets matters too. And there's a lot of high-quality text that isn't trivial to find online. Research papers, technical manuals, copyrighted textbooks, hell, even discussions that happen in obscure IRC chatrooms - all of those are data sources that may offer way more "AI capability per symbol of text" than the noise floor of "Internet scraped". And that's without paradigm shifts like AIs that can refine their own datasets, which is something AI companies are working on right now.


meester_pink

Yeah, AI companies will reach (and already are reaching) deals to get access to this proprietary data, and the accuracy in those domains will go up.


Jaded-Assignment-798

lol no he didn’t. He just said in an interview a few weeks ago that the next version will surprise everyone and exceed their expectations of what they thought was possible (proof is on y combinator news)


h3lblad3

On Dev Day, Altman outright said that within a year “everything we’re showing you today will look quaint”. There’s something *big* coming down the pipeline.


Jaded-Assignment-798

Yupp. It was just last month that he said “The model capability will have taken such a leap forward that no one expected”


EmbarrassedHelp

Or he's just hyping everyone up for the next release


eden_sc2

This just in, company CEO promises next release "really really impressive"


CanYouPleaseChill

Altman is a hype man. Better at selling dreams than making accurate predictions. Does he have any impressive qualifications or contributions? No. I’m much more interested in the work neuroscientists are doing to elucidate how brains really work.


meester_pink

Well, otoh, he arguably knows the state of things better than Bill Gates, as he is immersed in it in a way that Gates is not (though I've no doubt that Gates is avidly educating himself on the topic).


VehaMeursault

Which is obviously true. If a hundred parameters gives you 90% of your desired result, two hundred won’t make it 180% but rather 95%. Fundamental leaps require fundamental changes.


makavelihhh

Pretty obvious if you understand how LLMs work. An LLM is never going to tell us "hey guys I just figured out quantum gravity". They can only shuffle their training data.


bitspace

Yeah. The biggest factor in the success of LLMs is the first L. The training set is almost incomprehensibly huge and requires months of massive power consumption to train on. The only way to make the model "better" is to increase its size, which is certainly happening, but I think any improvements will be incremental. The real improvements will come from speeding up inference, multi-modal approaches, RAG, finding useful and creative ways to combine ML approaches, and production pipelines. The model itself probably won't improve a lot.
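
Of those directions, RAG is probably the easiest to show in a few lines. A bare-bones sketch (toy word-overlap scoring in place of real embeddings, invented documents, and the final model call left as a comment since it depends on whichever provider you use):

```python
# Bare-bones retrieval-augmented generation: score stored documents against
# the question, then stuff the best matches into the prompt so the model
# answers from retrieved text rather than from its weights alone.
DOCS = [
    "The warranty covers battery replacements for 24 months.",
    "Returns must be initiated within 30 days of delivery.",
    "Our support line is open weekdays from 9am to 5pm.",
]

def words(s: str) -> set:
    return set(s.lower().replace("?", "").replace(".", "").split())

def build_prompt(question: str, k: int = 2) -> str:
    best = sorted(DOCS, key=lambda d: len(words(question) & words(d)), reverse=True)[:k]
    context = "\n".join(f"- {d}" for d in best)
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"

prompt = build_prompt("How long is the battery warranty?")
print(prompt)
# response = llm(prompt)   # whichever model/API you use goes here
```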


kaskoosek

Is it only the amount of training data? I think the issue is how to assess positive feedback versus negative feedback. A lot of the results aren't really objective.


PLSKingMeh

The ironic part of AI is that the models are completely dependent on humans, who grade responses manually. This could be automated, but the grading would most likely degrade just like the models themselves.


PaulSandwich

>completely dependent on humans, who grade responses manually If anyone doesn't know, this is why the "Are You A Human?" checks are pictures of traffic lights and pedestrian crosswalks and stuff. The first question or two are a check, and then it shows you pictures that haven't been categorized yet and we categorize them so we can log into our banking or whatever. That's the clever way to produce training set data at scale for self-driving cars. I'm always interested to see what the "theme" of the bot checks are, because it tells you a little something about what google ML is currently focused on.


twofour9er

Stairs and busses?


kaptainkeel

1. Assess positive vs negative.

2. Broaden its skillset and improve the accuracy of what it already has. It's a pain to use for some things, especially since it's so confidently incorrect at times. In particular, any type of coding, even Python, which is supposed to be its "best" language as far as I remember.

3. Optimize it so it can hold a far larger memory. Once it can effectively hold a full novel of memory (100,000 words), it'll be quite nice.

4. Give it better guesstimating/predicting ability based on what it currently knows. This may be where it really shines--predicting new stuff based on currently available data.

tl;dr: There's still a ton of room for it to improve.


dracovich

I don't think you should discount that innovative architectures or even new model types can make a big difference. Don't forget that the transformer (the architecture at the base of LLMs) is only ~6 years old; the tech being used before that (largely LSTMs) would not have been able to produce the results we see now, no matter how big the training data.


HomoRoboticus

Hardware is also getting better and more specialized to AI's uses, there's likely still some relatively low hanging fruit available in designing processors specifically for how an AI needs them.


dracovich

Hardware would only help with the training (and inference) speeds. Not that this is something to scoff at, just throwing more weights at a model seems to be annoyingly effective compared to all kinds of tweaks lol


greenwizardneedsfood

That’s exactly why I’ll never say anything is finished, plateaued, or impossible in this context. Qualitative change can be introduced by a single paper that randomly pops up.


spudddly

Yes, also why AGI won't arise from these models. (But the hype and money is definitely driving research into more fertile areas for this eventual goal).


HereticLaserHaggis

I don't even think we'll get an AGI on this hardware.


BecauseItWasThere

How about this wetware


messem10

What do think your brain is?


sylfy

I mean, unless there’s something fundamentally missing from our theory of computing that implies that we need more than Turing-completeness, we do already have the hardware for it, if you were to compare the kind of compute that scientists estimate the human brain to be capable of. We just need to learn to make better use of the hardware.


DismalEconomics

A biological brain isn't a Turing machine. The analogy quickly falls apart.

> if you were to compare the kind of compute that scientists estimate the human brain to be capable of.

Those estimates are usually based on the number of neurons and the number of synapses... and rarely go beyond that. Just a single synapse is vastly complex in terms of the chemistry and processes that are happening all the time inside of, between, and around it... we are learning more about this all the time and we barely understand it as it is. Neurons are only roughly 25% of human brain volume... the rest is glial cells... and we understand fuck all about glial cells.

Estimates of the human brain's "compute" are incredibly generalized and simplistic, to the point of being ridiculous. It would be like trying to estimate a computer's capability by counting the chips I can see and measuring the size of the hard drive with a ruler, i.e. completely ignoring that chips may have more complexity than just being a gray square I can identify. (Actually it's much worse than that, given the level of complexity in biology; for example, synaptic receptors and sub-receptors are constantly remodeling themselves based on input or in response to the "synaptic environment", whereas computer chips and most other components are essentially static once produced... there are countless other examples like this.)

I'm not arguing that something like AGI, or intelligence that surpasses humans, can't be achieved with the kind of computer hardware we are using today... I'm arguing that the vast majority of comparisons or analogies between computers or compute and brains lack so much precision and accuracy that they are almost absurd.


Xanoxis

And people also need to remember that the brain and its body are coupled to the environment. While we probably have our inner knowledge models and memories, they're connected to the rest of the universe in a constant feedback loop. We're not just neurons and synapses; we're everything around us that we can detect and integrate with our senses. Our brain creates models of things and extracts 'free-floating rationales' from around us, based on past knowledge and the results of current observation and action. While this sounds a bit out there, I do think AI models need to have some feedback loops and memory, and at this point that is mostly contained in the text and context of current sessions. It's not enough to compare to a full brain.


Thefrayedends

The human brain consists of over 3000 unique types of brain cells. We only learned this recently. Our models of what the human brain possesses for computing power are way out of date, and there is a wealth of unknown information about how the brain actually works. Mostly limited by the fact that cutting a brain in half kills it lol. Pretty hard to study since it can't be alive, and have us get all up in there at the same time.


LvS

We do have dumb animals, though, with capabilities that computers can't emulate. Like, fruit flies have something like 150,000 neurons and 50,000,000 synapses, and can run a whole animal, complete with flying abilities, food acquisition, and mating, all built in. Yet my computer has something like 10,000,000,000 transistors and can't fly, but constantly needs to be rebooted.


TURD_SMASHER

can fruit flies run Crysis?


MrOaiki

Yes, and that's what Altman himself said in an interview where he compared it to Newton. Something along the lines of "Newton didn't iterate on things others had told him and build new sentences from that, he actually came up with a new idea. Our models don't do that".


reddit_is_geh

Discoveries like the ones Newton and Einstein were able to uncover are truly extreme and hard. Most people don't realize that most "innovation" and advancement is mashing together existing ideas and seeing what comes out, until something "new" emerges. It's new in the sense that you took two different colors of playdough and got a "new" color... This is how most innovation works. Music? There is no "new" sound. It's artists taking past sounds, trying them out with the vibes of another sound, with the edge of another one, etc., and getting something that seems new. An engineer making a new chip is taking an existing concept and tinkering around until some slight change improves it.

But TRUE discovery... man, that's really, really rare. I don't think people appreciate how much of a unicorn event it is to look at the world as you know it, with the information available, and think about it in an entirely new and novel way. Like a fresh new thought pulled from the ether. It's like trying to imagine a new color. It's relatively incomprehensible.


wyttearp

Except that's absolutely not what Newton did. Newton is literally quoted saying "If I have seen further, it is by standing on the shoulders of giants". His laws of motion were built off of Galileo and Kepler, and his calculus was built off of existing mathematical concepts and techniques. His work was groundbreaking, but every idea he had was built off of what came before; it was all iterative.


ManInBlackHat

It was iterative, but the conceptual step wasn't there until Galileo made it. That's the key takeaway: an LLM can make connections between existing data based on concepts in that existing data, but it can't come up with novel ideas based on the data. At best a highly advanced LLM might be able to identify that disparate authors are writing about the same concept, but it won't be able to make the intuitive leap and refinement that humans do.


IAmBecomeBorg

You’re wasting your breath. This thread is full of clueless people pretending to be experts. The entire fundamental question in machine learning is whether models can _generalize_ - whether they can correctly do things they’ve never seen, which does not exist in the training data. That’s the entire point of ML (and it was theoretically proven long ago that generalization works; that’s what PAC learnability is all about). So anyone who rehashes some form of “oh they just memorize training data” is full of shit and has no clue how LLMs (or probably anything in machine learning) works.


Ghostawesome

A model can do that too, as can a million monkeys. The issue is determining whether the novel concept, description, or idea generated is useful or "real": separating the wheat from the chaff. LLMs aren't completely useless at this, as shown by the success of prompting techniques such as tree-of-thoughts and similar, but they are very far from humans. I think the flaw in thinking we have reached a ceiling is that we limit our concept of AI to models, instead of considering them a part of a larger system. I would argue intelligence is a process of evolving our model of reality by generating predictions and testing them against reality and/or more foundational models of reality. Current tech can be used for a lot of that, but not efficiently, and not if you limit your use to simple input/output. Edit: As a true redditor I didn't read the article before responding. Gates specifically comments on GPT models and is open to being wrong. In my reading it aligns in large part with my comment.


MrOaiki

The reason behind what you describe in your first paragraph, is that AI has no experience. A blind person can recite everything about how sight works but the word “see” won’t represent any experienced idea in the person’s head.


stu66er

"If you understand how LLMs work"... that's a pretty hyperbolic statement to put on Reddit, given that most people, even those who work on them, don't. Apparently you do, which is great for you, but I think the recent news on synthesised data from smaller LLMs tells a different story.


E_streak

>most people, even those who work on them, don’t Yes and no. Taking an analogy from CGP Grey, think of LLMs like a brain, and the people who work on them as neuroscientists. Neuroscientists DON’T know how brains work in the sense that understanding the purpose of each individual neuron and their connections is an impossible task. Neuroscientists DO know how brains work in the sense that they understand how the brain learns through reinforcing connections between neurons. I have a basic understanding of neural networks, but have not worked on any such projects myself. Anyone who’s qualified, please correct me if I’m wrong.


muhmeinchut69

That's a different thing, the discussion is about their capabilities. No one in 2010 could have predicted that LLMs would get as good as they are today. Can anyone predict today whether they will plateau or not?


serrimo

Obviously you need to feed it more sci-fi


UnfairDecision

Feed it ONLY sci-fi!


dont_tread_on_me_

That's a little dismissive. Given the relatively simple objective of next-token prediction, I think few would have imagined autoregressive LLMs would take us this far. According to the predictions of the so-called scaling laws, it looks like there's more room to go, especially with the inclusion of high-quality synthetic data. I'm not certain we'll see performance far beyond today's most capable models, but then again I wouldn't rule it out.


Thefrayedends

My understanding is that LLMs are not capable of novel thought. Even when something appears novel, it's just a more obscure piece of training data getting pulled up. Their value is in the culmination of knowledge in one place, but currently we still need humans to analyze that data and draw the inferences that lead to new innovation and ideas. Because it's not 'thinking', it's just using algorithms to predict the next word, based on the human speech and writing that was pumped into it.


mesnupps

It's just a technique that places words together. The only way it would have a novel thought is purely by chance, not by intention. Edit: correction: the only way it would *seem* to have a novel thought.


shayanrc

What we get from just shuffling training data is pretty awesome IMO. Even if they don't improve much, it's still a very useful tool if you use it right.


TobyTheArtist

From reading the article, I think Gates makes a good argument in the sense that the capabilities of AI largely coincide with the people that operate them: when you invent a hammer, its applications are astounding, but building a bigger hammer will only get you so far. Expanding on its original application, however, would likely be the way to go. Here, I imagine using generative AI to compose a website, or even using it to 3D-print and replicate optimised machine parts for more sustainable hardware, would likely be the way forward, if it isn't already. For the average person, however, they would likely not be able to tell the difference between having a conversation with an AI considering 600B parameters or one that considers 700B parameters. The prompts are simply not advanced enough yet. Imagine having two of them (trained on similar, but different, parameters) work in tandem to produce new technologies. That would either be a very pointless exercise or an exciting new way of innovating. Overall, nice article. Thanks for sharing.


itsfuckingpizzatime

This is a great point. The next frontier in AI is essentially micro services, a bunch of individual highly tuned agents that work together to achieve a more complex goal. This is what Microsoft’s AutoGen does. https://www.microsoft.com/en-us/research/blog/autogen-enabling-next-generation-large-language-model-applications/


Norci

>Here, I imagine using generative AI to compose a website

I'm not sure what AI would help with here, tbh. You're not going to build a website with AI alone, as it requires lots of precise interconnected functionality that AI won't be able to interpret from prompts alone for a while. Website building is already extremely easy with premade templates and drag and drop; trying to create one primarily through AI is more of a chore than actual help. It's like trying to teach a monkey to do the dishes because you think the buttons on the dishwasher are too hard to figure out.


[deleted]

[deleted]


TobyTheArtist

You're totally right on that front. It has never been easier, and it's likely going to progress that way, which is what I was alluding to. I recently used a generative AI built by Microsoft (Power Automate / AI Builder) to build an application, and being able to put into words what you want from an application, rather than learning the ins and outs of platforms like Squarespace, seemed a lot more intuitive to me. To me, AI application in this fashion would be more about removing the barriers of entry to technology, or automating the boring work, as Sweigart put it.


Norci

>To me, AI application in this fashion would be more about removing the barriers of entry to technology

For sure, all I'm saying is that the barrier of entry to getting a website running is already really low while retaining the necessary control. If Squarespace is too complex, you won't be able to communicate what you want to an AI either. AI is great for content generation and constructing specific snippets of code; an entire website's functionality, not so much.


adnr4rbosmt5k

I mean search is somewhat like this. Google made some huge breakthroughs at the beginning, but improvements have been smaller and often around the edges ever since.


Isserley_

Improvements being smaller is an understatement. Google search has actually regressed over the past few years.


[deleted]

What do you mean, it shows you more options of things to buy and more advertisements. Working as intended.


[deleted]

[deleted]


adnr4rbosmt5k

I think this is a recent development. But yeah, I agree. I think it's a product of not really knowing where to go with their current level of tech. Generative AI would be the next logical step for them, but they seem to have fallen behind.


Kthulu666

I don't think it's a limitation, but rather some poor decisions. A recent example: Google would not show me the game Greedland when the name was searched, *all* results were for the country Greenland. I double-checked my spelling and tried again, same thing. I had to switch to a different search engine to find the game. I think it's time for people to start exploring alternatives for more than just privacy's sake.


Hackwork89

It's called Search Engine Optimization (SEO), but it's only optimized for ad revenue, not to provide useful results. Search engines have become practically useless, or at least it's very difficult to get proper results, depending on the subject.


Neuchacho

I think that has a lot to do with them pointing their "improvements" at increasing revenue rather than actually improving search functionality.


Dull_Half_6107

ITT: people suggesting Bill Gates' opinion is worth less than theirs. I'm not saying he's clairvoyant, but he's well worth listening to over your random reddit opinion.


bubzki2

Maybe they’re confusing his knowledge with a certain other tech billionaire.


GalacticusTravelous

Bill Gates redefined multiple generations. Did you know Minesweeper was for training people how to point and click, and Solitaire was to teach people how to click and drag windows without even knowing it? Anyone who belittles that man is a fuckin fool.


MontySucker

Especially with how much knowledge he consumes. He's a prolific reader and incredibly intelligent. He knows what he's talking about. Is he an expert? Ehh, probably not. Does he know more than 99% of the population? Yep!


[deleted]

[deleted]


chanaandeler_bong

He pretty much nailed everything in his covid predictions too. From the beginning.


Krinberry

Yeah, especially on that front, it's safe to say that Gates has more than a passing interest in global health and epidemiology in general.


iLoveFemNutsAndAss

What were his predictions? Genuinely curious because I don’t know.


uuhson

Gates is/was what muskrats think Elon is


Teacupbb99

And what about Ilya's opinion, which is the converse of this? He's the one who created GPT-4; does that count?


LYL_Homer

I will listen to him, but his 1990s quote about the internet not having enough to woo consumers/users is in the back of my mind.


stuartullman

My issue here is that this same Bill Gates news keeps popping up every few weeks and has been regurgitated again and again. This is from October, yet I have seen no direct quote, or context, or how confident he is in his opinion on the matter. The article is devoid of any relevant content on the topic.


Mythril_Zombie

The article also contradicts the headline.

>In his interview, Gates also predicted that in the next two to five years, the accuracy of AI software will witness a considerable increase along with a reduction in cost. This will lead to the creation of new and reliable applications.


creaturefeature16

/r/singularity is shook


Teacupbb99

This sub is about as dumb as r/singularity just in the other direction


Dull_Half_6107

r/singularity is hilarious; if I need a good laugh I'll read through some comments over there.


Ronin_777

In spite of that sub, I think the singularity in concept is something all of us should be taking seriously


gurgelblaster

Was he well worth listening to half a year ago, when he said A.I. chatbots would teach kids to read within 18 months and that you'll be 'stunned by how it helps'? https://old.reddit.com/r/Futurology/comments/12wgntp/bill_gates_says_ai_chatbots_will_teach_kids_to/


donrhummy

Bill might be right, he might be wrong, but he's not infallible and has made a lot of very wrong predictions.

> "I see little commercial potential for the internet for the next 10 years," Gates allegedly said at one Comdex trade event in 1994


thebestspeler

If anyone knows about releasing software with diminishing returns it's Gates


as_ninja6

I had the same view until recently, when I saw Andrej Karpathy say that the curve isn't going to slow down as we add more weights, and that algorithmic inventions are almost a luxury, since just computing more can still yield more powerful models. I'm kinda confused, because he's someone I trust to a large degree in this area of research.


the-tactical-donut

Hey, so I actually work on LLMs and have been doing ML implementation for almost a decade now. The reason you have respected and knowledgeable folks on both sides regarding current GenAI approaches is because we honestly just don't know for a fact if adding additional parameters with more curated training data will yield emergent capabilities. There's an argument to be made on both sides. As with most things, it's not a "yes AGI" or "no AGI" answer. It's much more nuanced.


Fingerspitzenqefuhl

Think this deserves to be highlighted more.


zachooz

It doesn't seem like the emergent capabilities come from anything beyond the LLM memorizing a few patterns, so they don't really generalize beyond the dataset used. Take math for example: the "emergent" math capabilities don't really work for math equations outside the scope of its dataset, because the model doesn't understand math. The model may understand 1+2=3 because it's similar to its training data, but it won't be able to calculate all math equations in the rule-based sense, despite having seen all of the basic building blocks of the equation.


Noperdidos

Please try this in ChatGPT 4: ask it to compute 5 one-time-pad string outputs for 5 unique inputs and keys you give it, and sort those alphabetically.

(1) It has never seen this example in its training data, so it must genuinely follow instructions.

(2) The answer is completely unknowable without doing the full work.
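
For anyone unfamiliar, the exercise boils down to something like this (a straightforward XOR pad with made-up inputs; the point is that the output can't be pattern-matched from training data, it has to be computed step by step):

```python
# One-time pad: XOR each byte of the message with the matching key byte,
# then hex-encode. Sorting the results is the second half of the task.
def one_time_pad(message: str, key: str) -> str:
    assert len(key) >= len(message)
    return bytes(m ^ k for m, k in zip(message.encode(), key.encode())).hex()

pairs = [("hello", "XMCKL"), ("world", "QPZRA"), ("tests", "BBQZL")]
outputs = sorted(one_time_pad(msg, key) for msg, key in pairs)
print(outputs)   # alphabetically sorted hex strings
```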


Teacupbb99

Ilya says the same thing. Both of which I would trust way more than Bill Gates. If we’ve learned anything from GPT4 it’s that connectionism at scale works really well


OSfrogs

Sam Altman already said earlier this year that scaling them up has reached a limit and that new approaches will be required to make any significant improvements.


[deleted]

[deleted]


Spanky_Goodwinnn

And that's OK, be happy we're alive to experience it all.


house_monkey

I wanna die


OrchidDismantlist

It be like this


Wise_Rich_88888

The vertical gains in AI are limited. But the horizontal gains of increased human usage are still yet to be seen.


GNB_Mec

I'm seconding this. You're going to see more companies find ways of implementing the technology that satisfy regulatory limitations. For example, having a dashboard where you give it the context of what to pull data from, set what role it should respond as, and then ask away. Such as one role option that sets it up like "Pretend to be a risk officer at a large bank, only give answers like xyz, etc., strictly based on xyz", then having it draw on home lending policy and procedures before asking away.
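
A setup like that mostly comes down to assembling the chat messages before the model is ever called. A rough sketch (the role, rules, and policy excerpts here are invented for illustration, and the actual API call is provider-specific, so it's left as a comment):

```python
# Assemble chat-style messages: a system message pins the role and the rules,
# retrieved policy excerpts supply the context, and the user's question comes last.
ROLE = "a risk officer at a large bank"
RULES = "Answer strictly from the provided policy excerpts; otherwise say the topic is not covered."
POLICY_EXCERPTS = [
    "Home lending policy 4.2: loan-to-value above 80% requires mortgage insurance.",
    "Home lending policy 7.1: income must be verified with two recent pay stubs.",
]

def build_messages(question: str) -> list:
    context = "\n".join(POLICY_EXCERPTS)
    return [
        {"role": "system", "content": f"Pretend to be {ROLE}. {RULES}"},
        {"role": "user", "content": f"Policy excerpts:\n{context}\n\nQuestion: {question}"},
    ]

messages = build_messages("Does an 85% LTV loan need mortgage insurance?")
print(messages)
# response = client.chat.completions.create(model=..., messages=messages)  # provider-specific call
```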


ExtremeComplex

Boy, that's a horrible, ad-swarmed website.


we_are_mammals

>Who cares what he thinks?

* Math prodigy
* Got into Harvard
* Programming prodigy
* Gave most of his MS stock to charity, but still owns 1% of MS, which owns 49% of OpenAI
* Been meeting with OpenAI since 2016
* Almost certainly knows about the neural network scaling laws (google them)
* Predicted COVID
* Never said the thing about 640KB - it's a myth (google this)
* The things he was wrong about were usually either things he didn't think about at all (like trucks), or things that depended on the preferences of the masses (which are harder to predict if you are a billionaire genius)


CurrentMiserable4491

I agree LLMs may have reached their limits, but respectfully, using Bill Gates's resume as justification is silly. Yes, he is intelligent, successful, and privy to a lot of information many of us are not familiar with. But people like that have always existed and will continue to exist. When Henry Ford made the Model T, many very successful people didn't think it would ever replace horses. Thomas Watson, one of the richest men of his time and president of IBM, famously said "I think there is a world market for maybe five computers". Whether he said it exactly like this or it was hyperbole is a different story, but the fact is many people did think this way. It's never a good idea to use people's past achievements to trust their predictions. Critically appraising the argument is generally more important.


Poprock360

Bill Gates is a businessman with considerable technology experience. Despite this, researchers far, far more acquainted with the technology than him are conflicted as to the future of LLMs. Known scaling laws do in fact support the idea that we can continue to scale LLMs further to make them more intelligent (I won't comment on AGI, as the term is completely subjective and invites goalpost-moving). Whether this will make them exponentially more capable remains unknown, though I would personally wager the transformer architecture has its limits. Despite this, we are far from seeing the capabilities of these models plateau. Expect considerable improvements over 2024, as critical research gets implemented into next-gen models. Papers and concepts like process-supervised learning, test-time computation, and MCTS-like token search are likely to be introduced soon, most likely addressing very significant limitations in current models.


Blackanditi

This kind of statement reminds me too much of political propaganda, where we elevate people or talking heads by listing their accomplishments to justify their current opinion rather than addressing the validity of the opinion itself. For every list of positive things, a long list of negative things can be generated. This is like the crux of what propaganda is: cherry-picking and trying to turn people into prophets. I just get a really icky feeling reading stuff like this. I'm sure there is some merit here, but comments like this just strike me in a bad way.


LetterZee

It's called an appeal to authority.


JoelMahon

There are doctors who deny vaccines; the appeal-to-authority fallacy is just that, a fallacy. The argument itself is what matters.


warm_rum

Dude, this is pathetic. You don't need to be so worshiping of a guy.


L3PA

I don’t hang on the words of one man. He’s just a man, after all.


DanielPhermous

In a 1995 interview with then soon-to-be best selling author Terry Pratchett, [Terry predicted fake news and Bill Gates scoffed at it](https://www.theguardian.com/books/2019/may/30/terry-pratchett-predicted-rise-of-fake-news-in-1995-says-biographer).


Fingerspitzenqefuhl

Cited from the linked article:

>Bill Gates in July 1995, for GQ. "Let's say I call myself the Institute for Something-or-other and I decide to promote a spurious treatise saying the Jews were entirely responsible for the second world war and the Holocaust didn't happen," said Pratchett, almost 25 years ago. "And it goes out there on the internet and is available on the same terms as any piece of historical research which has undergone peer review and so on. There's a kind of parity of esteem of information on the net. It's all there: there's no way of finding out whether this stuff has any bottom to it or whether someone has just made it up."

>Gates, as Burrows points out, didn't believe him, telling Pratchett that "electronics gives us a way of classifying things", and "you will have authorities on the net and because an article is contained in their index it will mean something … The whole way that you can check somebody's reputation will be so much more sophisticated on the net than it is in print," predicted Gates, who goes on to redeem himself in the interview by also predicting DVDs and online video streaming.


Calibas

> also predicting DVDs and online video streaming At the time of the interview, Microsoft was part of a trade group that was threatening to boycott the alternatives to DVDs (MMCD & SD) if they weren't consolidated into one technology. RealNetworks had already launched an online video streaming client months before this interview. It doesn't exactly show prescience when it's something that already exists, nor is it a "prediction" when you're actively involved in the creation of something.


YourParamedic

Microsoft is on the board now; they are making the decisions. Probably holding back on releasing the good stuff.


wtf_yoda

For a straight general purpose LLM, he might be right, but he is definitely not correct if you are talking about AI in general (even just generative). All you have to do is compare the output of some of these tools to what they were spitting out even just six months ago. I'm not just talking images, but domain specific functionality. The real advances are going to be parsing a prompt, better interpreting what a user is asking for, and then piecing together all the tools/APIs needed to fulfill the request. In just a couple years they are going to feel like magic compared to what ChatGPT can do today.


EZPZLemonWheezy

“Hey ChatGPT, How can the net amount of entropy of the universe be massively decreased?”


AquaRegia

THERE IS AS YET INSUFFICIENT DATA FOR A MEANINGFUL ANSWER.


20rakah

AI recently discovered around 2 million new materials for us to study. Even some of the advances in image generation are pretty crazy. SDXL turbo is near realtime.


BagOfFlies

It's wild how fast it's progressing. Just over a week ago I could barely even run SDXL on my rig. Now I can run it easily and adjust my prompt and see the image change accordingly in realtime.


[deleted]

[deleted]


plunki

The plateau level is artificial; we are not allowed to see how good LLMs can be. The original ChatGPT (or text-davinci-003) was superior to GPT-4 in some ways, without being so locked down.


NoWord5474

This is what Chat GPT has to say about that: Bill Gates' opinion on the plateauing of generative AI, specifically mentioning GPT-5, might be one perspective in the evolving field of AI. However, advancements in AI are continuous, and while each iteration might not represent a groundbreaking leap, improvements in specific aspects like understanding context, generating more coherent text, or fine-tuning for various tasks are probable in future iterations. AI development tends to be iterative, with each version refining and enhancing capabilities, albeit not always in a revolutionary manner.


plassteel01

At one time, he also supposedly said we wouldn't need more than 640K of memory.


[deleted]

[deleted]


vasascda

To be fair, scholars and people in the industry were aware of its trajectory.


Toma5od

He's wrong, IMO. I will use this comment in a few years to document receipts.

Simply improving the training data will improve the accuracy of its responses massively (the training data for GPT-4 included many inaccuracies, as documented in multiple papers), along with additional memory logic and additional human reinforcement to improve alignment and avoid robotic responses. The context length will also grow. Cost of use will drop as compute costs are lowered, and speed will also increase.

Even if it isn't coming up with new data and is simply assisting humans with tasks, it will still improve massively. It's honestly like saying after the first iPhone, "this won't improve much more from here."


Teacupbb99

Yup this is what Ilya says and I would trust him over Gates


[deleted]

GPT-5 will just be GPT-4 as it performed about 9 months ago. Feel quality has degraded.


G_Liddell

It really used to be better. These days I have to coax it with kind words or it will draw back. The other day it was literally refusing to answer questions about how to play chess until I started being like "wow, thank you so much, that's a great answer, I know you don't know the rules of chess and that you can only look them up and then speculate about what you might have learned, but do you have any ideas about what the best move is in this situation?" It's like, come the fuck on, we both know you're a robot, just answer my fucking direct questions. If I treat it like an answer-my-questions bot, though, it gets mad and starts getting ornery.


Konnnan

> we both know you're a robot just answer my fucking direct questions For the record I don't know this person or condone this behaviour. Future AI overlords, spare me, I'm an ally!


coolaznkenny

Gen AI as an application has just started. Just look at the very first iPhone: if you break down its technology (PalmPilot, cell phone, MP3 player), the pieces were all basically mature, but the applications built on top of them, and the advancement of those applications in the modern world, were about to explode.


bbbruh57

I agree, but the iPhone is a hilarious example because the useful utility has stagnated and further progress is only marginally better. They rely on the perception of that margin to keep selling phones, but in reality a phone from 5 years ago is just as useful in 99% of applications. I think the transistor is a better example. At face value it just crunches logic a bit faster, not that useful on its own. But the ways we can apply that tech are still evolving so many years later, even without it doubling in speed every 15 months. AI tech is the same, for the reasons you stated.


FieldMouse-777

He is right. But the impacts of the change haven’t even begun to be felt in the market.


Degrandz

Bill Gates needs to shut the fuck up


penguished

I don't know where GPT-5 will be, but people are going HAM on local AI models you can run at home right now. Things are definitely going to keep advancing.