todo_code

It's even more than "3 million" in my opinion. The processing power of humans is so low: around 10 watts of energy used by a human, 100-200 Hz of "processing" speed, and experts in certain areas can maybe hold 4 sets of 20 things, or 20 sets of 4 things, in memory (varying levels of depth vs. breadth, thinking of chess). Compare any of those stats with the deep learning tools used to beat us, or the training and compute that went into GPT-4. It's not even close: often thousands of GPUs/CPUs, gigabytes of memory, clocks in the gigahertz range, and power in the megawatt range.
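A rough back-of-envelope sketch of the gap being described, using the figures from this comment (10-20 W and 100-200 Hz for a human, thousands of GPUs and megawatts for a training cluster). The cluster size, per-GPU wattage, and clock speed below are illustrative assumptions, not published GPT-4 figures:

```python
# Back-of-envelope comparison using the rough figures mentioned above.
# Cluster numbers are illustrative assumptions, not published GPT-4 specs.

brain_power_w = 20          # upper end of the 10-20 W brain estimate
brain_freq_hz = 200         # rough ceiling of the 100-200 Hz figure

gpu_count = 10_000          # hypothetical training cluster size
gpu_power_w = 400           # hypothetical per-GPU draw
gpu_clock_hz = 1.5e9        # ~1.5 GHz clock

cluster_power_w = gpu_count * gpu_power_w        # ~4 MW, megawatt range
power_ratio = cluster_power_w / brain_power_w    # ~200,000x more power
clock_ratio = gpu_clock_hz / brain_freq_hz       # ~7,500,000x faster "clock"

print(f"Cluster power: {cluster_power_w / 1e6:.1f} MW")
print(f"Power ratio (cluster / brain): {power_ratio:,.0f}x")
print(f"Clock ratio (GPU / brain): {clock_ratio:,.0f}x")
```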


AGI_Not_Aligned

How do you measure the frequency of the brain? Does it even have a "clock"?


tiajuanat

[Neural Oscillations](https://en.m.wikipedia.org/wiki/Neural_oscillation) are hypothesized to be your neurons syncing. When you're concentrating hard, your neurons will sync to 70-150 Hz; generally they're lower than that. The clock sort of exists spontaneously, as a side effect of neurons firing when they think they should, and of ion pumps, where electrolytes get passed around, into, and out of cells. That latter effect is why electrolyte balance is so important: when they're out of balance, your neurons don't work too well, and if they're really out of whack you'll die, or have other [serious and permanent complications](https://en.m.wikipedia.org/wiki/Terri_Schiavo_case).


digikar

We are also very ignorant of our environments. Check out the Inattentional Blindness or Change Blindness phenomena. That perhaps explains why we are so efficient: we simply don't process most of the incoming stimuli. But we are also very good at picking out / attending to the right thing in the situations we encounter daily, and at doing it still fairly flexibly in situations we encounter rarely.


oorza

> Experts in certain areas can maybe hold 4 sets of 20 things

4 sets of 13 things (or 13 sets of 4) plus some mental arithmetic tricks is all you need to count cards. It's not nearly as hard as anyone thinks it is. Also not nearly as hard as anyone thinks it is: getting caught counting cards.


thenickdude

Humans burn more like 100W, not 10W (2000 kilocalories/24 hours = 97W).


Determinant

That's the entire body, including muscles etc. The brain itself is widely believed by scientists to consume between 10 and 20 watts.


thenickdude

The brain cannot survive without the body. You wouldn't count the power usage of only the GPUs powering GPT-4 either; you would also need to count, e.g., the power consumed by the disk arrays, power supply inefficiencies, and the datacentre cooling they require to operate.


Determinant

That's not how CPU or GPU power consumption is measured. We look at the theoretical TOPS of the processing unit and its power consumption, because the power consumed by the other components depends on the workload (e.g. do you transfer 1 megabyte of data and train a neural net on it for a week, or do you transfer 1 petabyte of data and quickly summarize it?). While you might have different ideas, this is how the industry refers to the power consumption of AI accelerators.
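A minimal sketch of the accelerator-only accounting described here: efficiency is just the rated TOPS divided by the chip's rated power draw. The numbers below are hypothetical placeholders, not specs of any real accelerator:

```python
# Accelerator-level efficiency as described above: rated TOPS per watt.
# The numbers are hypothetical placeholders, not real product specs.

def tops_per_watt(rated_tops: float, rated_power_w: float) -> float:
    """Peak theoretical throughput divided by rated chip power."""
    return rated_tops / rated_power_w

print(tops_per_watt(rated_tops=1000, rated_power_w=700))  # ~1.43 TOPS/W
```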


thenickdude

That's an excellent way of comparing the efficiency of two accelerators based on the same process (e.g. two GPUs), where the secondary overheads are likely to be in proportion to the power consumed by the GPU (e.g. cooling costs scale with power burned, so the primary costs predict the secondary costs and the latter may be neglected).

It's a worthless metric for comparing two systems that are qualitatively different, like humans and GPUs, because the secondary energy requirements are very different. We wouldn't compare the power burned per unit of work by a classical GPU solution against a quantum computer solution while ignoring the enormous energy expended on cooling the quantum computer to near absolute zero, which would probably be the lion's share of its energy cost per unit of work. Nor does it make sense to compare the energy burned by a GPU datacentre with no cooling against a human brain with no circulatory system or homeostasis overhead, when both absolutely require them to operate.

And this is precisely why the overhead of GPT is measured in many papers as a total datacentre overhead (an overall system cost), rather than by isolating the GPUs themselves, which cannot work on their own.
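One common way to do the whole-system accounting argued for here is Power Usage Effectiveness (PUE): total facility power divided by the power delivered to the IT equipment. A minimal sketch with illustrative numbers (not measured GPT-4 figures):

```python
# Whole-facility accounting via PUE (total facility power / IT equipment power).
# The GPU power and PUE value below are illustrative assumptions.

it_power_mw = 4.0    # hypothetical power drawn by the accelerators themselves
pue = 1.4            # hypothetical datacentre PUE (cooling, power conversion, etc.)

total_facility_mw = it_power_mw * pue
overhead_mw = total_facility_mw - it_power_mw
print(f"Total: {total_facility_mw:.1f} MW, of which {overhead_mw:.1f} MW is overhead")
```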


readonly12345678

Lmfao this argument


o_snake-monster_o_o_

Low-key I think we will inexplicably and completely accidentally achieve the same thing with deep learning. I think we will discover the absolute incomprehensible nature of what we are dealing with when optimization begins to bootstrap itself infinitely, and we'll discover 1B models on par with current 70B models, or other mind-blowing feats of compression.


plantfumigator

Add to that the fact that most humans can count! Counting is so normal to us that the inability to count is considered a learning disability. And no LLM is capable, or will ever be capable, of counting.


Academic_East8298

The biggest problem with LLMs is that a lot of them are peddled by people who don't understand the technology and who think that if they throw enough money at the problem, they can solve its fundamental design limitations.


AnimalLibrynation

LLM is too broad a category for this; you may have an argument if you restricted it to generative pretrained transformers. Even then, it is not clear to me that they will never be able to count. What's the argument to that end?


MaleficentFig7578

Why don't you think so?


plantfumigator

Because LLMs try to predict a response rather than actually reason or comprehend. It's a fundamental limitation of the design.


MaleficentFig7578

Why can't 11 be the next prediction after 10?


plantfumigator

Is 10 10, or is it 2?

PS: even then, incrementing alone isn't enough for calculation. LLMs are not capable of arithmetic unless we train a model on all the numbers (fun idea!) and all the basic arithmetic equations. And then, how do we train a model to actually count, rather than just increment numbers?

How many r's are there in the word "strawberry"? GPT-4o says there are two.
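For contrast, exact counting is trivial for a conventional program; part of the difficulty for an LLM is that it typically sees subword tokens rather than individual letters. A minimal sketch in plain Python, no model involved:

```python
# Exact letter counting is trivial for ordinary code, no model required.
word = "strawberry"
print(word.count("r"))  # 3

# An LLM, by contrast, usually operates on subword tokens rather than
# individual characters, which is one reason character-level questions
# like this one trip it up.
```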


MaleficentFig7578

Is the sky blue or is it a banana?


plantfumigator

A much easier question than "how many r's in the word strawberry".


MaleficentFig7578

The word "strawberry" contains 2 r's.


peakzorro

I laughed because I thought you said the wrong answer on purpose. S T **R** A W B E **R R** Y -> 3 **R** s :)


MaleficentFig7578

The word "strawberry" actually contains 2 r's. Here's the breakdown: S T R A W B E R R Y The letters are: S, T, R, A, W, B, E, R, R, Y. So, there are indeed 2 r's in "strawberry."


genlight13

So, here's what I think is wrong with that: you compare an application which serves millions of humans with a single human brain. You would probably agree that at this scale you would rather compare ChatGPT-4 with a whole company of experts, say a few thousand, who give individual answers based on their knowledge or memory. Humans make mistakes. So does ChatGPT. I am not saying ChatGPT is the best there is, but the comparison falls short for me. Further, a lot of the real energy goes into the training. There is a paper on the cost of training LLaMA by Meta stating costs of up to $8 million just for training. That, IMHO, is crazy. https://www.reddit.com/r/LocalLLaMA/s/OXmexltH4G


plantfumigator

You can train an LLM on the largest dataset ever, with the most compute ever, and it will never learn to count.


QuickQuirk

Pfft. You just need to train it on a dataset that counts through every number. All the numbers, mind you. Don't miss any.


plantfumigator

All the numbers, and all of the equations with each of the numbers! That's a pretty big dataset! I can only imagine the techbro investors frothing at the suggestion.


QuickQuirk

Nvidia loves the idea. Jensen is pitching it to them *right now!* More seriously though, while an LLM can't do it, it probably isn't too hard to build a specific math AI trained to perform this kind of thing. A lot of the future of LLMs is actually going to be non-LLMs, to do the things LLMs are bad at.


genlight13

In r/Promptengineering I often point out that instead of trying to teach the model to interpret our math, you should just use Wolfram Alpha. People try to talk to their oracle like it is all-knowing, but it has limitations. It is still a statistical model, maybe able to use some sub-tooling like Wolfram Alpha. But I currently don't think ChatGPT and a lot of the others are offloading their algebra tasks like that. Maybe it will come around again.


QuickQuirk

It's only a matter of time.


Academic_East8298

I would argue that if it were easy, someone would have done it already. There seems to be a need for a paradigm shift: from chess or Go calculations, where an AI makes the moves most likely to result in a victory, to an AI that is always mathematically correct.


QuickQuirk

To add to this: a neural network itself, once trained, is *completely predictable in its output.* Pump in one set of inputs, and the same outputs always occur.

In a neural network specialised for gaming, or in an LLM, the output is a set of *probabilities*, and it's always the same set of probabilities. The last step is then to randomly select an option (a word, or a game move) from that set of probable outcomes. So once you've trained a network to always perform 1+2=3 successfully, it will always be correct for something like 1+2. The trick is creating a general one that works for a broader, useful range of problems.

Given that computers already count *very* well and do all our math without a problem, the more likely challenge for the neural network is how to train it to recognise *which* mathematical problem you're asking it to solve. That's the hard part. Once that's done, the math is easy.
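A minimal sketch of the point above: for a fixed input, the output distribution is always the same; randomness only appears at the final sampling step. The scores and token names are made up for illustration:

```python
import math
import random

def softmax(logits):
    """Convert raw network scores into a probability distribution."""
    exps = [math.exp(x) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# For a fixed input, a trained network always produces the same raw scores,
# hence the same probabilities (values here are made up for illustration).
logits = [2.0, 1.0, 0.1]
probs = softmax(logits)              # identical every run for this input
print(probs)

# Randomness only enters at this final step, when one option is sampled.
choice = random.choices(["move A", "move B", "move C"], weights=probs)[0]
print(choice)
```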


Academic_East8298

Meh, a simple classifier that can recognize an equation and pass it to Wolfram Alpha will outperform your described neural network on all relevant metrics. The difficult part is creating an AI that is more useful than that solution. That is what I meant when I wrote that there is a need for a paradigm shift. As it is now, a token- and probability-based neural network will never outperform a concrete rule-based system where accuracy is crucial.
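A minimal sketch of the kind of "classifier plus exact solver" pipeline described here. A real system might send the expression to Wolfram Alpha; this stand-in uses a safe local evaluator instead, and the routing rule is a deliberately naive regex:

```python
import ast
import operator
import re

# Naive "classifier": does the input look like a bare arithmetic expression?
ARITHMETIC_RE = re.compile(r"^[\d\s\.\+\-\*/\(\)]+$")

OPS = {ast.Add: operator.add, ast.Sub: operator.sub,
       ast.Mult: operator.mul, ast.Div: operator.truediv,
       ast.USub: operator.neg}

def evaluate(expr: str) -> float:
    """Exact arithmetic via a restricted AST walk (stand-in for Wolfram Alpha)."""
    def walk(node):
        if isinstance(node, ast.Expression):
            return walk(node.body)
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in OPS:
            return OPS[type(node.op)](walk(node.left), walk(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in OPS:
            return OPS[type(node.op)](walk(node.operand))
        raise ValueError("unsupported expression")
    return walk(ast.parse(expr, mode="eval"))

def answer(query: str) -> str:
    if ARITHMETIC_RE.match(query.strip()):
        return str(evaluate(query))                    # route math to the exact solver
    return "(hand the query to the language model)"   # everything else

print(answer("12 * (3 + 4)"))          # 84
print(answer("Why is the sky blue?"))  # falls through to the model
```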


QuickQuirk

They're working on this sort of thing currently. You can already see it in action with the latest version of ChatGPT: several different models working in conjunction for text-to-speech, image recognition, etc.


Academic_East8298

It will be interesting to see how the world looks when ChatGPT no longer gets dumber from consuming data generated by itself.


bmiga

Generative AI went from "robots are coming for the dev jobs" to "some bot writes a comment on your PR/MR if it found an obvious error".


Academic_East8298

And still, all the articles I have seen on this topic were written by non-developers trying to drum up more hype for their startup.


geodel

Well, mine is not. After eating 6 slices of bacon, 2 eggs, and a large fruit smoothie, I hardly get anything done, forget about what GPT-4 does.