https://twitter.com/random_walker/status/1791109550894178450 We are likely already in a plateau; no talk of GPT-5 or Gemini 2 etc. is a canary in the coal mine. No amount of nuclear power or GPUs can change the outcome.
When I heard that you could get LLMs to regurgitate some of their training data verbatim, I figured we had passed the point where additional parameters were likely to help. The big question IMO isn't how much bigger we can make the models, it's how much smaller we can make them without losing capability. If we teach them to memorize more stuff, we're just building search systems with a natural language interface. But if we teach them to pack the same training data into a smaller network, without losing capability... now we can do more with less. I like the direction Microsoft is going with Phi. Not saying Phi itself is better (or even equal), just saying that this is a more promising direction to investigate.
Arvind usually has good takes about ML. I agree with him about this - GPT-4o might be the best model out there, but it still makes the same mistakes on inputs that can fool even the smallest LLMs. Despite that, though, LLMs have a weird kind of intelligence that is very different from humans, and there's going to be a lot of new products that can come out of that.
Indeed, Arvind is great. I'm really puzzled about what sort of products are going to benefit from LLMs the most, but stocks like NVDA, MSFT, and Google are priced as though they are going to run flights to the moon.
The best analogy I heard is that LLMs are more like microprocessors than anything else, and are a foundational technology that enables things to be built on top of it. If we went back to the 1970s and told microprocessor architects that they'd be used in your car, your watch, and headphones, they'd laugh at you. It remains to be seen what kinds of technology people will build on LLMs.
Groq is enough; it is 10 times faster than any GPU I have seen.
Can we add the option of Meta Llama 3?
I think the next things that will follow within a few months/years are:
1. Video generation with voice
2. A breakthrough in input text limits
3. LLMs that can run on low-power devices
4. Technologies that protect content generated by LLMs, like we have for games and digital media (DRM etc.)
5. A breakthrough in robotics. Just provide the mechanical information to the LLM (motors, parts, how they are connected) and the model will take care of moving the robot on its own.
I think it's just the beginning.
I think people are just overhyping the limits of LLMs. The biggest problem with LLMs is their tendency to produce similar results. We are humans, not machines; we get bored of similar things. It's just a matter of time before the organisations that are investing heavily in machine learning ("AI" is just a bullshit coined-up term) fail to cover the cost. That will force those organisations to cut costs, which will blow up the pump NVIDIA is getting, and sooner or later this hype will fade away.
LLMs are good search engines.
The competition isn't about making the models larger anymore. Now they're trying to get similar value out of smaller models, making them compact enough to run on small devices/cellphones. Even after that, there will be more ground to cover, and these companies will stay a main part of the game.
Can someone explain to me why we need models to run on small devices/cellphones when these devices are connected to the Internet? What's wrong with running these models server-side?
The cost to run an LLM and serve it to a big audience is huge right now. If LLMs can run on-device, we can serve millions with a smaller infrastructure bill.
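To make the trade-off concrete, here is a back-of-envelope sketch. Every number in it (user count, tokens per user, price per million tokens) is a made-up assumption for illustration, not real pricing from any provider:

```python
# Hypothetical figures only -- none of these numbers come from a real provider.
users = 1_000_000
tokens_per_user_per_day = 2_000
server_cost_per_million_tokens = 1.00  # USD, assumed

# Server-side serving: the provider pays for every generated token.
daily_server_cost = (
    users * tokens_per_user_per_day / 1_000_000 * server_cost_per_million_tokens
)
print(f"server-side: ${daily_server_cost:,.0f}/day")

# On-device serving: the marginal cost per token to the provider is ~0;
# they pay only for model distribution and updates, and the user's
# battery/hardware absorbs the inference cost.
```

With these assumed numbers the server bill is $2,000/day for a million light users; scale the per-user token count up to a chat-heavy workload and the incentive to push inference on-device becomes obvious.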
I do not think LLMs will ever be anywhere near "real" intelligence. Enjoy using them for what they are good for, like getting images for my children to color, or translating, but I think they are waaay overhyped.
Damn. I really disagree. Customer service over the phone, for example, is done for. Are you trolling?
Caveat: Canadian perspective; may not apply in other legal jurisdictions. Customer service will change (and likely shrink), but it's not done for - not over the phone, or anywhere else. The LLM propensity to hallucinate is a severe legal liability, and has been demonstrated as such in court. If a customer service bot invents a refund policy that doesn't exist, the company has to honour it.
Right now pigeons with tiny brains are smarter than LLMs with racks and racks of GPUs. It is not the GPUs/compute but rather the architecture. We are going to need the next breakthrough architecture after "Attention is all you need". Only the peel is left after squeezing all the juice out of that one.
Expecting to reach AGI with just GEMM is not particularly intelligent.
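For anyone unfamiliar with the jab: GEMM is the general matrix-matrix multiply at the heart of transformer inference. A minimal NumPy sketch (shapes and random weights are arbitrary assumptions, not any real model's) shows that single-head self-attention from "Attention is all you need" boils down to a handful of GEMMs plus one softmax:

```python
import numpy as np

# Toy, assumed dimensions -- not taken from any real model.
rng = np.random.default_rng(0)
seq_len, d_model = 8, 16

x = rng.standard_normal((seq_len, d_model))     # token embeddings
w_q = rng.standard_normal((d_model, d_model))   # query projection
w_k = rng.standard_normal((d_model, d_model))   # key projection
w_v = rng.standard_normal((d_model, d_model))   # value projection

q, k, v = x @ w_q, x @ w_k, x @ w_v             # three GEMMs
scores = q @ k.T / np.sqrt(d_model)             # another GEMM (scaled dot-product)

# Row-wise softmax -- the only non-GEMM step in the whole block.
weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
weights /= weights.sum(axis=-1, keepdims=True)

out = weights @ v                               # final GEMM
print(out.shape)                                # (seq_len, d_model)
```

That is the commenter's point: nearly all the FLOPs (and therefore the GPU demand) go into these matrix multiplies, and whether stacking more of them yields general intelligence is exactly what's in dispute.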
We need a better model of intelligence. Reading a bunch of data and finding patterns is not sufficient to achieve AGI unless you have infinite amounts of data and computational power.
No!