It depends on what you want to do, which model you run, and how much it has been optimized or quantized.
A lot of LLM stuff that seemed pretty amazing a few years ago - chatbots and the like that respond to questions in plain language - can run on comparatively light hardware. Coding agents can take more, but could also be optimized for a particular language and still spit out useful snippets.
Image stuff can be pretty complex, especially at higher resolutions and detail, and creating seamless video segments gets expensive on hardware, fast.
Quite true. The thing is, there aren’t billions and billions of dollars in chatbots. The billions are for the creative stuff and the code.
And that is where the reckoning / correction will come from: the bill has to come due eventually. When top end generative AI starts to have a real cost associated with it, it’s no longer a blanket ‘everyone start using this immediately’ mandate. It prompts some consideration of cost versus output quality.
Even then, that’s quite small. Top of the line frontier models would be looking at hundreds of gigabytes of video memory, and just as much RAM.
A terabyte of combined VRAM and RAM for something like Copilot is probably a fairly sensible estimate.
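For a rough sense of where figures like these come from, here is a back-of-the-envelope sketch in Python. It only counts the memory needed to hold the weights (parameter count times bits per weight) and ignores the KV cache, activations, and runtime overhead, so real usage will be higher. The parameter counts are hypothetical round numbers picked for illustration, not the published size of any particular model.

```python
def weight_memory_gb(num_params: float, bits_per_weight: int) -> float:
    """Approximate memory for the model weights alone, in gigabytes (10^9 bytes).

    Assumption: every weight is stored at the same precision. Excludes the
    KV cache, activations, and framework overhead.
    """
    bytes_total = num_params * bits_per_weight / 8
    return bytes_total / 1e9


# Hypothetical model sizes, chosen only to show the scaling.
examples = {
    "small chatbot (7e9 params)": 7e9,
    "large model (400e9 params)": 400e9,
}

for name, params in examples.items():
    for bits in (16, 8, 4):
        gb = weight_memory_gb(params, bits)
        print(f"{name} at {bits}-bit: ~{gb:.1f} GB for weights")
```

Under these assumptions the small model drops from about 14 GB at 16-bit to about 3.5 GB at 4-bit, which is why quantized chatbots fit on consumer hardware, while the large one needs hundreds of gigabytes at any of these precisions.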