https://linktr.ee/tomawezome Donations:

  • BTC: bc1qu73wa69ey6f4qjhpg0sdtkxhusvtf68946eg6x
  • XMR: 4AgRLXVNgMhTWsEjEtZajtULPi6964nuvipGXc6eNyFhWF9CSm7rRpFWQru8hmVzCkS5zBgA2ehhcbk86qLxM9MZ5pTEgYb
Cake day: October 31st, 2024


  • I run TinyLLM on a separate computer with 64GB RAM and a 6-core/12-thread AMD Ryzen 5 5500GT, using the rocket-3b.Q5_K_M.gguf model, and it runs very quickly. Most of the RAM is used by other programs I run on that machine; the LLM doesn’t take the lion’s share. I used to self-host on just my laptop (a 5+ year old ThinkPad with upgraded RAM), and it ran OK with a few models, but after a few months I saved up to build a rig just for that kind of stuff to improve performance. It’s all CPU, no GPU, even though a GPU would be faster, since I was curious whether CPU-only would be usable, which it is. I also use the Llama-2 7b model or the 13b version; the 7b model ran slowly on my laptop but runs at a decent speed on the bigger rig. The fewer billions of parameters a model has, the goofier it gets. Rocket-3b is great for quickly getting an idea of things, but not great if you want to copy-paste its answers as-is. Llama 7b or 13b is a little better at handing you almost-exactly-correct answers. I think those models are meant for programming, but sometimes I ask them general life questions or vent to them, and they receive it well and offer OK advice. I hope this info is helpful :)
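
For a rough sense of why these models leave most of a 64GB machine free, here is a back-of-the-envelope sketch in Python. It is only an estimate: the 5.7 bits-per-weight figure for Q5_K_M is an assumed average, not something from the comment above, the parameter counts are the rounded nominal sizes, and the helper name `approx_weights_gb` is made up for illustration. Actual usage will be higher once context (KV cache) and runtime overhead are included.

```python
# Assumed average bits per weight for a Q5_K_M quantization.
# Real GGUF files vary a bit depending on the model architecture.
ASSUMED_BITS_PER_WEIGHT = 5.7


def approx_weights_gb(params_billions: float,
                      bits_per_weight: float = ASSUMED_BITS_PER_WEIGHT) -> float:
    """Approximate size of the quantized weights in GB (10^9 bytes).

    params_billions * 1e9 weights * bits_per_weight / 8 bytes, divided by 1e9.
    """
    return params_billions * bits_per_weight / 8


# Nominal (rounded) parameter counts for the models mentioned above.
for name, size in [("rocket-3b", 3), ("Llama-2 7b", 7), ("Llama-2 13b", 13)]:
    print(f"{name}: ~{approx_weights_gb(size):.1f} GB of weights")
```

Under these assumptions, even the 13b model's weights come to well under a quarter of 64GB, which fits the observation that the LLM is not what fills up the RAM.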