
MoE, but in the 120B range. Man, I wish it was an 80B. I have 2 GPUs with 62 GiB of usable VRAM. A 4-bit 80B leaves me some room for a context window, but a 120B pushes me into system RAM.
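For anyone who wants to check the arithmetic behind that, here is a rough sketch of the weight footprint at 4-bit. The 62 GiB budget and the 80B/120B sizes come from the comment; the 4.5 bits-per-weight figure is only an assumed allowance for quantization scales and metadata, and KV cache and activations are not counted at all.

```python
GIB = 2**30

def weight_gib(params: float, bits_per_weight: float) -> float:
    """Approximate size of the quantized weights alone, in GiB."""
    return params * bits_per_weight / 8 / GIB

VRAM_GIB = 62  # usable VRAM quoted in the comment

for params in (80e9, 120e9):
    # 4.0 is the nominal width; 4.5 is an assumed overhead-inclusive figure
    for bpw in (4.0, 4.5):
        size = weight_gib(params, bpw)
        headroom = VRAM_GIB - size
        print(f"{params / 1e9:.0f}B @ {bpw} bpw: {size:.1f} GiB, headroom {headroom:+.1f} GiB")
```

Under these assumptions the 80B model leaves roughly 20 GiB or more for context, while the 120B model either barely fits or does not fit before any context is allocated.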



Either some Q3 quant or, since it's a MoE, maybe a REAP-pruned version at Q4 might work (or could be terrible, I'm not sure about REAP'd models).
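To put rough numbers on those two options, the sketch below inverts the size formula: given a VRAM budget, what bit width fits a 120B model, and how many parameters fit at a 4-bit-class quant. The 8 GiB reserved for context and the 4.5 effective bits per weight are assumptions picked for illustration, and it assumes pruning simply lowers the total parameter count. It says nothing about output quality.

```python
GIB = 2**30

VRAM_GIB = 62        # usable VRAM quoted upthread
RESERVE_GIB = 8      # assumed reserve for KV cache and activations
PARAMS = 120e9       # full model size
Q4_BPW = 4.5         # assumed effective bits per weight for a 4-bit-class quant

# Total bits available for weights after the reserve is set aside
budget_bits = (VRAM_GIB - RESERVE_GIB) * GIB * 8

# Option 1: keep all parameters, lower the bit width
max_bpw = budget_bits / PARAMS

# Option 2: keep the 4-bit-class quant, reduce the parameter count
max_params = budget_bits / Q4_BPW
prune_fraction = 1 - max_params / PARAMS

print(f"max bits per weight at 120B: {max_bpw:.2f}")
print(f"max params at {Q4_BPW} bpw: {max_params / 1e9:.0f}B "
      f"(prune about {prune_fraction:.0%})")
```

With these particular assumptions the budget works out to a bit under 4 bits per weight for the full model, or a pruned model of a little over 100B parameters at the 4-bit-class width.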


