MogVM *vm = mog_vm_new();
There’s a lot going on in this one; I really think you’re going to like it.
My first instinct was creativity. I had models generate poems, short stories, and metaphors: the kind of rich, open-ended output that feels like it should reveal deep differences in cognitive ability. I used an LLM-as-judge to score the outputs, but the results were pretty bad. I managed to fix the LLM-as-judge with some engineering, and the scoring system turned out to be useful later for other things, so here it is:
Playing Against TeXCCChess
What do you think? Rate it!
Photo from the launch event. Image source: provided by the client.