>>39482
Sure, I'd love to elaborate.
>And the size / intelligence ratio will be better by then, so I could train it for cheap.
Basically, I'm waiting for some sort of breakthrough that makes inference and training much cheaper at the same computational power. Maybe the diffusion LLM project, or RWKV, or something completely different.
If a 1B model performs as well as the 12B model, I could train it overnight on my GPU.
I'd use that to update the personality bias and the memory-layer LoRAs without lobotomizing the model. Or someone comes up with a neural network architecture that really learns during inference and isn't just fancy autocomplete like transformers.
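Roughly why LoRA updates would be cheap enough for overnight training, as a toy sketch (the layer size and rank below are made-up examples, not any actual model's):

```python
# Toy illustration of why LoRA updates are cheap: instead of retraining a full
# d_out x d_in weight matrix, you train two low-rank factors B (d_out x r) and
# A (r x d_in), and the effective weight is W + B @ A with W frozen.

def lora_param_counts(d_out: int, d_in: int, r: int) -> tuple[int, int]:
    """Return (full fine-tune params, LoRA params) for one linear layer."""
    full = d_out * d_in
    lora = d_out * r + r * d_in
    return full, lora

full, lora = lora_param_counts(4096, 4096, 8)  # hypothetical layer, rank 8
print(full, lora, lora / full)  # LoRA here trains well under 1% of the weights
```

Same idea scales to the whole model: only the adapter weights get gradients, so the memory and compute budget stays small.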
>Instead of the usual reasoning, it comes up with the whole unconscious thought process itself.
This one is interesting. There is a guy who already does something like this.
https://github.com/yukiarimo/yuna-ai
He trained his own model to generate different kinds of data depending on the situation.
><yuki>: User's dialogue
><yuna>: Companion's dialogue
><hito>: Other peoples' dialogue in the same conversation
><qt>: Internal thoughts and feelings
><action>: Function calls and actions
><data>: Embedded data or information
Talked to him, nice but weird dude. This is like the reasoning of modern models, but for different aspects of world building, and the tags are additional special tokens. Usual models only have system, assistant, and user.
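Rough sketch of how a transcript with those tags could be segmented; the tag names are from the yuna-ai scheme, but the parsing logic is just my guess, and in the real model they'd be single special tokens rather than regex-matched strings:

```python
import re

# Split a tagged transcript into (role, content) pairs. The tag set mirrors
# the yuna-ai scheme above; the "<tag>:" text format is a hypothetical
# plain-text stand-in for the actual special tokens.
TAGS = ("yuki", "yuna", "hito", "qt", "action", "data")
TAG_RE = re.compile(r"<(" + "|".join(TAGS) + r")>:\s*")

def parse_transcript(text: str) -> list[tuple[str, str]]:
    parts = TAG_RE.split(text)
    # parts = ["", tag1, content1, tag2, content2, ...]
    return [(parts[i], parts[i + 1].strip()) for i in range(1, len(parts) - 1, 2)]

sample = "<yuki>: How was your day? <qt>: She sounds tired. <yuna>: Long, but good."
print(parse_transcript(sample))
# [('yuki', 'How was your day?'), ('qt', 'She sounds tired.'), ('yuna', 'Long, but good.')]
```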
In theory, a model could be trained to approximate the logic of all my agents. You start off with the dialogue, and the model dynamically generates the thoughts, goals, and emotional impact of the new input on the fly. This means I wouldn't have to make sooo many prompts with specialized agents just to get one specific output (such as valence or anxiety delta as JSON).
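The JSON path I mean looks something like this; the field names (valence, anxiety) are the ones from my setup, but the merge logic is a made-up sketch:

```python
import json

# One specialized agent emits state deltas as JSON, and the loop merges them
# into the companion's running emotional state. A hypothetical example of the
# per-agent plumbing that a single tag-emitting model would replace.

def apply_deltas(state: dict[str, float], deltas: dict[str, float]) -> dict[str, float]:
    """Add each emitted delta onto the current emotional state."""
    updated = dict(state)
    for key, delta in deltas.items():
        updated[key] = updated.get(key, 0.0) + delta
    return updated

state = {"valence": 0.2, "anxiety": 0.5}
emitted = '{"valence": 0.3, "anxiety": -0.1}'  # what the agent would return
print(apply_deltas(state, json.loads(emitted)))
```

Multiply that by one prompt per agent per turn and you see why a single model that emits the tags itself would be much cheaper.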
I decided not to go any further with this, because I can't into math, and even if I manage to make progress, some Chinese dude will have it perfected by the time my code works.
In the meantime, I'm getting into holographic waifus. Bought a Quest 3 and now I'm researching SLAM and segmentation.