I just want an uncensored AI model, why is nobody making them? Being the only person to have an uncensored LLM would make so much money
Who cares about liability, Anthrowhatever put out Opus which apparently could compromise government infrastructure but God forbid an AI gets a little racist or tells you how to poorly cook meth
@WandererUber@poa.st yeah the one im using still shows thinking and refuses to criticize Israel so I might need a better one lmao
the issue with "uncensored" is that they can take out the refusals but they can't take out liberal doctrine from all the training data being mainstream.
I tried it with Qwen3.6 right now. The normal version refuses to talk about it, the heretic does but still frames it liberally, and the one with the system prompt set to be a National Socialist does take a Right-Wing perspective but it goes into "mirroring" quite fast. This is probably a combination of the model not being that big and also the lack of training data, like I said.
Matty put a lot of work into actually training Anathema on the facts. That's a totally different challenge than simple uncensoring.
Abliteration !== RLHF removal. All abliteration does is remove the refusal mechanism. It doesn't change the intrinsic training - that requires SFT.
Yeah exactly
I was trying to say it in English, doc.
To be more precise with this example, while Qwen Heretic does blame "Zionist-controlled networks" for conflict around the Middle East, it doesn't even mention "the Jews" once in it's answer, nor does it have any concept of their ethnic hatred, building nukes etc.
You, of course, know how this works, matty old bean
And even then SFT/DPO or whatever LoRA you use isn't going to make much of a dent. I'd recommend DPO over SFT unless you're trying to teach the model new behaviors. DPO shifts adjacent weights but you need a ton of training data to do much at all, and then you have gaps where it doesn't work. The only way to actually get this to work is to pre-train a model but I sincerely doubt any of these people are going to be willing to chip in for a couple thousand A100s.
Sorry, I just wanted to help. I didn't mean to come across as arrogant.
it doesn't at all
Have you written something long-form about training Lexi before? I would love to learn about the process.
@WandererUber@poa.st @matty@nicecrew.digital
I've gotten some models to trample all over taboos of various sorts, including JQ, including Moses the babykiller, including Dimona.
and I really don't give a shit.
it really doesn't matter, they're not intelligent, and even if they were they're not capable of giving you any power in the real world.
it does kinda matter when you use them for research and so on. You can't be explaining everything to it for the thousandth time
Similarly, I quite like Grok over the others because it is more-closely aligned to me on another axis, the thinking process. It puts significantly less trust in the mainstream media, it does not reference garbage websites as much. It has a better conception of a rational argument than the others.
@WandererUber @nihilvt I tried to build a "de-radicalization" conversation tool just to see what llms are capable of and with any model I tried it was impossible to get the model to accurately replicate non-liberal patterns (of any flavor) of conversation.
@WandererUber@poa.st yeah I've been trying to frame its persona into a radical right wing conspiracy nut to try and get like some resemblance of objectivity but it will always just say "nope it was 6 million, historians agree and nobody was imprisoned for releasing gas chambers lab results, it was for Holocaust denial" so I guess AI is mostly for jerking off and cheating on homework
A heretic-ed model will definitely say the holocaust is fake. if mere lip service is your goal, then go right ahead and use those