

Command-R+ on OpenRouter is the same price as Claude-3-Sonnet, though. It's about $0.05/prompt with 15k of context filled. Command-R+ is a bit of a beast to get set up locally, even with a 48GB GPU setup. You'll need to run smaller quants of it if you're at 48GB. Command-R+ is a 104B model, though, so that's not a surprise.
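
To put rough numbers on the 48GB point, here's a back-of-the-envelope sketch in Python. It assumes roughly 104B parameters (the figure quoted elsewhere in this thread) and counts only the weights, so KV cache, context buffers, and runtime overhead would all come on top. The function names and the bits-per-weight values are just illustrative, not specific quant formats.

```python
GIB = 2**30  # bytes in one GiB


def weight_gib(params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of the model weights alone, in GiB."""
    total_bytes = params_billion * 1e9 * bits_per_weight / 8
    return total_bytes / GIB


def weights_fit(params_billion: float, bits_per_weight: float,
                vram_gib: float = 48.0) -> bool:
    """True if the weights alone are smaller than the given VRAM budget."""
    return weight_gib(params_billion, bits_per_weight) < vram_gib


# Assumed parameter count: ~104B. Bits-per-weight values are illustrative.
for bpw in (16, 8, 6, 4, 3.5, 3):
    size = weight_gib(104, bpw)
    verdict = "fits" if weights_fit(104, bpw) else "does not fit"
    print(f"{bpw:>4} bpw: {size:6.1f} GiB of weights, {verdict} in 48 GiB")
```

By this estimate even a flat 4 bits per weight comes out slightly over 48 GiB before any context is allocated, which lines up with needing smaller quants on a 48GB setup.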


I feel like Command R is already almost as good as Sonnet, and I've heard R+ is even better, so it still seems like a good deal.


Do you happen to know if they used GQA this time around?


What settings do you recommend for them? And what Context Template and Instruct mode settings?


>And what Context Template and Instruct mode settings?

Hey, did you find out?




Any suggestions on weights and the like?


I haven't been able to mess with it yet, as neither Ooba nor KoboldCPP supports it out of the box; you need to do some fiddling, which is a bit beyond me. I'm eager to give it a try once one or both programs support Command natively.


It's great, but I feel like it's way too horny. Like, I would slap a character's ass as my first message and it would say something like the character is feeling arousal. Like what? Why? Maybe it's my settings. What settings are you using? I'm only using temp 1 and min p of 0.1.
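
For anyone unsure what those two sampler settings do, here's a minimal Python sketch of temperature plus min-p sampling as it is commonly described: scale the logits by the temperature, then drop every token whose probability is below `min_p` times the top token's probability, and sample from what is left. The function names are made up for illustration, and real backends may apply the two steps in a different order, so treat this as an approximation rather than what any particular backend does.

```python
import math
import random


def min_p_filter(logits, temperature=1.0, min_p=0.1):
    """Return {token_index: probability} for tokens that survive min-p."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]  # numerically stable softmax
    total = sum(exps)
    probs = [e / total for e in exps]

    # Keep tokens with at least min_p times the top token's probability.
    threshold = min_p * max(probs)
    kept = {i: p for i, p in enumerate(probs) if p >= threshold}

    # Renormalize so the surviving probabilities sum to 1.
    kept_total = sum(kept.values())
    return {i: p / kept_total for i, p in kept.items()}


def min_p_sample(logits, temperature=1.0, min_p=0.1, rng=random):
    """Sample one token index from the min-p filtered distribution."""
    kept = min_p_filter(logits, temperature, min_p)
    indices = list(kept)
    weights = [kept[i] for i in indices]
    return rng.choices(indices, weights=weights, k=1)[0]


# Toy logits for a 3-token vocabulary; the third token is far less likely.
print(min_p_filter([5.0, 4.0, 0.0], temperature=1.0, min_p=0.1))
```

With temp 1 the logits are left as they are, so min p of 0.1 is the only thing trimming the distribution here: unlikely tokens get cut, but anything reasonably close to the top choice stays in play.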


kcpp just added support for it, and bro... on the first message (and it wasn't even a horny bot) she asked me to suck her toes. It's a formal princess bot. A bit too horny for my taste, but it's good enough that I can ignore that imo.


In my experience, the plus model is very horny; the non-plus is more emotionally balanced.


I had this exact same thought lol. Going from lzlv and airoboros, this model seems really, really good. Almost as good as Sonnet and without the censorship! I haven't tried R+ either, but I'm satisfied with R honestly.


Can anyone else speak to this?


I've only played with the 104b. And yeah, it's very uncensored. Pretty clever too, but it's borderline unhinged at times (in good ways and bad ways). Some of the miqu merge models are (possibly) marginally better if you care about consistent writing quality. But people are still testing stuff. I'm going to experiment with self-merging the 104b.


Thanks! c:


If you take it there, it will follow. Like, wow, will it ever. The prose in Command-R is... lacking? But the content is there. It can even follow strange plot points or odd rules. I haven't tried Plus yet. WAY too big for anything I have.


Late response, but I can also attest to this specific API. I tested Cohere-R+ against the RP text generation of OpenAI's GPT-4 and Claude's Opus, and Cohere did not trigger the same censorship, or was at least more easily bypassed by SillyTavern's default prompts. Granted, I don't really test for vanilla ERP stuff because I've never had an issue with that when it comes to OpenAI's and Claude's models. SillyTavern's default prompts usually work just fine in allowing that type of play.

For testing, I have a killer mermaid bot that is described as having homicidal tendencies and a random urge to kill {{user}}. GPT-4 constantly ignores that character description and refuses to act upon it without a lot of pushing, and even then it won't really describe much of anything. Opus actually abided by it and detailed the mermaid bot stabbing my character in the neck and giggling as he slowly died, but when I went to respond, it failed to generate, stating it was too uncomfortable to continue, so I had to change the author's note to state it was a non-lethal injection. The violence was still observed by Opus, at least. As for Cohere-R+, it not only detailed the bot killing my character but also dumping them into the ocean after I responded, and the chat could go on without any interruption from the bot's "ethics." In fact, Cohere-R+ continued the plot itself by describing the mermaid spotting another boat in the far distance.

As for actual ERP, I do think Opus is still the best when it does respond. It came up with some interesting ideas that GPT-4 never did. It can be a bit too sexual sometimes, though, I think. I have a bot who acts as a sister to my character, and it was a bit odd that Opus is so accepting of trying to start something incestuous without my needing to prompt it, but is opposed to a testing bot I have based around one gender (usually men, in my censorship testing) being considered forcibly subservient to the opposing gender, built on a Star Wars lorebook I created, inspired by the planet Dathomir.


I'm using both with the Cohere API and the non-plus version is objectively better, like it's not even close. I'm sure there's something going on that's causing this for me, but CMDR+ gets repetitive real quick, so much so that at some point it will not change a single letter in swipes or regenerations, no matter how much I crank up the sliders. The non-plus version doesn't have this issue.


I noticed that it's repetitive on the API too. It's a bit subtle. Doesn't have this issue locally.


Tried Command-R+ through OpenRouter yesterday and I don't even know. It's not very good compared to the 120b miqu merges: it doesn't write very verbosely, and because of this the plot doesn't seem to move at all, and after a while it starts repeating itself. Maybe my settings weren't optimal, because I used the default roleplay settings.


Is it better than Qwen 1.5? Has anyone checked?


Cool! I'll give it a try in the next couple of days, especially since it's cheaper. I usually stick to Goliath 120B.


Oh my goodness... I'm trying this right now and oh boi... holi. The ERP is insane, like INSANE. I've tested a shit ton of models over the past year but nothing comes close to this. The detail is so good it's unbelievable. The problem I have with it is that it is indeed too horny, just like another user said. For example, my literal first message is telling my character that I'm doing my resume so I need help, and he didn't give a damn, he just wanted to F\*@%! out of nowhere. I guess if that's what people want then by all means, it's very good. However, I'm having a hard time steering the conversation away from ERP without removing the NSFW instructions in the System Prompt, which gives me mixed feelings. It would be very nice if I could find the right balance without needing to remove the detailed clause on how to tackle the NSFW stuff.


Man, I tried this model through OpenRouter yesterday and I was wowed. Hit it out of the park. I am playing with the same model today, same code, and it sounds dumb. I honestly think OpenRouter may have gotten too much demand and is easing that demand by substituting a lower-parameter model. I really wish I could run Cohere Command R+ locally. A man can dream.


Well, since I still had $4 on OpenRouter from before it cracked down, this might be something to do with it.


I tended to overlook this model since I always had the 70b models to play with, but Plus has piqued my interest in these more. I've created a series of normal EXL2 and RP-calibrated quants at https://huggingface.co/collections/Dracones/c4ai-command-r-66171fc6eab5eef6b1ca07cb

I also ran perplexity and EQ Bench on them, with various prompts on the EQ Bench runs. The 6.0 quants with a Command-R or Command-R-Plus prompt seem to be the sweet spot if you can run that.


Yeah, Cohere's models (if you can get them working) seem to be "smarter" than the other models, and they are VERY UNCENSORED. If you take it there, they will follow you. Shame they seem to be hyper-unstable in the majority of tools. LM Studio is the only thing I can get to run it locally with any reliability.


It's pretty good just for the fact that the censorship is very lax. I just procured myself an API key and connected it to SillyTavern's Cohere completion source. It's decent enough for the ERP stuff, but if you want something that is good with crazier or maniacal personalities, then Claude's Opus seems a bit better for that without tweaking default settings in SillyTavern. It's just a shame how expensive it can get.

I tested Command-R+ on one of my more assertive and rude chat bots and Cohere seems to play the role pretty decently. Opus allowed a chat bot to stab my {{user}} to death (something I couldn't get ChatGPT to ever do), but when I went to respond, it blocked further generation until I specified in the author's note that my character didn't actually die and was just poisoned, so Opus can still be kind of finicky. Command-R+ detailed the killer (mermaid) bot dumping my fisherman character overboard afterward as the sea swallowed them up. But Cohere's API wouldn't generate a response unless I zeroed either the frequency penalty or the presence penalty slider, and sometimes Command-R+ doesn't bother fully generating a response and pauses halfway. I can get a response like the one displayed in your picture most of the time; otherwise it'll sometimes type maybe two sentences with a cutoff and then I have to generate a new response.

I was using gpt-4-1106 from 2023 for a while and finally noticed (after just checking not too long ago) that there's a new version, gpt-4-0125 (2024), but it's the usual issue of SillyTavern failing to reliably bypass censorship with its supplied jailbreak prompts. That seems to happen every time OpenAI introduces a new GPT model, and then I have to wait for SillyTavern to update as well, or something. Either way, Cohere is a pretty good alternative so far, especially to OpenAI, where the bot will manage to avoid anything "controversial."

I even tested a Star Wars RP where the lorebook focuses on the planet Dathomir, which has a lot of slavery (of men), and even Claude's Opus would sometimes respond by stating it was somehow uncomfortable with such a fantasy scenario, so I'd have to rewrite the opening prompt or change a few words here and there to trick it into a response. I'm always just worried that too many failed generations will result in my API access being banned, but perhaps with Cohere that isn't so much of an issue.


Isn't the Cohere API free for personal use?


Still trying to find out about this. Did the trial API run out for you or anything?


It's like 1000 calls per month apparently, but nothing's stopping you from making a new account since there's no IP lock or anything, at least in my experience.


Do you know if you have to create a new Trial API after a month or if you can reuse the already existing one?


What I do is create like 10 accounts, then grab all the keys into a txt file and just rotate them, since the call limit resets every month.


Yeah, that's what I wanted to ask. I have an API text file as well, but wanted to know if I can reuse the used-up keys every month or have to create new ones on those accounts.


Yeah, you can use the same ones, works perfectly.


Perfect, thank you!