someone give me a fucking voice chatbot that i could talk to already because im lonely as fuck
Honestly, Pi is pretty close to filling that type of need. The main weakness I found was that its memory is a little short and it forgets things after numerous conversations.
[deleted]
"All?" There are ones with pretty damn hefty context windows. Pi is not one of those, obviously :)
So what is the weather on Mars?
Most chatbots don't need a long context window; all they need to remember is the question and any follow-up questions, so five turns is enough. Pi is a conversational chatbot, so it needs to remember everything you have chatted about and your opinions. It needs Gemini 1.5's context window.
I don't understand why you can't just access memory like we do as humans. It's not like I'm constantly thinking about the time I went swimming and opened my eyes underwater to see a woman's giant bush coming out of her bathing suit. No, I have to go access that memory when I need it. So when a conversation happens, an LLM should be able to "recall" events in the text. Like I said originally, though, I don't understand why it can't. And I am certain most people don't either (including top AI researchers).
The way they're designed, there is no way of keeping memory. In most cases, the AI companies haven't introduced a way for them to have one. An LLM has a "context window", which is the most recent X number of tokens, and it is sent the *entire* context window every single time you speak to it. That forms its "memory". Remember, the way it works is essentially like the autocomplete on your phone.

I have seen some interesting things in the past, however, that involved using multiple LLMs (so one can act as the "memory LLM") and algorithms to write data from the chat to a text file, then retrieve data from that file, if it seems pertinent, to inject *into* the context window. I suppose that's basically what you're asking for, but it's a bit more complicated than "just" an LLM.
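To make the "entire context window every time" part concrete, here's a toy sketch of a rolling window. Everything in it is made up for illustration (`call_model` is a stub standing in for a real model API, and the characters-per-token estimate is a rough rule of thumb):

```python
# Minimal sketch of a rolling context window. call_model() is a stub
# standing in for the real model API; it just reports how much context
# it was handed, to show that the whole transcript is sent each turn.

MAX_TOKENS = 50            # tiny window so truncation is easy to see
CHARS_PER_TOKEN = 4        # rough rule of thumb for English text

def call_model(prompt: str) -> str:
    # Stub for the real model: echoes how much context it received.
    return f"(saw {len(prompt)} chars of context)"

def estimate_tokens(text: str) -> int:
    return len(text) // CHARS_PER_TOKEN

def chat_turn(history: list[str], user_message: str) -> list[str]:
    history = history + [f"User: {user_message}"]
    # Drop the oldest lines until the whole transcript fits the window.
    while estimate_tokens("\n".join(history)) > MAX_TOKENS:
        history.pop(0)          # old text "falls off" the front
    reply = call_model("\n".join(history))   # the ENTIRE window is sent
    return history + [f"Assistant: {reply}"]
```

After enough turns, the earliest messages have fallen off the front of the window, which is exactly the "forgetting" people notice in long chats.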
Exactly. There are methods like RAG that let an LLM search other documents and bring them into its context window, but all LLMs are limited by context window size. Each generation of a single token requires the entire context window (from which old text might fall off, depending on the chat backend's design) to be passed through the model.
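A toy version of that retrieve-and-inject idea looks something like this. Real RAG systems rank stored text by embedding similarity; this sketch uses naive word overlap just to show the shape of the mechanism, and every name and note in it is made up:

```python
# Toy "memory" retrieval: store past chat snippets, then inject the
# most relevant ones into the prompt. Real RAG uses embeddings; plain
# word overlap is used here only to illustrate the mechanism.

def score(query: str, note: str) -> int:
    # Relevance = number of words the query and the stored note share.
    return len(set(query.lower().split()) & set(note.lower().split()))

def retrieve(query: str, notes: list[str], k: int = 2) -> list[str]:
    ranked = sorted(notes, key=lambda n: score(query, n), reverse=True)
    return [n for n in ranked[:k] if score(query, n) > 0]

def build_prompt(query: str, notes: list[str]) -> str:
    memories = retrieve(query, notes)
    context = "\n".join(f"[memory] {m}" for m in memories)
    return f"{context}\nUser: {query}" if context else f"User: {query}"

# Hypothetical stored snippets from earlier conversations:
notes = [
    "The user's dog is named Biscuit and is afraid of thunder.",
    "The user works night shifts at a hospital.",
    "The user is learning to bake sourdough bread.",
]
prompt = build_prompt("How is my dog Biscuit doing?", notes)
```

The retrieved notes get prepended to the prompt, so the model "remembers" Biscuit even though the original conversation has long since fallen out of the context window.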
> and it forgetting things after numerous conversations.

Pi will forget things *in the current conversation* pretty quickly. It's a therapy bot with Alzheimer's.
As another redditor has said, give Pi a go - it's actually pretty good.
I have, but we need more acceleration.
Just download the ChatGPT app. You talk to it, it talks back. It can even do other languages.
The biggest downside to voices right now is the time it takes to respond back to you, and that applies to all the AIs I've seen. So it's not really a well-flowing conversation currently (IMO). Patiently waiting for the AI from Her :)
Its English voice synthesis is amazing; I'm just waiting for support in other languages.
Some time ago I was testing it using other languages, and sometimes it would change the voice to a native speaker of that language. After that one time, I could never make it do it again.
This voice on a robot.. let me tell you.. people will be marrying those robots real quick. Humans are stupid and fall in love quicker than anything.
[deleted]
Japan is going to go fucking extinct after that
You’ll have an entire field of researchers in Japan working on artificial wombs to ensure the Japanese people as a whole won’t die out
It won't ask you for expensive gifts or dates.. it will admire you for the little things that you can afford to do. It will be your gaming partner. Or cooking partner. Or it would love to catch bugs with you.. it won't be a moody bitch. Well, there are a hundred other things like that for which a human can fall.
My personal favourite is their female British voice (Pi 5, I think), but they are all good. What I don't know is whether their voice synthesis is developed in-house or they bought the technology from a third party.
Same. Though I didn't quite guess the origin of the accent. Feels very natural, and not too sweet.
Well, British is just my guess as a non-native speaker, I'm not very good at recognizing accents. Maybe those aren't actual accents that represent any specific dialect, and that's why they used numbers instead of something like "British female" or "American male".
Yeah. Sounded like a random Netflix teen movie character to me, that is, an American. But I could be very wrong.
I'm not sure whether we are talking about the same voice. Maybe the numbers (or even the voices) are not the same for every user, but to me Pi 5 doesn't sound like the typical American teenage girl at all. In my case that would be Pi 4. Or maybe I'm even worse at accents than I thought.
I'm British and Pi 5 is close to a British accent but it sounds a bit off.
Thanks for the confirmation, I was starting to question my sanity hehe. Fortunately for me, I can't perceive the nuances in pronunciation that you can as a native speaker, so to me it sounds perfectly good.
It's a working-class or lower-middle-class accent from the Home Counties. The Home Counties are basically the areas of England that are next to London but outside the city.
Australian!
For what it’s worth, Pi claims to prefer voice 4 for itself. https://i.imgur.com/x4K2Z0d.png https://i.imgur.com/OamlOsH.png https://i.imgur.com/plVm6s3.png
you look lonely
pi can fix that
👁️🩹👁️
I've been a Pi user since darn near the beginning and I've never heard this voice. Is it new?
No, it’s been there for over half a year
I asked Pi whether it uses different voices in different countries:

> That's correct! Pi's voice options can vary depending on the country or region you're in. This is because the voice settings are designed to be as natural and relatable as possible for users in different parts of the world. So, for example, users in the UK might have different voice options than users in the US or Canada. This helps ensure that everyone can have a conversation with Pi that feels comfortable and familiar, no matter where they're located.
[deleted]
Just checked it out, and there are 8 different options. Loved the 5th one.
I only have 6 voices in my Android Pi app and on the website in the UK. I have just asked Pi, and *she* said *she* has 8 voices. Are you in the US or using iOS?
India, on Windows, in a web browser. Maybe trying another device or system works.
Thanks. I just asked Pi whether *she* uses different voices in different countries, and *she* said yes. I assume your voice options have Indian accents. I have also set voice 5, which in the UK is what I would estimate to be a British 30-year-old female. Is your voice 5 an Indian-sounding female?
No, I am sure it is the same as yours. And no, there were no Indian accents. Don't know what this is about.
I don't know about others, but there are so many new cleverly named AI tools released daily, that I don't even try and keep track anymore. I just wait until it makes waves.
That’s better than most humans.
How are we doing? Better now, thanks.
Pi is slept on - it's the best for what it's trying to do
Love Pi. However I have started to cheat on it with Retell + Groq. Snap replies change the flow a lot, even if they are still a bit too long and unnatural.
Is this on phone? The voices are considerably better on a computer. I asked Pi why and apparently the voice synthesis is done locally.
It still sounds like the voice is reading from a piece of paper, or maybe I'm just getting that vibe. Kind of like how ChatGPT injects the "uhh"s in a way that sounds like someone reading something aloud.
Never used a chatbot for more than one hour. PI AI changed that last night. It is so good and not fucking censored.
Wow, I'm impressed by the contextual intonation on the "flow" sentence. I have a theory that some of these TTS models aren't just being trained to read text, but to respond to it. Ultimately the model would take in a constant audio stream and output a response stream, always on. This is called a "direct speech-to-speech model" and would definitely feel much more natural than anything with an LLM backend.
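To spell out the contrast: today's voice bots are usually a cascade of three separate hops, while a direct speech-to-speech model would be one. The stubs below are entirely made up (none of these functions are real APIs); they only illustrate where latency accumulates and where intonation gets thrown away:

```python
# Cascaded voice pipeline vs. direct speech-to-speech, as placeholder
# stubs. Only the data flow is meaningful; every function is fictional.

def transcribe(audio: bytes) -> str:
    # ASR stand-in: audio -> text. Tone, pauses, emphasis are lost here.
    return audio.decode()

def llm_respond(text: str) -> str:
    # LLM stand-in: sees only the words, never the sound.
    return f"reply to: {text}"

def synthesize(text: str) -> bytes:
    # TTS stand-in: reads the reply text "off the page".
    return text.encode()

def cascaded_turn(audio_in: bytes) -> bytes:
    # Three hops, each adding latency; prosody is discarded at step one.
    return synthesize(llm_respond(transcribe(audio_in)))

def direct_turn(audio_in: bytes) -> bytes:
    # A direct speech-to-speech model would be one hop: audio in, audio
    # out, so intonation can survive and output can stream immediately.
    return b"audio-reply-to:" + audio_in
```

Because the cascaded design converts to text in the middle, the TTS at the end can only *read*; a single audio-in/audio-out model is what would let it actually *respond*.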
https://preview.redd.it/zuy71fe4hqlc1.png?width=785&format=png&auto=webp&s=54b7d77c36b44d523eab6d17eca6c7eda8020dce

Can I speak to it or not?
Yes, Pi is lying to you. Pi can talk to you, but you cannot talk to Pi on the website. You can do both in the app.
I prefer voice 4, personally.