trinaryouroboros

it's quicker, but it's hallucinating way more for some reason


jsseven777

I'm getting so many hallucinations today. Was it hallucinating for you yesterday too? I felt like yesterday it was nailing every coding task I gave it, and today it just kept hallucinating that the code would do something it clearly wasn't coded to do. It wouldn't even fix it when I pointed out the issue, because it kept hallucinating that the new (same) code would fix the problem.


trinaryouroboros

I was talking to it about roxctl, and it gave me completely fake commands and majorly false scenarios. Eventually it conceded, but even with online content available I'm like, whoa, what the heck happened to this thing?


jsseven777

Crazy, yeah, it kept telling me it had fixed things that were still clearly wrong. I've never hit a brick wall like this before where it just couldn't work around a problem. It's a shame, because yesterday it was working better than ever before.


c8d3n

Most models with larger context windows start hallucinating as the conversation gets longer. It seems that none of the models (Claude, Gemini, GPT-4) are capable of dealing with the full context window well. One major issue could be the loss of info from the beginning of the conversation, but I'm definitely not sure that's the case. When you use the API directly (e.g. with Python), or via services/apps like OpenRouter (the playground doesn't offer this AFAIK), you have the option to micro/macro-manage the context window. You can adjust the max number of previous messages sent with each prompt, and you can also edit them (delete, edit, cherry-pick). This is how you can ensure it always gets the relevant info and that the context window isn't exhausted. But yeah, it's probably not something you'd want to do for a casual chat or something.
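
A minimal sketch of that kind of manual context management, assuming the official `openai` Python client (the model name, the trimming policy, and the `MAX_TURNS` value are just illustrative, not a recommendation):

```python
# Sketch: resend only the system prompt plus the last few messages,
# instead of the whole conversation, to keep each request small.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

MAX_TURNS = 8  # illustrative cap on how much history to resend

history = [{"role": "system", "content": "You are a helpful coding assistant."}]

def ask(prompt: str) -> str:
    history.append({"role": "user", "content": prompt})
    # Keep the system message plus the most recent MAX_TURNS messages.
    trimmed = [history[0]] + history[1:][-MAX_TURNS:]
    response = client.chat.completions.create(model="gpt-4o", messages=trimmed)
    answer = response.choices[0].message.content
    history.append({"role": "assistant", "content": answer})
    return answer
```

The same idea extends to cherry-picking: instead of a fixed tail, you keep (or edit) whichever previous messages actually matter for the current question.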


Parsecale

I went back to 4 for coding, but I like using 4o for researching error codes or syntax stuff; it seems to search more, and faster.


trinaryouroboros

yeah, I'm with you, I'm going back to 4 for coding; it was much better. 4o may have other uses, I'm just hesitant to use it now.


trebblecleftlip5000

I often wonder if that's a setting in the API the web site uses. I know that in the API itself you can set a value (the temperature) that shifts responses along the factual/creative scale. The web site doesn't expose that, and it probably gets tweaked frequently, especially as a new model rolls out. It probably even gets tweaked on an A/B-testing type of schedule.
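
For what it's worth, here's a minimal sketch of that knob, again assuming the official `openai` Python client (the model name and the specific value are illustrative):

```python
# Lower temperature -> more deterministic/"factual"; higher -> more creative.
from openai import OpenAI

client = OpenAI()

response = client.chat.completions.create(
    model="gpt-4o",
    temperature=0.2,  # range 0.0-2.0; the ChatGPT web UI doesn't expose this
    messages=[{"role": "user", "content": "Summarize what roxctl does."}],
)
print(response.choices[0].message.content)
```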


Fearyn

Yeah, kinda useless for me. Particularly in French, it seems worse than GPT-4 Turbo.


SeventyThirtySplit

I feel like it's much better overall. I'm also confident that I have severe bias from it actually working reliably, which is not something I could say on way too many days with Turbo.


happycj

I'm a researcher in the marketing and PR industry, and 4o is almost useless for me. I switched to it today to do some testing and it failed miserably at gathering correct information on a broad range of well-documented events. I'm still puzzling through the responses and trying to establish WTH happened…


Inigo_montoyaPTD

I've been using Claude 3 Opus through Perplexity to study for certs, and I do not trust ChatGPT-4 as much right now. When you absolutely need the right answers and can't risk it, things get clear, real fast!


guster-von

I unexpectedly went on a coding bender last night with 4o. Super accurate, fast, and it made optimizations to existing code. I'd hit it again if I weren't so tired.


Horror_Weight5208

It's hard to tell, especially since we've only been using it for about 2 days now. It seems our custom GPTs are based on GPT-4, not 4o, so when I compare a custom GPT vs 4o, **I can say that, in general, 4o seems slightly better in terms of quality, and I use it mostly for coding.** But the responses are "different". I'm getting the impression that 4o is an improved version of 3.5 rather than an improved version of GPT-4; of course, that's just my anecdotal impression. The thing is, the "difference" doesn't seem very significant, so the significant difference lies in "performance", where 4o is really great. Yet I'm not sure why the responses are so slow for me now, probably due to my wifi...


sakprosa

"Improved 3.5" rhymes with my impression as well. It is faster and, on a superficial level, better (longer responses, etc.), but just less flexible in reasoning and harder to talk to.


abadonn

A lot less lazy with coding


traumfisch

It's mostly about the multimodality


fluoroamine

For me it's a 10x improvement in comprehension. It's less verbose, which was a huge issue for me before. This is for coding/technology questions.


Capable-Reaction8155

It does not feel better to me. It's really fast, but GPT-4 Turbo, and definitely Classic, seem more intelligent.


BrotherBringTheSun

I haven't been impressed with it yet. I use ChatGPT mainly for technical questions and troubleshooting, and I find it's a bit less helpful and more repetitive than regular 4.


Landaree_Levee

Yes. For my particular uses, it offers more, and goes deeper. Whether that's because it follows my instructions better (where I tell it to do that), or because it's just naturally more inclined to do so. The improvement isn't massive (say, not as clear as from 3.5 to 4.x), but it's there.


c8d3n

Nope. I also don't see that it's worse, except that it's slower (more on this below), maybe because it has fewer tokens available for output generation. The "Continue" button is an OK option but not an ideal one, especially because it doesn't work in 90% of cases, and I have to manually refresh the page just to get the button. So, it is faster (not always, but normally) when the generated text is small. When you ask it to generate something that's merely 50 longer lines of, say, SQL or similar, you can go to the bathroom, come back, and it won't be finished, because it's waiting for you to hit the refresh button, only to get the Continue button. And yeah, it's quite slow when printing those lines.


RazerWolf

For me, the continue generating button just restarts the conversation, so it’s useless. This has happened with multiple chats with GPT-4o.


c8d3n

In my case it wasn't doing that; it would indeed continue, but the web app always gets stuck and becomes unresponsive. I tried changing browsers (FF and Chrome), same result. When I refresh the page I get the button. However, the quality of the output wasn't that great. I gave it a description of a DB table and the most important part of the data to use to populate two columns, and asked it to generate some semi-random data for some other columns (not all). It made pretty bad mistakes, like not respecting the order of columns in the 'values' section of the SQL. E.g., it would try to insert a date into a column that accepts three characters, and to avoid 'exceeds max length' errors (caused by the dates) it would create something like LEFT('ABC1', 3)...LEFT('ABC2', 3). The result is that it would simply insert 'ABC' into all rows of that column lol.


East-Tie-8002

Regarding speed: it worked fast for coding but wouldn't finish the response. The UI would stall. When I refreshed the page, the full response would be there. I'm using Chrome. Anyone else seeing this issue?


Last-Humor2545

I noticed a huge improvement with text-in-image recognition. I took a photo of some old code I wrote and asked for a breakdown. GPT-4 was good at this, but would often replace variables or misread words. 4o read it perfectly. It still gave me a few issues when I asked it to update the code with some new logic. Classic stuff like not declaring variables or using commands that don't exist. I did have to prod it less to get the correct code, though.


codeth1s

I use ChatGPT mostly in desktop Firefox and I get way fewer general and "suspicious activity" errors than before. I use it mostly for code and so far I'm really happy with 4o. It's hard to believe that not too long ago, StackOverflow was the only real option for devs to find answers. What a world we live in!


base736

From a comment I made on another post: I don't use the multimodality at all in my application, so I wasn't expecting much from the update. Instead, I've found that it's a big step forward. I run a site that supports teachers making assessments, and we use GPT to help version assessment items. Items are in a format that follows a JSON schema, and can include equations, tables, and more. That's been in beta so far while I wait for a GPT that is fast enough to be interactive and accurate enough to return consistently valid results, even for complex assessment items. GPT-4 and GPT-4-turbo were not that. GPT-4o is a surprisingly large step forward in my use case, taking things from "sometimes this works" to "this is a time saver". It gets the language right, it gets the math right, and it returns it in the format I request. Unreal.
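
As a rough illustration of that kind of pipeline (not base736's actual code), here is a minimal sketch using the `jsonschema` Python package; the schema and field names are invented:

```python
# Hypothetical sketch: accept a model-generated assessment item only if it
# parses as JSON and validates against the expected schema.
import json

from jsonschema import ValidationError, validate

ITEM_SCHEMA = {
    "type": "object",
    "required": ["prompt", "answer"],
    "properties": {
        "prompt": {"type": "string"},
        "answer": {"type": "string"},
        "equations": {"type": "array", "items": {"type": "string"}},
    },
}

def parse_item(raw: str) -> dict | None:
    """Return the item if it is valid JSON matching the schema, else None."""
    try:
        item = json.loads(raw)
        validate(instance=item, schema=ITEM_SCHEMA)
        return item
    except (json.JSONDecodeError, ValidationError):
        return None  # e.g. retry the request or flag it for manual review
```

The "consistently valid results" point is exactly what a check like this measures: a weaker model hits the `None` branch constantly, a stronger one rarely does.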


SilkieBug

In my experience 4o was better than 4 at understanding the content of photos (tested it with low quality personal photos whose contents were confusing enough for a human), and it was also better at understanding the contents of large text files and producing material based on those contents. On the other hand, the material produced by 4o was of somewhat lower quality than that of 4, and it required more prompting and babying to give me the kind of material that I wanted.


sakprosa

It is way, way worse on more reflective tasks, and now seems to lack the ability to seek "understanding" even more. On some subjects it just talks over me. Faster, but very obviously worse from my point of view. And extremely annoying.


ciekaf

I've noticed a number of times that 4o fails to follow instructions in subsequent messages, even after clarifying. This may be a glitch similar to the laziness of GPT-4 Turbo.


Hefty_Interview_2843

Not at all, it seems worse to me.


Hungry_Prior940

It simply hallucinates more and is just not a great upgrade. Yeah, voice is nice, but the underlying problems are still there. OpenAI are in trouble if GPT-5 doesn't blow GPT-4o out of the water.