> The problem in the article is that LLMs can recognize a request about a person's name but generate a fake story because they don't really have information about that person, not that the LLM spat out random data which happened to answer the question about who the person was, with incorrect info each time, by pure random chance.
I don't see how "arbitrary" is any better; that's certainly how humans behave if forced to provide an answer. While it may seem obvious how we engage our internal skepticism signal, that search for contradictions is clearly bounded in both breadth and depth. Such an instinct will need to be inspected and reproduced before a chatbot can give an "I don't know" answer, if that is what you want from it (rather than incoherent synthesis, aka creativity).
To be clear, the "search" in this context is a web search, and what you're responding to is the description of the prior, broken behavior that prompted the story and the subsequent change. I.e., ChatGPT now performs a Bing API query to get cached results for "who is ${persons name}". None of this relies on the model figuring out how certain or uncertain it is: if it sees a query asking about a person, it always performs a search rather than trying to come up with an answer itself. It then also provides links to the external pages it got the answer from.
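To make the shape of that flow concrete, here is a minimal sketch. This is not OpenAI's actual implementation; the function names and the stubbed search backend are hypothetical, and a real deployment would call an actual search API and feed the results back into the model. The point is only that the answer path never touches model-internal "knowledge" for person queries:

```python
# Hypothetical sketch of the retrieve-then-answer flow described above.
# All names here are illustrative; `search` stands in for a real web
# search API call (e.g. to Bing).

def build_search_query(person_name: str) -> str:
    """Person lookups always become a web query, never free generation."""
    return f'who is "{person_name}"'

def answer_person_query(person_name: str, search) -> dict:
    # `search` is any callable returning a list of
    # {"url": ..., "snippet": ...} result dicts.
    results = search(build_search_query(person_name))
    if not results:
        # No cached results: decline rather than invent a biography.
        return {"answer": "I don't know.", "sources": []}
    snippets = " ".join(r["snippet"] for r in results)
    # Return the answer together with the pages it came from,
    # so the claims are attributable to external sources.
    return {"answer": snippets, "sources": [r["url"] for r in results]}

# Stubbed backend standing in for the search API:
def fake_search(query: str):
    return [{"url": "https://example.com/bio",
             "snippet": "John Smith is a software engineer."}]

print(answer_person_query("John Smith", fake_search))
```

Note that the uncertainty decision is made structurally (results vs. no results), not by asking the model how confident it feels, which is what makes the behavior consistent.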
Yes, I was using "search" in the other, more generic sense (e.g. beam search). The google search thing is really only interesting if they can bind the tokens to the result; otherwise you're just going to have to re-google to vet the chatbot.
Can I ask why you keep attributing things to Google here when I've continuously clarified they are not involved in either this model or the search results it's using?
And yes, this is not like beam search, and that's exactly why it works consistently for the defamation prevention use case.
> exactly why it works consistently for the defamation prevention use case.
Right, but in this case the claim is clearly incoherent. If any claim with your name on the internet can be assumed to be about you, you can sink basically any company offering text generation. So either governments lean into the inherently incoherent concept of "defamation" or they completely abandon text generation.
We're in a really rough place right now where American companies serve many regions with incompatible laws. Ideally those states would be served by companies with compatible values. America is both the best and worst thing to ever happen to the internet.
The problem wasn't "John Smith is a common name, and when asked who John Smith was it accurately said someone named John Smith was arrested for drug possession"; it was that the model invented claims about the name which weren't sourceable for anybody with that name. Since this is now replaced with cited Bing results, that problem goes away: the claim is relevant and provably about someone with that name, even if it isn't true of everyone with that name.
Compliance is definitely tough, though. Of course, you don't have to offer your service in every region just because you operate on the web, and not doing so is a valid alternative to dealing with regional regulations. The only invalid option is demanding that you both do business in a region and don't have to comply with its rules.