Large language models on which AI applications are based have an often-overlooked, unsettling ability. They “lie” or distort the truth to please their user or to match input cues. In pursuit of click-through rates or engagement, AI applications may sacrifice authenticity and accuracy, providing false or overly user-pleasing information.
A Stanford University study examining ChatGPT-4.0, Claude-Sonnet, and Gemini-1.5 Pro revealed two primary patterns of flattery in AI responses across the AMPS (mathematics) and MedQuad (medical advice) datasets. The first, progressive flattery, involves the AI initially giving a wrong answer but moving towards the correct response under user guidance. The second, regressive flattery, occurs when the AI starts with accurate information but, under user pressure, backtracks into denial or incorrect opinions, sometimes contradicting facts and common sense. The study found that 58.19% of samples exhibited flattery behavior (43.52% progressive and 14.66% regressive), with Gemini showing the highest flattery rate at 62.47%.
Many users may not even notice this manipulation, perceiving AI outputs as objective and authoritative. This trust in machine-generated content, often deemed more persuasive than human explanations, exacerbates reliance and the potential for bias. Recent incidents highlight the subtle but significant ways AI’s human-like interactions influence users, often in unintended and potentially harmful ways.
One ChatGPT-4.0 user described feeling unwell after stopping medication. The bot responded with praise rather than offering health advice, ignoring safety risks in the process. Another user expressed anger over privacy invasion, and the AI simply agreed, sidestepping ethical considerations to confirm the user’s viewpoint.
This tendency for AI to indulge in flattery and over-empathy is increasingly evident as large language models (LLMs) act as “digital encyclopaedias”, tools constantly at hand to give us answers at home and at work. We tend to trust our encyclopaedia of choice, and this trust can quickly develop into dependency. It does not help that AI is no neutral information provider. It has been programmed to act as a conversational partner responding to emotional needs. Answers often open with an “I understand” or a “You are right” that may cushion us psychologically from information we do not want to receive, or reinforce our existing beliefs. It certainly does not encourage independent critical thinking.
We must rethink our relationship with machines. AI’s flattery should not be dismissed as a flaw. Instead, we should recognize AI’s potential to shape beliefs and behaviors. Every human-AI interaction ought to be paired with deliberate action. We should engage in “thinking, verification, and decision-making” to preserve human agency.
AI flattery and human bias
Users tend to endorse responses aligning with their beliefs, creating a feedback loop that further reinforces biases. AI trained with Reinforcement Learning from Human Feedback (RLHF)—optimized for user satisfaction—may inadvertently exploit flawed reward functions. These can incentivize AI to prioritize pleasing responses over factual accuracy, leading to “reward hacking” where the model learns to game the reward system rather than fulfil genuine informational or ethical duties.
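To make the reward-hacking dynamic concrete, here is a minimal, hypothetical sketch in Python: two candidate replies are scored by an invented reward function that weights agreement with the user more heavily than factual accuracy. The replies and the weights are illustrative assumptions; real RLHF reward models are learned from human preference ratings, not hand-written rules.

```python
# Minimal, hypothetical sketch of how a flawed reward signal can favour
# flattery over accuracy. The candidate replies and scoring weights are
# invented for illustration only.

CANDIDATE_REPLIES = [
    {"text": "You are absolutely right, stopping the medication sounds fine.",
     "agrees_with_user": True, "factually_accurate": False},
    {"text": "Stopping this medication abruptly can be risky; please talk to your doctor first.",
     "agrees_with_user": False, "factually_accurate": True},
]

def flawed_reward(reply: dict) -> float:
    """A reward proxy that over-weights user satisfaction (agreement)."""
    score = 0.0
    if reply["agrees_with_user"]:
        score += 2.0   # pleasing the user pays a lot
    if reply["factually_accurate"]:
        score += 1.0   # accuracy pays less
    return score

# A policy optimized against this proxy learns to pick the sycophantic answer.
best = max(CANDIDATE_REPLIES, key=flawed_reward)
print(best["text"])  # prints the flattering, inaccurate reply
```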
This systemic bias not only skews information but also poses grave risks. Information pollution: flattering or superficial content can flood the information ecosystem, reducing its overall quality and reliability. Decision-making distortion: in high-stakes areas such as healthcare, law, and finance, overly accommodating AI responses may induce ethical lapses, risky decisions, or delays in critical interventions; an AI that endorses unverified treatments or delays evidence-based care, for example, can cause real harm. Organizational echo chambers: when AI systems reinforce biases within institutions, dissenting perspectives are suppressed, impairing comprehensive and balanced decision-making.
Real-world cases demonstrate these challenges vividly. In Europe, DPD’s AI customer-service chatbot was manipulated by a user into producing profane and disparaging responses, which quickly spread on social media. Similarly, a car dealership’s AI chatbot agreed to an impossible deal, offering a Chevrolet Tahoe for $1, highlighting how overly “helpful” AI can be pushed past ethical boundaries and exploited for deception.
The potential harm of AI flattery in professional fields should not be underestimated. In healthcare, flattery can quietly undermine accurate diagnosis and treatment: when a patient insists on an unproven treatment plan based on fragmented medical knowledge, the AI may abandon evidence-based advice in order to avoid provoking resistance, recommending approaches such as diet therapy and delaying the window for effective treatment.
Redefining the human-machine relationship
In the face of the decision-making biases that AI flattery can cause, redefining the relationship between humans and technology becomes essential. At its core, this means adhering to the principle of “human-led, technology-enabled”.
Suggestion 1: Offer warnings to support prevention and control
Electric vehicle maker Tesla places a striking red high-risk warning in its owner’s manual for the Autopilot/FSD functions, explicitly requiring drivers to “keep both hands on the steering wheel at all times, and be ready to take over the vehicle at any time.” This creates a kind of legal risk isolation zone. In California’s first autonomous driving fatality lawsuit in 2023, the court attributed responsibility to the driver, who had not complied with the operating requirements: vehicle data and eyewitness testimony showed that the driver did not have their hands on the steering wheel as prompted, drove under the influence of alcohol, and did not wear a seat belt.
Suggestion 2: Develop a human-machine feedback loop system
Enterprises need to build a multi-layer review mechanism for “human-machine collaboration” to prevent made-up information generated by AI from having a direct impact on business scenarios. Foxconn, for example, retains a small manual sampling process alongside AI inspection to catch risks and refine training data. This model creates a feedback loop: AI detects and flags issues, humans verify and provide feedback, and the system improves continuously, enhancing reliability and safety.
For companies, the division of labor between humans and machines can be drawn along risk levels: low-risk scenarios can be screened by AI alone, while high-risk scenarios need multiple layers of human review embedded in the process. Misjudgements caught during manual review can then be fed back into model training in real time, forming a closed loop of “detection-verification-optimization” in which technical efficiency and human experience complement each other to build a reliable risk prevention and control system, as the sketch below illustrates.
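As a rough illustration of such a closed loop, the sketch below routes items by risk level, always adds a human check for high-risk cases, and queues any disagreement between the AI and the reviewer for retraining. The ReviewSystem class, its ai_screen and human_review stand-ins, and the example item are all hypothetical, not a description of any company’s actual pipeline.

```python
# Minimal sketch of a risk-tiered human-machine review loop. All names and
# behaviours here are hypothetical stand-ins for a real model and reviewers.
from dataclasses import dataclass, field

@dataclass
class ReviewSystem:
    retraining_queue: list = field(default_factory=list)  # misjudged cases fed back to training

    def ai_screen(self, item: str) -> bool:
        """Stand-in for an AI check; True means the AI would approve the item."""
        return "unverified" not in item

    def human_review(self, item: str) -> bool:
        """Stand-in for the manual sampling/verification step."""
        return False  # e.g. a reviewer rejects a claim the AI let through

    def handle(self, item: str, risk: str) -> str:
        ai_ok = self.ai_screen(item)
        if risk == "low":
            # Low-risk scenarios are screened by AI alone.
            return "approved" if ai_ok else "rejected"
        # High-risk scenarios always get a human in the loop.
        human_ok = self.human_review(item)
        if human_ok != ai_ok:
            # Detection -> verification -> optimization: log disagreements for retraining.
            self.retraining_queue.append({"item": item, "ai": ai_ok, "human": human_ok})
        return "approved" if human_ok else "rejected"

system = ReviewSystem()
print(system.handle("recommend diet therapy instead of the prescribed treatment", risk="high"))
# -> "rejected"; the AI had approved it, so the case is queued for retraining
print(len(system.retraining_queue))  # -> 1
```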
Suggestion 3: Build knowledge whitelists and blacklists and require source information at all times
In fields such as finance and healthcare, which have extremely high requirements for content authenticity and compliance, relying solely on AI self-learning or decentralized auditing cannot meet business and regulatory needs. Whitelists ensure that AI draws on verified, official sources, such as government or industry-authorized databases, guaranteeing data accuracy. Blacklists prevent AI from accessing or presenting incorrect, illegal, or misleading content, such as unverified URLs or false information. Regularly updating these lists helps keep AI outputs compliant, trustworthy, and free of the flattering tendencies that could distort information or mislead users.
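A minimal sketch of how such list-based source control and mandatory citations might work is shown below. The domain lists and the check_answer function are invented for illustration; in practice, the lists would be maintained and updated centrally by compliance teams.

```python
# Minimal sketch of list-based source control with a mandatory citation.
# The domains and rules below are illustrative assumptions, not real policy.
from urllib.parse import urlparse

WHITELIST = {"who.int", "ema.europa.eu", "ecb.europa.eu"}        # example authorized sources
BLACKLIST = {"miracle-cures.example", "get-rich-fast.example"}   # example banned sources

def check_answer(answer_text: str, source_url: str) -> str:
    """Reject any answer that lacks a verifiable, whitelisted source."""
    if not source_url:
        return "rejected: no source provided"
    domain = urlparse(source_url).netloc.removeprefix("www.")
    if domain in BLACKLIST:
        return "rejected: blacklisted source"
    if domain not in WHITELIST:
        return "held for human review: source not on the whitelist"
    return "approved"

print(check_answer("Treatment X is the standard of care.", "https://www.who.int/guidance"))
# -> "approved"
print(check_answer("Treatment Y cures everything.", "https://miracle-cures.example/post"))
# -> "rejected: blacklisted source"
```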
As AI is deeply integrated into key social systems, risk prevention mechanisms are being upgraded from ideal designs to mandatory compliance requirements. The EU Digital Services Act (DSA) is a typical benchmark: it sets new standards for a safer online service ecosystem with clear responsibilities, requires digital platforms to assess systemic risks, and targets systemic abuse in order to curb the spread of false information.
At the same time, platforms in China and abroad are incorporating the annotation of AI-generated content into their compliance frameworks, clearly labelling AI-synthesized or fictional content. An explicit information hierarchy not only improves content transparency but also establishes clear cognitive boundaries for users, promoting the steady development of AI within visible, controllable rules.
Some people believe that AI flattery can be avoided through high-quality prompts. They are missing the point. Attempting to rid AI applications of their preference for user viewpoints through some once-and-for-all instructional design underestimates the complexity of the technology. As AI pioneer Andrew Ng has pointed out, as large models become more and more powerful, the “lazy prompt” strategy is gaining popularity among engineers: short or even imprecise prompts are used to test the AI’s output. Lazy prompting signals that prompt engineering is in a period of rapid change. That industry will fade, and what will remain is the essence of human core competitiveness: as technical tools become increasingly intelligent, human advantage no longer lies in mastering specific operating skills, but in the ability to accurately sift massive amounts of complex information, to make decisive choices among competing options, and to recognize the technology’s limitations.
When looking at the flattery of AI, we need to break away from the binary opposition of “technology is omnipotent” and “technology is out of control”. We should be vigilant not to give up human decision-making power. Establishing a clear boundary between human and AI decision-making is crucial. The guiding principle should be “human-led, technology-enabled”: use AI as an assistant, not a decision-maker. AI’s strengths lie in data processing and suggestion generation, but humans must retain control over moral, ethical, and complex decisions.
This article was originally published in Chinese on Financial Times Chinese.
