OpenAI’s o3/o4-mini Models Stir Mixed Reviews and Invisible Marking Debates

April 21, 2025, 7:20 am

OpenAI’s new o3 and o4-mini models are drawing attention for strong coding and math performance paired, paradoxically, with a higher rate of hallucinations. Adding to the intrigue, observers have found invisible characters in the models’ output that may hint at a built-in watermarking mechanism, because nothing says “cutting-edge” like secret invisible signatures in your output. It’s a curious blend of technical wizardry and quirky oversights that has both experts and skeptics raising an amused eyebrow.


simonwillison.net / OpenAI o3 and o4-mini System Card

I'm surprised to see a combined System Card for o3 and o4-mini in the same document - I'd expect to see these covered separately. The opening paragraph calls out the most interesting new ability of these models (see also my notes here). Tool usage isn't new, but...

medianama.com / New OpenAI Models Hallucinating More Than Their Predecessor

OpenAI's new AI models are hallucinating more than their predecessor, according to an internal testing report released by the company.

simonwillison.net / AI assisted search-based research actually works now

For the past two and a half years the feature I've most wanted from LLMs is the ability to take on search-based research tasks on my behalf. We saw the first glimpses of this back in early 2023, with Perplexity (first launched December 2022, first prompt leak in January 2023) and then the GPT-4...

techspot.com / ChatGPT gets scarily good at guessing photo locations, sparking doxxing concerns

OpenAI released its latest o3 and o4-mini models last week, which can "reason" through uploaded images. This means they can crop, rotate, and zoom in on photos, even if they're of poor quality.

techspot.com / OpenAI's newest o3 and o4-mini models excel at coding and math – but hallucinate more often

Historically, each new generation of OpenAI's models has delivered incremental improvements in factual accuracy, with hallucination rates dropping as the technology matured. However, internal testing and third-party evaluations now reveal that o3 and o4-mini, both classified as "reasoning models,"...

winbuzzer.com / OpenAI’s New o3/o4-mini Models Add Invisible Characters to Text, Sparking Watermark Debate

The discovery of non-standard space characters in OpenAI's o3/o4-mini output has raised questions about AI watermarking, though it remains unclear if it's intentional.
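For readers who want to check a piece of text for this kind of character themselves, here is a minimal sketch in Python. It is not the method used in the WinBuzzer report, and the excerpt above does not say which characters were found; the sample string and both function names (find_unusual_spaces, normalize_spaces) are made up for illustration. The sketch flags any character in the Unicode "space separator" (Zs) or "format" (Cf) categories other than the ordinary ASCII space.

```python
import unicodedata


def find_unusual_spaces(text):
    """Return (index, code point, name) for each space-like or invisible
    character in text, excluding the ordinary ASCII space."""
    hits = []
    for i, ch in enumerate(text):
        category = unicodedata.category(ch)
        # Zs: space separators such as NO-BREAK SPACE.
        # Cf: invisible format characters such as ZERO WIDTH SPACE.
        if ch != " " and category in ("Zs", "Cf"):
            name = unicodedata.name(ch, "UNKNOWN")
            hits.append((i, f"U+{ord(ch):04X}", name))
    return hits


def normalize_spaces(text):
    """Replace every Zs character with a plain space and drop Cf characters."""
    out = []
    for ch in text:
        category = unicodedata.category(ch)
        if category == "Zs":
            out.append(" ")
        elif category != "Cf":
            out.append(ch)
    return "".join(out)


# Hypothetical sample, not taken from any real model output.
sample = "Model\u202foutput with\u00a0odd\u200b spaces"

for index, codepoint, name in find_unusual_spaces(sample):
    print(index, codepoint, name)

print(normalize_spaces(sample))
```

Finding such characters does not by itself show that a watermark is present, since word processors and web pages insert non-breaking spaces routinely. Note also that dropping every Cf character is a blunt choice: it would break emoji sequences that rely on ZERO WIDTH JOINER, so a real cleanup tool would likely use a narrower list.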


6 stories from 4 sources, 9 hours ago #ai #innovation #ml #dataprivacy #openai #software #analytics #google #anthropic #computervision #techpolicy #artificial-intelligence #ai-ethics #generative-ai #llms





Disclaimer: The information provided on this website is intended for general informational purposes only. While we strive for accuracy, we do not guarantee the completeness or reliability of the content. Users are encouraged to verify all details independently. We accept no liability for errors, omissions, or any decisions made based on this information.