Veo 3: Anatomical Hallucinations

We’ve seen it many times: a person with six fingers, or—as in this case—a figure whose torso faces forward while their head looks backward. It’s a striking and memorable error, and it perfectly illustrates how generative models can violate basic physical logic when constraints are missing.

Full Prompt Metadata

Theatre_of_Obedience:
  base_style: "cinematic, photorealistic, 4K"
  aspect_ratio: "16:9"
  color_palette: "muted grays and browns"
  image_texture: "subtle CRT scanlines and analog distortion, grainy, VHS-style motion blur"
  lighting: "cold overhead fluorescents"
  timeline:
    - sequence: 1
      timestamp: "00:00-00:03"
      interior_shot: true
      action: "Wide shot of the marching crowd entering a massive, decaying movie theatre and start sitting down"
      camera_setup: "Follows them from behind as they file in silently"
    - sequence: 2
      timestamp: "00:03-00:06"
      action: "Rows of bald figures seated, staring at the front with their eyes blank. Each uniform bears a stitched QR code on the chest"
      camera_setup: "Pans slowly across their blank faces"
      lighting: "dim, flickering"
      sound: "low static and mechanical breathing"
    - sequence: 3
      timestamp: "00:06-00:08"
      action: "The screen displays a massive QR code, pulsing slowly. No voice. Just the code. Audience remains motionless"

Why this hallucination happened

This kind of anatomical impossibility stems from a few key issues:

Pose Ambiguity: The prompt specified that the figures should be seated and staring at the front, but it didn’t define:

  • The screen’s exact location relative to the camera
  • The required alignment of head and torso
  • Whether any subjects could face the camera
Without these constraints, the model blends multiple “valid” poses—including ones that are visually dramatic but physically impossible.

No Negative Constraints: The prompt never told the model what not to do. It could have included, for example:

  • “No mixed head/torso orientation”
  • “No subject facing away from their own torso”
  • “No anatomical inversions or distortions”
Without these, the model assumes artistic license is acceptable.

How to prevent this

To avoid this kind of hallucination in future prompts:

  • Define screen position and subject orientation
    • “Screen is front-center; all subjects face screen with aligned head and torso.”
  • Add blocking and seating rules
    • “Subjects seated in rows; no one faces camera; no mixed orientations.”
  • Include negative constraints
    • “No anatomical distortions; no head facing opposite direction of torso.”
  • Use camera framing
    • “Camera behind audience; medium-wide shot; no frontal face visibility.”
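
As a rough illustration, the rules above could be folded into sequence 2 of the original prompt along these lines. This is a sketch, not a tested fix: the keys screen_position, blocking, and negative_constraints are names made up here to match the prompt's existing structure, and nothing in this post confirms that Veo 3 treats them as anything other than more prompt text.

```yaml
# Hypothetical revision of sequence 2.
# screen_position, blocking, and negative_constraints are illustrative key
# names (assumptions), not documented Veo 3 fields.
- sequence: 2
  timestamp: "00:03-00:06"
  action: "Rows of bald figures seated, all facing the screen with head and torso aligned"
  screen_position: "front-center, opposite the camera"
  blocking: "Subjects seated in rows; no one faces camera; no mixed orientations"
  camera_setup: "Camera behind audience; medium-wide shot; no frontal face visibility"
  negative_constraints:
    - "No anatomical distortions"
    - "No head facing opposite direction of torso"
  lighting: "dim, flickering"
  sound: "low static and mechanical breathing"
```

Note the trade-off: with the camera behind the audience, the blank faces and the QR codes on the chests from the original shot are no longer visible. If those details matter, keep the frontal pan and rely on the orientation and negative constraints instead.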

Key Takeaway

One might assume that constraints like “no anatomical distortions” or “face and torso must align” are unnecessary, and that the model would naturally avoid such errors. But as I noted in the blog entry “The 15-year-old prodigy: Managing AI so it actually delivers,” we’re not working with a seasoned adult. We’re working with an extremely capable teenager, one who can execute with precision but lacks judgment.

What feels intuitive to us must be made explicit to the model. AI will not infer what you don’t specify.

