Here are real diffusion-model options people commonly use to transform a human photo into a building / architecture (usually via image-to-image editing, not a single “human→building” model):
**Stable Diffusion / SDXL img2img**
- Use img2img with a strong prompt like: “a skyscraper shaped like a human silhouette, architectural photography…”
- Works best when paired with structure controls (ControlNet, below).
- This is the “base” most workflows build on.
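As a concrete starting point, here is a minimal diffusers img2img sketch. The model ID, prompt, and `strength` value are illustrative assumptions, not the only valid choices:

```python
# Hypothetical sketch: person photo -> building via SDXL img2img.
# The model ID, prompt, and strength below are illustrative assumptions.
PROMPT = "a skyscraper shaped like a human silhouette, architectural photography"
STRENGTH = 0.65  # lower = closer to the input photo, higher = more repainting

def person_to_building(input_path: str, output_path: str) -> None:
    # Heavy imports kept inside the function so the sketch reads standalone.
    import torch
    from diffusers import StableDiffusionXLImg2ImgPipeline
    from PIL import Image

    pipe = StableDiffusionXLImg2ImgPipeline.from_pretrained(
        "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16
    ).to("cuda")

    init = Image.open(input_path).convert("RGB").resize((1024, 1024))
    result = pipe(prompt=PROMPT, image=init, strength=STRENGTH).images[0]
    result.save(output_path)

# person_to_building("person.jpg", "building.png")  # needs a GPU + weights
```

Tuning `strength` is the main lever here: around 0.5 keeps the person recognizable, while 0.8+ mostly keeps just the composition.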
**InstructPix2Pix**
- You give it the input photo plus an instruction like:
  “Turn the person into a modern glass office building, keep the background similar.”
- It’s explicitly trained for instruction-guided edits (image → edited image).
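A hedged diffusers sketch of that instruction-based edit (the checkpoint name and guidance value are illustrative assumptions):

```python
# Hypothetical sketch: instruction-guided edit with InstructPix2Pix.
# The checkpoint name and guidance value are illustrative assumptions.
INSTRUCTION = (
    "Turn the person into a modern glass office building, "
    "keep the background similar."
)
IMAGE_GUIDANCE = 1.5  # higher = stick closer to the input photo

def edit_with_instruction(input_path: str, output_path: str) -> None:
    # Heavy imports kept inside the function so the sketch reads standalone.
    import torch
    from diffusers import StableDiffusionInstructPix2PixPipeline
    from PIL import Image

    pipe = StableDiffusionInstructPix2PixPipeline.from_pretrained(
        "timbrooks/instruct-pix2pix", torch_dtype=torch.float16
    ).to("cuda")

    photo = Image.open(input_path).convert("RGB")
    edited = pipe(
        INSTRUCTION, image=photo, image_guidance_scale=IMAGE_GUIDANCE
    ).images[0]
    edited.save(output_path)

# edit_with_instruction("person.jpg", "building.png")  # needs a GPU + weights
```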
**ControlNet**
- ControlNet lets you keep pose/edges/depth/segmentation from the human photo while you “re-skin” it into architecture.
- For “human → building,” the most useful ControlNets are typically:
  - Canny / Lineart (keeps outlines)
  - Depth (keeps 3D-ish structure)
  - Segmentation (lets you target “person” regions to replace)
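Here is a hedged sketch of the canny variant: edges extracted from the human photo act as the structural guide while the prompt repaints everything as architecture. Model IDs and thresholds are illustrative assumptions:

```python
# Hypothetical sketch: keep the person's outline (canny edges) while the
# prompt repaints it as architecture. IDs and thresholds are assumptions.
PROMPT = "glass-and-steel tower in the shape of a standing figure, dusk light"
CANNY_LOW, CANNY_HIGH = 100, 200  # edge-detector thresholds

def building_from_edges(input_path: str, output_path: str) -> None:
    # Heavy imports kept inside the function so the sketch reads standalone.
    import cv2
    import numpy as np
    import torch
    from diffusers import ControlNetModel, StableDiffusionControlNetPipeline
    from PIL import Image

    # Extract edges from the human photo; these become the structural guide.
    photo = np.array(Image.open(input_path).convert("RGB"))
    edges = cv2.Canny(photo, CANNY_LOW, CANNY_HIGH)
    control = Image.fromarray(np.stack([edges] * 3, axis=-1))

    controlnet = ControlNetModel.from_pretrained(
        "lllyasviel/sd-controlnet-canny", torch_dtype=torch.float16
    )
    pipe = StableDiffusionControlNetPipeline.from_pretrained(
        "runwayml/stable-diffusion-v1-5", controlnet=controlnet,
        torch_dtype=torch.float16,
    ).to("cuda")

    pipe(PROMPT, image=control).images[0].save(output_path)

# building_from_edges("person.jpg", "building.png")  # needs a GPU + weights
```

The depth and segmentation variants follow the same shape; only the preprocessor and the ControlNet checkpoint change.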
**T2I-Adapter**
- Similar idea to ControlNet: add an extra conditioning image (sketch/canny/depth/pose) to control generation.
- Often used when you want controllability with a smaller, cheaper add-on module.
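The usage mirrors ControlNet almost exactly; a hedged sketch with illustrative model IDs (here assuming you already have an edge map saved to disk):

```python
# Hypothetical sketch: same edge-guided idea as ControlNet, but via the
# lighter T2I-Adapter module. Model IDs here are illustrative assumptions.
PROMPT = "brutalist concrete building shaped like a human figure"

def building_with_adapter(edge_map_path: str, output_path: str) -> None:
    # Heavy imports kept inside the function so the sketch reads standalone.
    import torch
    from diffusers import StableDiffusionAdapterPipeline, T2IAdapter
    from PIL import Image

    adapter = T2IAdapter.from_pretrained(
        "TencentARC/t2iadapter_canny_sd15v2", torch_dtype=torch.float16
    )
    pipe = StableDiffusionAdapterPipeline.from_pretrained(
        "runwayml/stable-diffusion-v1-5", adapter=adapter,
        torch_dtype=torch.float16,
    ).to("cuda")

    # The adapter consumes a precomputed conditioning image (canny edges).
    edge_map = Image.open(edge_map_path).convert("RGB")
    pipe(PROMPT, image=edge_map).images[0].save(output_path)

# building_with_adapter("edges.png", "building.png")  # needs a GPU + weights
```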
**IP-Adapter**
- Lets you condition generation strongly on an image prompt, which helps keep identity/overall composition while pushing the content/style toward “architecture.”
- Good when you want the output to still “feel like” the original image but transformed.
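A hedged sketch of using the original photo as an image prompt while the text prompt pulls toward buildings (the scale and weight-file names are illustrative assumptions):

```python
# Hypothetical sketch: the person photo acts as an *image prompt* via
# IP-Adapter; composition carries over while text pushes toward buildings.
# The scale and weight-file names are illustrative assumptions.
PROMPT = "monumental building, architectural photography"
IP_SCALE = 0.6  # higher = output hews closer to the reference photo

def building_from_image_prompt(input_path: str, output_path: str) -> None:
    # Heavy imports kept inside the function so the sketch reads standalone.
    import torch
    from diffusers import StableDiffusionPipeline
    from PIL import Image

    pipe = StableDiffusionPipeline.from_pretrained(
        "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
    ).to("cuda")
    pipe.load_ip_adapter(
        "h94/IP-Adapter", subfolder="models", weight_name="ip-adapter_sd15.bin"
    )
    pipe.set_ip_adapter_scale(IP_SCALE)

    reference = Image.open(input_path).convert("RGB")
    pipe(prompt=PROMPT, ip_adapter_image=reference).images[0].save(output_path)

# building_from_image_prompt("person.jpg", "out.png")  # needs GPU + weights
```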
**LoRA / DreamBooth fine-tuning**
- If you want a consistent “human becomes building” look, you can fine-tune SD/SDXL with:
  - a LoRA trained on examples of the transformation style, or
  - DreamBooth-style personalization.
- This isn’t a single official model; it’s a training approach widely used on top of Stable Diffusion.
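Once trained, applying such a LoRA at inference time is a one-liner in diffusers. A hedged sketch, where the weights path is a placeholder for your own trained file:

```python
# Hypothetical sketch: apply a style LoRA you trained on examples of the
# "human becomes building" look. The weights path is a placeholder, not a
# real published file.
LORA_SCALE = 0.8  # how strongly the fine-tuned style is applied

def generate_with_style_lora(output_path: str) -> None:
    # Heavy imports kept inside the function so the sketch reads standalone.
    import torch
    from diffusers import StableDiffusionXLPipeline

    pipe = StableDiffusionXLPipeline.from_pretrained(
        "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16
    ).to("cuda")
    # Placeholder path: point this at your own trained LoRA weights.
    pipe.load_lora_weights("./human-to-building-lora")

    image = pipe(
        "a person transforming into a skyscraper",
        cross_attention_kwargs={"scale": LORA_SCALE},
    ).images[0]
    image.save(output_path)

# generate_with_style_lora("building.png")  # needs a GPU + your LoRA weights
```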
For the specific effect “person literally becomes a building,” the most reliable combo in practice is:
SDXL img2img + ControlNet (canny/depth/seg) + a strong architecture prompt.
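That combo can be sketched as a single diffusers pipeline. Model IDs, `strength`, and the conditioning scale below are illustrative assumptions you would tune per image:

```python
# Hypothetical sketch of the combo: SDXL img2img guided by a canny ControlNet.
# Model IDs, strength, and conditioning scale are illustrative assumptions.
PROMPT = "a skyscraper shaped like a human silhouette, architectural photography"
STRENGTH = 0.7             # how far img2img may drift from the photo
CONDITIONING_SCALE = 0.6   # how strictly the edges constrain the result

def person_to_building_combo(input_path: str, output_path: str) -> None:
    # Heavy imports kept inside the function so the sketch reads standalone.
    import cv2
    import numpy as np
    import torch
    from diffusers import (
        ControlNetModel, StableDiffusionXLControlNetImg2ImgPipeline,
    )
    from PIL import Image

    photo = Image.open(input_path).convert("RGB").resize((1024, 1024))
    edges = cv2.Canny(np.array(photo), 100, 200)
    control = Image.fromarray(np.stack([edges] * 3, axis=-1))

    controlnet = ControlNetModel.from_pretrained(
        "diffusers/controlnet-canny-sdxl-1.0", torch_dtype=torch.float16
    )
    pipe = StableDiffusionXLControlNetImg2ImgPipeline.from_pretrained(
        "stabilityai/stable-diffusion-xl-base-1.0", controlnet=controlnet,
        torch_dtype=torch.float16,
    ).to("cuda")

    result = pipe(
        prompt=PROMPT, image=photo, control_image=control,
        strength=STRENGTH, controlnet_conditioning_scale=CONDITIONING_SCALE,
    ).images[0]
    result.save(output_path)

# person_to_building_combo("person.jpg", "building.png")  # needs GPU + weights
```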
If you tell me what toolchain you’re using (ComfyUI, Automatic1111, Diffusers, Invoke), I can suggest the most direct node/model setup and example prompts for “human → building.”