I built a platform for restoring Proxmox VMs from Pure Storage snapshots. The storage part was easy. Fixing the workflow was the real engineering.

There’s a familiar enterprise fairy tale:
“We have snapshots.”
Everyone nods.
Risk has been managed.
Resilience has been achieved.
The slide deck proceeds.
Then one day someone asks:
“Can we get that VM back in the next 15 minutes?”
Now the room gets honest. Because snapshots and restores are not the same product.
One is a feature.
The other is an outcome.
So what did I do? Built a thing, of course. I released proxmox-pure-snap-restore, a web platform that restores Proxmox VE virtual machines from Pure FlashArray snapshots.
That sentence sounds small.
It isn’t.
Because hiding inside it are several ugly infrastructure truths:
- Fast arrays don’t automatically create fast restores
- APIs don’t remove workflow debt
- Backup success metrics often hide recovery weakness
- Most outages become process problems before they become technology problems
The Technical Problem Nobody Wants to Own
Here’s what many restore workflows really look like:
- Find correct snapshot
- Clone volume
- Present LUN to host
- Rescan buses
- Refresh multipath
- Import cloned VG without UUID collision
- Find right LV
- Copy data
- Rebuild VM config
- Hope VMID reuse doesn’t bite you
- Explain timeline to leadership

Sounds like a maze full of traps.
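To make the maze concrete, here is a minimal Python sketch of the host-side half of that checklist as a dry-run plan. It is an illustration, not the platform’s code. The `Step` class, the function names, and the example device and VG names are all hypothetical. The commands themselves are standard Linux tooling (`rescan-scsi-bus.sh` from sg3_utils, `multipath`, LVM’s `vgimportclone` and `vgchange`, `qemu-img`), and the array-side steps are left as manual placeholders because they depend on how your array is driven.

```python
import shlex
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Step:
    """One line of the manual runbook (hypothetical helper type)."""
    description: str
    argv: Optional[Tuple[str, ...]]  # None means array-side or manual work


def manual_restore_plan(clone_device: str, clone_vg: str,
                        source_lv: str, target_lv_path: str) -> List[Step]:
    """Return the manual steps in order. Nothing is executed here."""
    return [
        Step("Clone the snapshot and present the LUN to the host", None),
        Step("Rescan SCSI buses", ("rescan-scsi-bus.sh",)),
        Step("Reload multipath maps", ("multipath", "-r")),
        # vgimportclone renames the VG and regenerates UUIDs, which is
        # what avoids the collision with the original VG.
        Step("Import the cloned VG under a new name",
             ("vgimportclone", "--basevgname", clone_vg, clone_device)),
        Step("Activate the imported VG", ("vgchange", "-ay", clone_vg)),
        # Assumes raw LVs on both sides; adjust -f/-O for other formats.
        Step("Copy the disk back",
             ("qemu-img", "convert", "-p", "-f", "raw", "-O", "raw",
              f"/dev/{clone_vg}/{source_lv}", target_lv_path)),
        Step("Rebuild the VM config and check for VMID reuse", None),
    ]


def render(plan: List[Step]) -> List[str]:
    """Format the plan as numbered, copy-pasteable lines."""
    lines = []
    for number, step in enumerate(plan, 1):
        command = shlex.join(step.argv) if step.argv else "(manual)"
        lines.append(f"{number}. {step.description}: {command}")
    return lines


if __name__ == "__main__":
    plan = manual_restore_plan("/dev/mapper/mpathb", "restore_vg",
                               "vm-120-disk-0", "/dev/pve/vm-120-disk-0")
    print("\n".join(render(plan)))
```

Even written down neatly, two of the seven steps are still "a human figures it out", and those are the ones that go wrong at 3 a.m.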

Sooo I fixed it.
The platform provides:
Multi-cluster + multi-array orchestration
Because nobody with real infrastructure has “just one.”
VM → Disk → Snapshot inventory tree
No more tribal-memory restore navigation.
Restore modes:
- overwrite existing VM
- create new VM clone
Safe network boot
Recovered VM can boot with NICs disabled first.
Because duplicate IP incidents are a bad way to network.
Background inventory refresh
Deleted VMs stay visible and new snapshots show up automatically.
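For the safe network boot option above, one plausible mechanism is Proxmox’s per-NIC `link_down=1` setting on the `netN` lines of a VM config. Whether the platform does it this way is an assumption on my part for this sketch, and `disable_nic_links` is a hypothetical helper name. It only rewrites config text; it does not talk to a cluster.

```python
import re

_NET_KEY = re.compile(r"^net\d+$")


def disable_nic_links(config_text: str) -> str:
    """Return the VM config with link_down=1 set on every netN line.

    Only the main section is touched; snapshot sections, which start
    with a "[name]" header, are left as they are.
    """
    out = []
    in_main_section = True
    for line in config_text.splitlines():
        if line.startswith("["):
            in_main_section = False
        # Split at the first colon only, so MAC addresses stay intact.
        key, sep, value = line.partition(":")
        if in_main_section and sep and _NET_KEY.match(key.strip()):
            options = [opt for opt in value.strip().split(",")
                       if not opt.startswith("link_down=")]
            options.append("link_down=1")
            line = f"{key}: {','.join(options)}"
        out.append(line)
    result = "\n".join(out)
    if config_text.endswith("\n"):
        result += "\n"
    return result


if __name__ == "__main__":
    sample = (
        "name: web01\n"
        "net0: virtio=BC:24:11:2A:3B:4C,bridge=vmbr0\n"
    )
    print(disable_nic_links(sample), end="")
```

The point of doing it in the config rather than in the guest is that the recovered VM never gets a chance to announce a duplicate IP, whatever its OS thinks its network settings are.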
The Part I Like Most: Data Path Efficiency
When possible, restores use array-offloaded copy, meaning bytes don’t need to slog through the host.
That includes:
- SCSI XCOPY / EXTENDED COPY
- NVMe Copy (cross-namespace TP4130 / Format 2h)
The fallback path uses host-side qemu-img convert when needed.
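The "offload first, host copy last" ordering can be sketched in a few lines of Python. This is a simplified illustration under my own assumptions, not the platform’s implementation: `OffloadUnavailable`, `copy_disk`, and the callables you pass in are hypothetical names, and the actual XCOPY or NVMe Copy calls are deliberately left to the caller. Only the `qemu-img convert` argument list is real CLI syntax, and it assumes raw block devices on both ends.

```python
from typing import Callable, List, Sequence, Tuple


class OffloadUnavailable(Exception):
    """Raised by an offload attempt that the device pair cannot honor."""


def qemu_img_fallback_argv(src: str, dst: str) -> List[str]:
    """Host-side copy command; bytes flow through the host in this case."""
    return ["qemu-img", "convert", "-p", "-f", "raw", "-O", "raw", src, dst]


def copy_disk(
    src: str,
    dst: str,
    offload_attempts: Sequence[Tuple[str, Callable[[str, str], None]]],
    run_host_copy: Callable[[List[str]], None],
) -> str:
    """Try each offloaded method in order, then fall back to qemu-img.

    Returns the name of the method that did the copy. Any error other
    than OffloadUnavailable propagates, so a half-finished offload is
    never silently retried through the host.
    """
    for name, attempt in offload_attempts:
        try:
            attempt(src, dst)
            return name
        except OffloadUnavailable:
            continue
    run_host_copy(qemu_img_fallback_argv(src, dst))
    return "qemu-img"
```

In real use, `run_host_copy` could be something like `lambda argv: subprocess.run(argv, check=True)`. Returning the method name is there so the restore log can say which path the bytes actually took.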

Translation:
I don’t enjoy moving terabytes through CPUs just because someone normalized inefficiency in 2014.

The Safety Check That Separates Toys from Tools
VMID reuse. Those of us who came from VMware got used to the VM MoRef being a unique key for a VM forever, with no reuse. Well, Proxmox doesn’t work that way.
Destroy VM 120.
Later create another VM 120.
Old snapshots now look valid. They are not.

So the platform tracks disk birth timing and blocks restores from snapshots that predate the current disk lifecycle, returning HTTP 409 and disabling the UI action.

What about deleted VMs that still exist in a snapshot? The platform caches each VM’s configuration and remembers which snapshots hold its data. So you can recover not only current VMs but also deleted ones, as long as the data is still there.
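The VMID reuse guard boils down to one timestamp comparison on a VM → Disk → Snapshot tree. Here is a minimal Python sketch of that idea. The class names, field names, and `check_restore` are hypothetical, and how the platform actually derives a disk’s birth time is not shown here. The deleted-VM case, where there is no current disk to compare against, is left out.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import List, Tuple


@dataclass
class Snapshot:
    name: str
    created_at: datetime


@dataclass
class Disk:
    volume: str
    born_at: datetime  # when the *current* disk with this name was created
    snapshots: List[Snapshot] = field(default_factory=list)


@dataclass
class VM:
    vmid: int
    disks: List[Disk] = field(default_factory=list)


def check_restore(disk: Disk, snapshot: Snapshot) -> Tuple[int, str]:
    """Return an (HTTP status, message) pair for a restore request.

    A snapshot taken before the current disk was born belongs to an
    earlier VM that happened to use the same VMID, so it is refused.
    """
    if snapshot.created_at < disk.born_at:
        return 409, (f"snapshot {snapshot.name} predates the current "
                     f"lifecycle of {disk.volume}")
    return 200, "ok"


if __name__ == "__main__":
    utc = timezone.utc
    old = Snapshot("nightly-01", datetime(2024, 1, 10, tzinfo=utc))
    new = Snapshot("nightly-45", datetime(2024, 3, 5, tzinfo=utc))
    disk = Disk("vm-120-disk-0", datetime(2024, 2, 1, tzinfo=utc), [old, new])
    vm = VM(120, [disk])
    for snap in vm.disks[0].snapshots:
        print(snap.name, check_restore(disk, snap))
```

The same check can feed the UI: anything that would return 409 gets a disabled restore button, so the operator never has to reason about VM 120’s history under pressure.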

What This Says About Bigger Engineering Leadership
This project is nominally about storage restores. It’s actually about something larger:
Most organizations optimize primitives and neglect workflows. They buy:
- arrays
- platforms
- licenses
- tools
- consultants
Then leave the human control plane half-built. That’s where cost, risk, and delay live. Strong technical leadership means seeing across layers:
- protocol efficiency
- system architecture
- UX for operators
- failure-state behavior
- security posture
- business recovery objectives
Not choosing one. Connecting all of them.

If I Were a CTO Evaluating Teams
I’d ask fewer questions about features.
I’d ask:
- Can we restore quickly?
- Can average operators run it?
- What mistakes are prevented automatically?
- How much tribal knowledge is required?
- What still depends on heroics?
Those answers reveal maturity faster than architecture diagrams. Don’t get me wrong. I enjoy a good diagram…
Final Thoughts
Anyone can sell backups. Operational confidence is harder. That comes from people who understand storage protocols, failure modes, operator psychology, and business consequences at the same time. That intersection is where I like to work.