Inference shadow fleet
Side-by-side PyTorch vs ONNXRuntime serving with traffic mirroring; automated diff on logits and business KPIs.
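The automated logit diff described above can be sketched as a minimal per-request comparison, assuming logits arrive as plain Python lists from each backend (the function name and tolerance are illustrative, not from the project):

```python
def diff_logits(ref, cand, atol=1e-3):
    """Compare reference (PyTorch) logits against candidate (ONNXRuntime)
    logits for one mirrored request."""
    max_abs = max(abs(a - b) for a, b in zip(ref, cand))
    # Top-1 agreement is the business-facing check: does the candidate
    # backend pick the same item the reference backend would serve?
    top1_match = ref.index(max(ref)) == cand.index(max(cand))
    return {
        "max_abs_diff": max_abs,
        "top1_match": top1_match,
        "within_tol": max_abs <= atol,
    }
```

In a real shadow fleet, these per-request results would feed an aggregator that alerts on tolerance breaches or top-1 disagreement rates alongside the business-KPI diffs.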
AI talent profile
PyTorch Production Engineer — serving & compilers
Profile active · Apr 5
On the marketplace since Mar 15, 2026
I specialize in taking PyTorch models from training graphs to low-latency inference: torch.compile, ONNX/TensorRT paths, and GPU memory tuning for recommender and ranking towers.
Why proof-first?
Ganloss profiles highlight real projects and tools—not buzzwords—so you can evaluate AI talent faster than a generic CV.

Principal Software Engineer, ML Serving
Gulf Meridian Commerce · 2019 — Present
torch.compile and custom CUDA kernel audits cut p99 latency by roughly 40% on a ranking tower; shadow traffic against ONNXRuntime with automated logit and KPI diffs. Canary rollouts for quantized student models beside their full-precision teachers, backed by rollback playbooks and SLO dashboards. GPU memory tuning for large embedding tables and batching strategies for peak retail traffic.
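The canary mechanics above can be sketched as a deterministic, hash-based traffic split (a minimal illustration; bucket names and the percentage are hypothetical):

```python
import hashlib

def route(request_id: str, canary_pct: float) -> str:
    """Deterministically send a fraction of traffic to the quantized
    student ('canary') and the rest to the full-precision teacher
    ('stable'). The same request id always lands in the same bucket,
    which keeps KPI comparisons and rollbacks clean."""
    digest = hashlib.sha256(request_id.encode()).digest()
    frac = int.from_bytes(digest[:8], "big") / 2**64  # uniform in [0, 1)
    return "canary" if frac < canary_pct else "stable"
```

With this scheme, rolling back is just setting `canary_pct` to 0; there is no per-request state to unwind.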
Skills listed: 3
Projects: 1
With links (shareable proof): 0
Use cases: 2
Experience rows: 1
Bio: 24 words · 178 characters
Skill depth: 3 expert