BM · 2026 · AUCKLAND
VLA / MULTIMODAL AI//DECK 01 — INTRO
LAT −36.848° · LON 174.762°
BOB MA · 马博

Perception, on the edge.

PhD, Vision-driven Semantic SLAM
Principal ML Engineer — Resideo / Honeywell Home
Vision · Language · Action — shipped on devices
SUBJECT — BOB MA
02 / THE ARC
KERNEL → EMBODIMENT//~16 YEARS
TRAJECTORY — SINGLE ENGINEER, MANY LAYERS
02/10
TRAJECTORY

From kernel drivers to embodied AI.

I've spent 16 years climbing the stack — packet filters, industrial vision, cloud data, SLAM, now VLMs on SoCs. Every layer informs the next.
2007
Kernel & Network Security
Topsec
2010
Pattern Recognition
Huawei Symantec
2017
Data Engineering
Woolworths · ASX:WOW
2019
PhD · Semantic SLAM
AUT, New Zealand
2022
ADAS · Vehicle AI
EROAD · ASX:ERD
2024 —
Principal ML Engineer
Resideo / Honeywell · NYSE:REZI
03 / NOW — RESIDEO / HONEYWELL HOME
MAR 2024 — PRESENT//NYSE: REZI
PRINCIPAL ML ENGINEER · EDGE / MULTIMODAL
03/10
CURRENT ROLE

Multimodal LLMs,
on edge SoCs.

TARGET · Ambarella CV25 / CV72 · Oclea
RUNTIME · ONNX → TensorRT → .cvimodel
LLaMA3 · Multimodal, on-device
CV25 / CV72 · Ambarella edge SoCs
INT8 · Quantised, distilled
END-TO-END PIPELINE
Train → ONNX → Clavary → On-chip → Real-time Action
  • Designed multimodal inference (LLaMA3, CLIP, YOLO) for real-time video understanding on Ambarella SoCs.
  • Rewrote backbone operators for Clavary (.cvimodel) conversion and hardware acceleration.
  • Perception-to-action: triggers tracking, sensor & motor control from visual scene interpretation.
  • Contributed a general image/video understanding framework to the Oclea SDK.
04 / EROAD · COMMERCIAL FLEETS
MAR 2022 — MAR 2024//ASX: ERD
ADAS + LANE-LEVEL NAV + VSLAM
04/10
PRIOR · SENIOR ML ENGINEER

Lane-level perception
for commercial fleets.

YOLO · SSD · vSLAM · Camera↔SLAM calib · GPS fusion · AWS SageMaker · C++ / PyTorch
▸ LANE DETECTOR · PROJECTED INTO SLAM FRAME
▸ WAYPOINTS · LANE CENTER
▸ OBJECT DETECTION · YOLO · e.g. CAR 0.96
FPS 32.4 · POSE ±0.11 m · IoU 0.83
05 / PhD · AUT · 2019 — 2023
VISION-DRIVEN SEMANTIC SLAM//AUCKLAND
DOCTORAL RESEARCH
05/10
DOCTORATE

Robust localisation
in dynamic scenes.

Vision-driven Semantic SLAM for scene understanding and autonomous systems — mapping that survives motion, occlusion and scale change.

DEGREE · Doctor of Philosophy
INSTITUTION · Auckland University of Technology
THESIS · Semantic SLAM + Perception
PRIOR · NPU, China · Robust control
KF_00 · KF_N · LOOP CLOSED · dyn: person · static: road
06 / RECENT RESEARCH
VLM · PRIVACY · DISTILLATION//2024 — 2026
SELECTED PUBLICATIONS
06/10
RESEARCH

Privacy-aligned vision,
under real constraints.

REVIEWER · AAAI · VLDB · IEEE TCSVT · TETCI · Access
2026 · ARXIV
Bodhi VLM: Privacy-Alignment Modeling for Hierarchical Visual Representations in VLM Encoders
Bottom-up + top-down feature search
2026 · ARXIV
CT-to-X-ray Distillation Under Tiny Paired Cohorts
Evidence-bounded reproducible pilot
2026 · ARXIV
PPEDCRF — Privacy-Preserving Enhanced Dynamic CRF for Video Sequences
Minimal detection degradation
2026 · ARXIV
TSDCRF — Balancing Privacy & Multi-Object Tracking via Time-Series CRF
Normalized control penalty
2026 · ARXIV
REAEDP — Entropy-Calibrated Differentially Private Data Release
Formal guarantees + attack-based eval
2024 · MTAP
Privacy-preserving word-embedding text classification via DBN-constructed privacy boundary
Multimedia Tools and Applications, 83(10)
07 / TECHNICAL RANGE
FULL STACK · SILICON TO CLOUD//
YEARS OF HANDS-ON, NOT BUZZWORDS
07/10
RANGE

From silicon
to cloud orchestration.

VLA · Multimodal
  • VLA pipelines
  • VLM encoders
  • Multimodal LLM deploy
  • Embodied AI
  • Perception→action
Perception · Vision
  • YOLO / SSD 8y
  • CLIP
  • DeepSORT
  • vSLAM 6y
  • Sensor fusion
Edge · Systems
  • ONNX / TensorRT
  • Quantization · Distillation
  • Ambarella CV25 / CV72
  • Oclea SDK
  • Yocto Linux
Languages · Cloud
  • Python 10y
  • C / C++ 8y
  • C# / .NET 3y
  • AWS · Azure · GCP
  • Docker · CI/CD
08 / PATENTS & SERVICE
IP · PEER REVIEW//
RECOGNISED CONTRIBUTIONS
08/10
PATENTS · CN

Three patents
granted.

  • PAT · 01: Telecom fraud prevention via voice-based semantic content analysis.
  • PAT · 02: Access safety detection & isolation for virtualized users.
  • PAT · 03: Multichannel network intrusion detection & defence.
  • SW · ×3: Three software copyrights granted by the Copyright Protection Center of China.
ACADEMIC SERVICE

Reviewer for
top venues.

  • ▸ AAAI: Association for the Advancement of Artificial Intelligence
  • ▸ VLDB: Very Large Data Bases
  • ▸ IEEE: TCSVT · TETCI · Systems Journal · Access
09 / WHAT I'M LOOKING FOR
NEXT ROLE//AUCKLAND · REMOTE · RELOCATION
ROLE PROFILE · SIGNAL OUT
09/10
NEXT

Embodied AI teams shipping
real systems, not demos.

▸ 01 · PROBLEM
VLA & embodied perception
Robotics, XR, wearables, automotive — anywhere a model must perceive, reason and act on real hardware.
▸ 02 · SURFACE
On-device & edge-first
Compute-bound environments where quantization, distillation and systems thinking decide whether a model ships.
▸ 03 · TEAM
Principal / tech lead IC
Small, senior, product-aligned. Comfortable owning a pipeline end-to-end — research → SDK → silicon.
10 / CONTACT — END OF DECK
BOB MA · AUCKLAND//THANK YOU
SIGNAL ACQUIRED · AWAITING RESPONSE
10/10
LET'S TALK

If this
sounds like you.

  • ▸ EMAIL: amabo1215@gmail.com
  • ▸ PHONE: +64 21 110 8064
  • ▸ BASED: Auckland, New Zealand · Permanent Resident
  • ▸ STATUS: Open to principal / tech-lead roles · VLA / embodied AI