BOB MA · CURRICULUM VITAE · v2026.04
AUCKLAND · NZ · OPEN TO ROLES

Dr Bob Ma 马博

Machine Learning Engineer · Computer Vision · Embedded AI · Vision–Language–Action
Email: amabo1215@gmail.com
Phone: +64 21 110 8064
Based: Auckland, NZ · Permanent Resident
§ 01

Summary

Senior Vision–Language–Action (VLA) and Multimodal AI engineer with deep experience in real-time perception, multimodal reasoning and embodied system deployment on edge devices. Proven track record designing VLA pipelines, optimising VLMs and multimodal LLMs for real-time edge inference, and integrating perception with downstream action/control — from Ambarella SoCs to large-scale commercial vehicle fleets.

§ 02

Experience

Principal Machine Learning Engineer
Mar 2024 — Present
Resideo Inc. (Honeywell Home) · NYSE:REZI
▸ Vision understanding & multimodal large-model inference on edge SoCs
  • Designed and optimised multimodal LLM inference (LLaMA3, GPT, YOLO, CLIP) on edge SoCs for real-time video understanding and mobile deployment.
  • Rewrote backbone operators for Cavalry (.cvimodel) conversion and hardware-accelerated inference on Ambarella CV25 / CV72.
  • Model compression, quantisation and knowledge distillation — SOTA inference under strict compute/memory constraints.
  • Contributed a general image/video understanding framework to the Oclea SDK; integrated pipelines with multimodal LLM capabilities.
  • End-to-end ML (train → ONNX → Cavalry → on-chip → real-time → post-processing); perception-to-action triggering of tracking, sensor and motor actuation.
  • Yocto-based embedded builds, cross-compilation, CI/CD for embedded reliability.
Stack: Multimodal LLMs · C++ · Python · Ambarella CV25/CV72 · Oclea · Yocto · Azure · YOLO · ONNX · PyTorch · CI/CD
Senior Machine Learning Engineer
Mar 2022 — Mar 2024
EROAD Ltd · ASX:ERD
▸ ADAS perception, lane-level navigation & multi-sensor vehicle intelligence
  • Real-time perception pipelines for commercial dashcams — YOLO/SSD object detection, tracking, road segmentation.
  • Lane-level navigation combining lane detection with vSLAM; lane-centre waypoint generation.
  • Camera-to-SLAM extrinsic calibration — lane detections projected into global spatial coordinates.
  • Multimodal processing (video + GPS + telemetry); road-condition and collision warnings deployed on edge devices.
  • Scalable AWS infrastructure with SageMaker; ETL pipelines for continuous training and evaluation.
Stack: C++ · Python · OpenCV · vSLAM · GPS Fusion · PyTorch · AWS SageMaker · Docker · Android
Machine Learning Engineer
Jun 2021 — Feb 2022
StayinFront Ltd
▸ Computer-vision framework for grocery / shelf analysis
  • CV framework for grocery analysis — stock forecasting, price-tag / SKU recognition from shelf & packaging images.
  • AWS S3/EC2 integration, DynamoDB warehouse workflows, Python/Flask services.
Stack: Python · Keras · TensorFlow · PySpark · DynamoDB · Flask · AWS
Machine Learning Engineer
Sep 2020 — Apr 2021
BuildingEstimates.com
▸ ML for architectural plan detection
  • Building information extraction & plan recognition — elevation, floor, roof plan analysis.
  • Framework for material analysis and quantity-survey information extraction; Python/Flask on GCP.
Stack: Python · Pandas · NumPy · Flask · GCP · Kubernetes
Data Engineer
Sep 2017 — Sep 2019
Woolworths Group · ASX:WOW · World 500
▸ Data warehouse, logistics platform, cross-border payments
  • Enterprise platform integration with SAP and third-party ERP systems, including payment interfaces.
  • Data virtualisation + .NET interfaces; secure data procedures, stock management & logistics/payment systems.
Stack: C# · Python · Docker · Oracle · SAP · MS SQL
Full Stack Developer
Jun 2014 — May 2017
New Image International Ltd
▸ E-commerce platform & ERP system
  • Order-processing & customer-management workflows; payment-gateway integration; SEO/SEM/analytics.
Stack: C#/.NET · MS SQL · HTML5 · CSS3 · React Native · Alicloud
Software Developer
Dec 2010 — Aug 2013
Huawei Symantec Technology Ltd · World 500
▸ Pattern recognition & detection systems
  • Industrial software with SPC, SOA and multithreaded Linux; radiation detection + spectrum analytics (image & signal processing).
Stack: C/C++ · Linux · OpenCV · GDB · SVN
Software Developer
Oct 2007 — Mar 2010
Topsec Network Security Co., Ltd. · 002212.SZ
▸ Intrusion-detection systems
  • Linux-based IDS and kernel-level packet analysis; DDoS detection and defence.
Stack: C/C++ · Linux · GDB · CMake · Valgrind
§ 03

Technical Skills

VLA / Multimodal
Vision–Language–Action · VLM encoders · Multimodal LLM deployment · Embodied AI · Real-time decision pipelines
Perception & Vision
YOLO · CLIP · Object Detection · DeepSORT · Face Recognition · Scene Understanding · Sensor Fusion
Robotics & Spatial
Visual SLAM · Localisation · Spatial Mapping · Trajectory Understanding · Extrinsic Calibration
Edge AI & Systems
ONNX · TensorRT · Quantisation · Knowledge Distillation · Ambarella CV25/CV72 · Oclea SDK · Yocto
Programming
Python (10y) · C/C++ (8y) · C#/.NET (3y) · SQL
Infrastructure
Docker · CI/CD · REST APIs · AWS · Azure · GCP
§ 04

Education

Jan 2019 — Feb 2023
Doctor of Philosophy (PhD)
Auckland University of Technology · New Zealand
Research: Vision-driven Semantic SLAM for Scene Understanding and Autonomous Systems — visual perception, semantic mapping and robust localisation in dynamic environments.
Sep 2010 — May 2014
Doctoral Research / Prior Research Study
Northwestern Polytechnical University · China
Topic: Learning-augmented robust control for complex dynamic systems.
Sep 2007 — Jul 2010
Master of Science — Computer Software & Theory
Sichuan Normal University · China
Sep 2003 — Jul 2007
Bachelor of Engineering — Data Science & Electronic Commerce
Sichuan Normal University · China
§ 05

Selected Publications

2026 · arXiv · Bodhi VLM: Privacy-Alignment Modeling for Hierarchical Visual Representations in VLM Encoders. arXiv:2603.13728.
2026 · arXiv · CT-to-X-ray Distillation Under Tiny Paired Cohorts: Evidence-Bounded Reproducible Pilot. arXiv:2603.29167.
2026 · arXiv · PPEDCRF: Privacy-Preserving Enhanced Dynamic CRF for Sequence Videos. arXiv:2603.01593.
2026 · arXiv · TSDCRF: Balancing Privacy & Multi-Object Tracking via Time-Series CRF. arXiv:2603.13667.
2026 · arXiv · REAEDP: Entropy-Calibrated Differentially Private Data Release. arXiv:2603.13709.
2024 · MTAP · Privacy-preserving word-embedding text classification based on DBN-constructed privacy boundary. Multimedia Tools & Applications, 83(10): 30181–30206.
2024 · ICN · JudPriNet: Video transition detection via semantic relationship and Monte Carlo sampling. Intelligent and Converged Networks, 5(2): 134–146.
2021 · GLOBECOM · PPDTSA: Privacy-preserving deep transformation self-attention framework for object detection.
§ 06

Patents · Academic Service

PAT · 01
Telecom telephone-fraud prevention via voice-based semantic content analysis — CNIPA
PAT · 02
Access safety detection & isolation for virtualised users — CNIPA
PAT · 03
Multichannel network intrusion detection & defence — CNIPA
SW · ×3
Three software copyrights granted by the Copyright Protection Center of China
REVIEW
Reviewer — AAAI · VLDB · IEEE TCSVT · TETCI · Systems Journal · Access