Hi, I'm Javier Huang

I am a dedicated and experienced Software, Data, and AI/ML Engineer
Scroll down to see my work!

About Me

I am an undergraduate student in the University of Toronto Computer Engineering program, specializing in Software Engineering, Data Science, and Artificial Intelligence. I have over five years of hands-on experience in building industry applications at the intersection of software engineering and AI/ML, with strong expertise in computer science and engineering principles. My work focuses on building systems that are scalable, reliable, and intelligent to solve complex problems across diverse domains and deliver measurable impact. I thrive in both team settings and independent projects, continuously learning new technologies and building robust, effective solutions.

Photo of me at Scotia Plaza as an intern at Scotiabank

Experience

Work and professional experiences I've had throughout my career.

9 experiences

Featured Experience

Platform Engineer Junior

Scotiabank | Internship | May 2026 – Present

Apache Kafka, Confluent Cloud, Apache Flink SQL, Power BI, Backstage, Next.js, React, Node.js, Azure OpenAI, PostgreSQL, GCP, Google Cloud SQL, Google Cloud Storage (GCS), Terraform, GitHub Actions, GitHub, Bitbucket, Jira

Data Platform Engineer Intern on the Data Platforms Event-Driven Services (EDS) team, building event-driven data platform solutions to streamline self-service, improve developer experience, and expand data access through the Event Exchange platform.
Engineered Catalog 2.0, an automated reporting platform leveraging Power BI, PostgreSQL, distributed platform APIs, and GCP services (Cloud SQL, Cloud Storage) to consolidate Kafka ecosystem metadata, ownership information, usage analytics, and operational metrics into centralized dashboards, reducing executive reporting effort by 60× from 1 hour to 1 minute.
Developed reusable Backstage plugins using React, Next.js, and Node.js to integrate Event Exchange self-serve portal capabilities into the Scotia Developer Portal (SDP), establishing standardized platform extension patterns; automated plugin publishing workflows with GitHub Actions CI/CD and Terraform repository templates, improving developer access to internal data products.
Designed and developed an AI-powered real-time financial crime detection PoC using Apache Kafka, Confluent Cloud, Apache Flink SQL, and Azure OpenAI, covering low-latency stream processing, anomaly detection, LLM-powered alert enrichment, and AI-assisted case triage. Implemented data governance controls and optimized the data pipelines for performance, consistency, and reliability to support real-time risk detection. Proposed the architecture as a foundation for future bank-wide real-time AI applications and presented the solution to CTO, EVP, SVP, and VP-level leaders as well as engineering teams across Scotiabank.

Team Lead

Scotiabank × IMI BIGDataAIHUB | IMI Big Data & AI Competition 2025-2026 | December 2025 – April 2026

Python, Pandas, NumPy, scikit-learn, imbalanced-learn, SHAP, LIME, Next.js, React, Jupyter Notebook, Anaconda, GitHub

Led Team 33 to develop an interpretable AI Anti-Money Laundering (AML) platform combining machine learning (ML), explainable AI, and regulatory intelligence, and won 1st Place with a $15,000 prize among 430 competitors across 90+ teams, including PhD, Masters, and Undergraduate students.
Built and optimized anomaly detection pipelines using Python, Pandas, NumPy, scikit-learn, and imbalanced-learn; extended a 2-model framework with 14 transaction-specific models across 7 transaction types for 61,000+ individual and small business transactions, capturing behavioural patterns and identifying 6.5× more AML critical transactions on largely unlabeled datasets.
Built explainable AML decision workflows combining SHAP, LIME, and LLM-generated explanations with regulatory knowledge sources including FINTRAC, FINCEN, and FLSC, improving transparency, auditability, and investigator confidence.
Developed a full-stack investigation platform using Next.js and React to integrate anomaly detection results, AI-generated case explanations, and a knowledge library of AML red flags and suspicious activity patterns into a unified system.

Project Manager & Architect

Agentiiv | Multi-Agent AI Platform Startup | December 2025 – March 2026

MCP, FastAPI, Next.js, React, PostgreSQL, AWS, Slack, Google Workspace, Prometheus, Grafana, Docker, Jira, GitHub

Architected and delivered MCP Gateway, a production-grade orchestration layer enabling secure AI agent-to-MCP server communication through an industry collaboration with UTMIST, defining system architecture, technical requirements, and Agile delivery milestones using Jira.
Designed and developed a containerized AWS-hosted gateway using FastAPI and Docker, integrating 5 MCP servers (134 tools) across PostgreSQL, Slack, and Google Workspace; implemented JWT SSO authentication, RBAC, centralized PostgreSQL logging, and rate limiting.
Built platform observability infrastructure using Prometheus, Grafana, and a custom React dashboard to monitor request traffic, server utilization, system health, and failures, enabling reliable multi-agent workflow execution through centralized tool access.

Software & Machine Learning Engineer

BuildingAssets | AI Energy Auditing Startup | October 2025 – March 2026

Python, FastAPI, OpenRouter API, Next.js, React, Flutter, AWS EC2, GitHub

Developed AuditMate, an AI-powered web and mobile platform for automating building energy audits at BuildingAssets. Performed data analysis and image preprocessing, and built backend APIs using FastAPI to support audit workflows.
Integrated OpenRouter API with Google Gemini agents for computer vision-based fixture identification, manual retrieval, and energy improvement recommendations.
Built Next.js and Flutter frontends, and deployed on AWS EC2, enabling both professional auditors and self-serve clients through automated and guided audit experiences.

Technology Director

GenAI Genesis | October 2025 – March 2026

Next.js, React, Supabase, PostgreSQL, REST APIs, Zod, GitHub Actions, Vercel, Jest, Figma, GitHub

Organizer of GenAI Genesis 2026, Canada's largest AI hackathon, leading technology development for participant and judging platforms supporting 2,000+ applicants, 800+ hackers (30% YoY growth), 250+ projects, and 90+ judges; built and maintained platforms that facilitated 10,000+ interactions throughout the event.
Engineered scalable full-stack infrastructure using Next.js, React, Supabase, and PostgreSQL, implementing secure REST APIs, Zod validation, database schemas, and role-based access control to support participant workflows and judging operations.
Built and deployed automated CI/CD delivery pipelines using GitHub Actions and Vercel, improving release reliability through continuous integration and streamlined deployment workflows while achieving zero-downtime operations throughout the hackathon.

Projects

Explore my projects across three categories: AI & Machine Learning, Business & Education, and Online Web Games.

22 projects

View most of my projects on GitHub

AI & Machine Learning

Projects focused on artificial intelligence, machine learning, and data science applications.

14 projects

Business & Education

Applications designed for business management, educational tools, and collaboration platforms.

5 projects

Online Web Games

Interactive games and simulations.

3 projects

Featured Projects

Canopy: AI Course Generation & Mastery Platform

OpenAI Build Week Hackathon

July 2026

GitHub

GPT-5.6, Codex, Python, TypeScript, Next.js, React, FastAPI, Pydantic, Supabase, PostgreSQL, pgvector, pgmq, Docker, OpenAI, Azure OpenAI, AWS Bedrock, Material UI, Monaco Editor, pytest, AddressSanitizer, PyTorch, scikit-learn, Node.js, Go, C++

Canopy Devpost

Canopy turns a source (a research paper, documentation, or personal notes) plus a stated learning goal into a complete course of source-grounded lessons, worked examples, quizzes, and coding labs. It tracks per-concept mastery as separate Understand (quiz) and Apply (lab) estimates via Bayesian Knowledge Tracing, with prerequisite-aware review for weak concepts.
Coding labs run in a Monaco-based editor and are verified before reaching learners through a bounded generate, execute, repair, and re-verify loop: reference solutions run against hidden tests in fresh, network-disabled, non-root sandbox containers, with real tracebacks fed back into regeneration.
Built as a five-service platform (Next.js web app, FastAPI service, ingestion/generation worker, LLM gateway, and sandbox runner) backed by Supabase for auth, storage, Postgres, and vector search. GPT-5.6 is routed by stakes: flagship Sol for planning and lab repair, faster Luna for high-volume authoring, grading, and the in-lesson helper.
Built by Javier Huang and Ethan Qiu for the OpenAI Build Week Hackathon; the demo course was generated end to end from Vaswani et al.'s "Attention Is All You Need" (arXiv:1706.03762).
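
As a rough illustration of the Bayesian Knowledge Tracing update mentioned above, the sketch below applies the standard BKT posterior and learning-transition step to one concept. The names `BKTParams` and `bkt_update` and all parameter values are illustrative assumptions, not taken from Canopy's code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BKTParams:
    """Standard BKT parameters; the default values are assumptions."""
    p_learn: float = 0.15  # P(T): chance of learning the concept on an attempt
    p_slip: float = 0.10   # P(S): chance of a wrong answer despite knowing it
    p_guess: float = 0.20  # P(G): chance of a right answer without knowing it


def bkt_update(p_known: float, correct: bool, params: BKTParams = BKTParams()) -> float:
    """Return the mastery estimate after one observed attempt."""
    if correct:
        evidence = p_known * (1 - params.p_slip)
        total = evidence + (1 - p_known) * params.p_guess
    else:
        evidence = p_known * params.p_slip
        total = evidence + (1 - p_known) * (1 - params.p_guess)
    posterior = evidence / total
    # Allow for the learner having acquired the concept during this attempt.
    return posterior + (1 - posterior) * params.p_learn


# Two separate estimates for one concept, mirroring the Understand/Apply split.
mastery = {"understand": 0.3, "apply": 0.3}
mastery["understand"] = bkt_update(mastery["understand"], correct=True)   # quiz answer
mastery["apply"] = bkt_update(mastery["apply"], correct=False)            # lab attempt
print(mastery)
```

A correct quiz answer raises the Understand estimate while a failed lab lowers the Apply estimate, so the two can diverge for the same concept.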

Quant Sandbox

Local-First Quantitative Trading Research Platform

July 2026

Python, FastAPI, SQLAlchemy, SQLite, Pandas, pandas-ta, WebSockets, scikit-learn, PyTorch, SHAP, Hydra, MLflow, OANDA API, Alpaca API, Interactive Brokers API, Next.js, React, TypeScript, Material UI, TanStack Query, GitHub

Designed and built Quant Sandbox, a local-first quantitative research platform for developing, backtesting, and paper-trading systematic strategies (rule-based and ML-driven alpha models) with full decision-level attribution, utilizing a Python/FastAPI engine layer paired with a Next.js/TypeScript research dashboard.
Architected a shared Event schema (OHLC, indicator state, buy/sell/hold/close signal, and a mandatory human-readable trade rationale, e.g. "RSI(14) 28.4→31.2, crossed above 30 → oversold bounce") so vectorized backtests and live paper-trading sessions emit identical event traces, replayed through one candle chart, indicator pane, decision log, and equity curve for both historical and real-time analysis.
Built a pandas-vectorized backtesting engine with configurable transaction cost modeling (spread, slippage, commission), persisted per-run event traces and P&L/drawdown stats broken down by month/quarter, and a plugin strategy framework covering trend-following (SMA/EMA crossover), mean-reversion (RSI), and momentum (Bollinger Band breakout) signal logic.
Implemented two capital allocation schemes for multi-strategy/multi-asset portfolio construction: dynamic leader-takes-capital rotation by trailing performance, and conviction-weighted capital splitting across simultaneously traded instruments. Added versioned, snapshot-safe trading system configs reusable across backtests, portfolios, and live runs.
Wired live paper-trading execution to OANDA, Alpaca, and Interactive Brokers (plus a synthetic broker for credential-free testing), enforcing bar-by-bar strategy evaluation identical to backtest semantics, with real-time candle aggregation from streamed tick data for sub-minute granularities.
Built an ML alpha-research pipeline (Hydra-configured, MLflow-tracked) spanning signal generation, regime detection, volatility forecasting, meta-labeling/bet sizing (López de Prado-style), and SHAP-based model diagnostics, with single-run, hyperparameter grid search, and rolling walk-forward validation launchable from the dashboard.
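
The shared event record with a mandatory rationale, described above, might look roughly like the following. This is a minimal sketch: the `Event` field names and the `rsi_signal` helper are hypothetical, and only the rationale format follows the example quoted in the description.

```python
from dataclasses import dataclass
from typing import Dict, Tuple

VALID_SIGNALS = {"buy", "sell", "hold", "close"}


@dataclass(frozen=True)
class Event:
    """One decision record; field names here are hypothetical."""
    timestamp: str
    open: float
    high: float
    low: float
    close: float
    indicators: Dict[str, float]
    signal: str
    rationale: str

    def __post_init__(self):
        if self.signal not in VALID_SIGNALS:
            raise ValueError(f"unknown signal: {self.signal}")
        if not self.rationale.strip():
            raise ValueError("every event needs a human-readable rationale")


def rsi_signal(prev_rsi: float, rsi: float, threshold: float = 30.0) -> Tuple[str, str]:
    """Map an RSI(14) move to a signal and a rationale string."""
    move = f"RSI(14) {prev_rsi:.1f}→{rsi:.1f}"
    if prev_rsi < threshold <= rsi:
        return "buy", f"{move}, crossed above {threshold:g} → oversold bounce"
    return "hold", f"{move}, no cross of {threshold:g}"


signal, rationale = rsi_signal(28.4, 31.2)
event = Event(
    timestamp="2026-07-01T09:30:00Z",  # made-up sample bar
    open=1.0840, high=1.0852, low=1.0835, close=1.0849,
    indicators={"rsi_14": 31.2},
    signal=signal,
    rationale=rationale,
)
print(event.signal, "|", event.rationale)
```

Because a backtest and a live session would both construct the same record type, one set of chart and decision-log views can replay either source.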

Eyas: AI Security Camera Agent

2026 Hugging Face Build Small Hackathon, Best Agent Award with a $1,000 Prize

June 2026

GitHub

Python, YOLO11n, MiniCPM-V, Nemotron, llama-cpp-python, Gradio, React, TypeScript, Vite, MUI, Recharts, Framer Motion, Computer Vision

Build Small Hackathon Winners | Eyas HuggingFace Article | Social Media Post

Won the Best Agent Award with a $1,000 prize out of 946 apps created by hackers worldwide. Eyas is an offline CCTV intelligence agent that turns raw security footage into a structured event log using a chain of small, locally-running models with no cloud APIs, built to help convenience store owners detect shoplifting in real time instead of reviewing footage after the fact.
Runs a four-stage pipeline entirely on CPU: YOLO11n detects and tracks people frame-by-frame; MiniCPM-V 4.6 (1.3B VLM) analyzes sub-sampled crops per tracked person and returns structured JSON observations; a heuristic event structurer converts observations into typed, zone-tagged events with timestamps using configurable evidence buffers; Nemotron 3 Nano 4B (Q4_K_M GGUF via llama-cpp-python) reasons over the structured event log to produce summaries, risk assessments, and natural-language Q&A with grammar-constrained JSON output. TinyAya handles Korean translation and VoxCPM2 generates audio briefings on CUDA-capable machines.
Built a custom React SPA (Vite, MUI, Recharts, Framer Motion) served as static files through Gradio Blocks, replacing the default Gradio UI with a multi-camera review interface featuring resizable split panels, annotated video playback, a scatter-chart event timeline, and live pipeline progress tracking.
Social media video filmed at Joy Convenience Store using mock camera angles. Demo footage sourced from publicly available CCTV clips, renamed and run through the full pipeline end-to-end.

Mycelium: Autonomous GitLab Knowledge Continuity Agent

Google Cloud Rapid Agent Hackathon

May 2026 – June 2026

GitHub

Python, FastAPI, Google Cloud, Vertex AI, Gemini, Google ADK, GitLab MCP, MongoDB, MongoDB MCP, MCP, Docker, Cloud Run, Next.js, React, TypeScript

Mycelium Devpost

Mycelium is a fully autonomous GitLab agent that prevents engineering knowledge loss by continuously modeling ownership, expertise, and structural fragility across a codebase and acting directly inside GitLab to stabilize it.
Infers real ownership from commits, code reviews, module interaction patterns, and temporal signals. Identifies concentrated expertise risk, orphaned subsystems, and hidden fragility from evolving codebases and forks. When issues are detected, the agent autonomously creates targeted GitLab issues, generates onboarding context packs for new contributors, and produces handoff artifacts from historical activity when engineers go inactive.
Runs a fully autonomous Observe → Model → Analyze → Decide → Act → Reflect → Persist → Summary pipeline. Webhook-driven triggers (push, merge request, issue, and member events) activate the pipeline in real time. A CLI enables direct operation with live streaming, on-demand onboarding and offboarding generation, knowledge graph inspection, and event replay. A Next.js monitoring dashboard provides full visibility into live agent traces, per-stage reasoning, knowledge graph topology, bus factor analytics, and a complete action history log.
Built on Google Cloud Agent Builder using Gemini via Vertex AI as the reasoning model and ADK as the agent runtime on Vertex AI Agent Engine. Execution layer uses the official GitLab MCP server alongside a custom GitLab MCP server for extended continuity workflows, and MongoDB MCP for persistent graph queries. Backend is a FastAPI server deployed on Cloud Run via Docker, handling webhook ingestion, pipeline orchestration, REST APIs, and SSE streams for real-time UI updates.

Sentinel AML: Real-Time AML Intelligence Platform

2026 Confluent Data Streaming World Tour AI Day Toronto Hackathon, First Place Winner, Most Impactful AI App Award (MacBook Pro Prize)

May 2026

Confluent Cloud, Flink SQL, Azure OpenAI, Flink Streaming AI Agents, Kafka, Python, Flask, Next.js, Data Engineering

Winner Announcement

First place winner at the 2026 Confluent Data Streaming World Tour AI Day Toronto Hackathon, with the Most Impactful AI App Award (MacBook Pro Prize). Built Sentinel AML, a real-time anti-money laundering (AML) intelligence platform that continuously analyzes financial transaction streams to detect suspicious activity as it occurs.
Designed for compliance teams, financial crime investigators, and risk operations units in banks and fintechs, Sentinel AML replaces batch-based compliance with a continuous, autonomous, and auditable decision pipeline.
Detection runs in Confluent Flink SQL using tumbling-window aggregations and statistical anomaly detection to surface behaviors such as structuring, velocity spikes, and cross-border anomalies within minutes.
Alerts are enriched via Azure OpenAI (gpt-5-mini) using Confluent's ML_PREDICT function to produce concise analyst-ready explanations inside Flink, and routed to a streaming AI agent that triages cases to escalate, review, or dismiss with recommended Suspicious Activity Report (SAR) decisions.
The system substantially reduces detection latency and manual investigation workload while improving response speed, operational efficiency, and auditability, and it lowers false positives compared to batch workflows.
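
To make the tumbling-window idea above concrete, here is a plain-Python sketch of the same aggregation pattern. It is not the Flink SQL used in the project: the 5-minute window, the count limit, the function names, and the sample transactions are all assumptions chosen for illustration.

```python
from collections import defaultdict

WINDOW_SECONDS = 300  # assumed 5-minute tumbling window


def tumbling_aggregate(transactions, window_seconds=WINDOW_SECONDS):
    """Group (account, epoch_seconds, amount) tuples into fixed, non-overlapping windows."""
    totals = defaultdict(lambda: {"count": 0, "amount": 0.0})
    for account, ts, amount in transactions:
        # Every timestamp belongs to exactly one window, identified by its start time.
        window_start = (ts // window_seconds) * window_seconds
        bucket = totals[(account, window_start)]
        bucket["count"] += 1
        bucket["amount"] += amount
    return dict(totals)


def velocity_spikes(aggregates, max_count=3):
    """Return (account, window_start) keys whose count exceeds an assumed limit."""
    return sorted(key for key, agg in aggregates.items() if agg["count"] > max_count)


sample = [
    ("acct-1", 0, 900.0), ("acct-1", 60, 950.0), ("acct-1", 120, 980.0),
    ("acct-1", 200, 990.0), ("acct-1", 310, 50.0), ("acct-2", 30, 20.0),
]
windows = tumbling_aggregate(sample)
flagged = velocity_spikes(windows)
print(flagged)
```

Four transfers from one account inside a single window trip the velocity rule, while the fifth falls into the next window and is counted separately.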

IMI Big Data & AI Competition 2025-2026

1st Place Winner with a $15,000 prize - Hosted by Scotiabank × IMI BIGDataAIHUB

December 2025 – April 2026

Python, Pandas, NumPy, scikit-learn, imbalanced-learn, SHAP, LIME, Next.js, React, Jupyter Notebook, Anaconda, GitHub

IMI Big Data & AI Competition 2025-2026 Website | Winning Team Announcement

Team lead of team 33. Led the team to win 1st place with a $15,000 prize among 430 competitors across 90+ teams, including PhD, Masters, and Undergraduate students, in the IMI Big Data & AI competition hosted by the Institute for Management & Innovation (IMI) UTM BIGDataAIHUB in partnership with Scotiabank.
Engineered a data-intensive, interpretable AI Anti-Money Laundering (AML) system, with data analysis pipelines using 16 machine learning models across individual and small business entities, trained on customer-level features aggregated over 7 transaction types plus an all-types view in largely unlabelled financial data, capturing overall and transaction-specific anomalies.
Designed and developed a source-compliant AML knowledge library (FINTRAC, FINCEN, FLSC, etc.) mapping regulatory red flags to data features via AML patterns, enabling SHAP-generated, LLM-enhanced explanations that are human-readable, distinguish fraud vs. AML, and preserve auditability and factual consistency for compliance use.
Built a full-stack Next.js web application integrating detection models, explainability, and the AML knowledge library into a unified investigator-facing workflow platform.

Clover: Drone Propulsion System Health Diagnostics

U of T ESC102 Praxis II Course Project

January 2026 – April 2026

Python, PyTorch, Supabase, Flutter, Next.js, React, NumPy, Pandas, Matplotlib, Jupyter Notebook, Git, GitHub

ESC102 Praxis II Course | Clover Poster PDF | Clover One-Pager PDF

Built Clover, a portable AI-powered pre-flight diagnostic system for D. Vision Aerials' FPV drones. Clover replaces manual motor spin checks with automated go/no-go assessments, improving consistency in pre-flight decisions and reducing operational risk in dense urban environments.
The system evaluates drone health by combining acoustic motor analysis with onboard flight controller blackbox telemetry, comparing expected motor behavior against observed performance to detect and localize faults prior to flight.
Clover runs on an edge platform (Raspberry Pi), producing real-time, time-stamped diagnostic outputs with full audit logs, and syncing results to a Supabase cloud database. Data is accessible remotely through a Flutter mobile app and a Next.js web dashboard for monitoring and review.
For acoustic fault detection, it uses a multitask 1D CNN-ResNet architecture based on sound classification research. The deep learning model identifies fault types (motor, propeller) and infers flight direction. Audio inputs are standardized via preprocessing and domain normalization so that recordings from different drone units are mapped into the same feature distribution as the training dataset, ensuring compatibility and stable anomaly detection.
The model was trained for 100 epochs on 324k audio samples, reaching 97.4% accuracy and F1 score on the reference dataset. In field testing, Clover achieved 81% accuracy/F1 on acoustic data and 87% on blackbox telemetry-based diagnostics.

Research

Research spanning AI/ML, engineering, math, and science.

9 research items

AI & ML Research

3 entries

Science Research

6 entries

Featured Research

SceneClarity: A Unified Framework for Scene Reliability Estimation and Classification in Autonomous Vehicle Perception

UTMIST Machine Learning Project | ML Research Project | August 2025 – April 2026

PyTorch, TensorFlow, scikit-learn, Pandas, NumPy, Docker, REST APIs, Next.js, React, GitHub, VSCode, Jupyter Notebook, Google Colab, Jira

UTMIST SceneClarity Presentation | SceneClarity Research Paper Abstract

Led the development of the SceneClarity machine learning project, a modular framework for estimating scene-level reliability in autonomous vehicle perception, addressing degradation under adverse conditions such as fog, rain, snow, and glare where failures often co-occur and are difficult to diagnose at the system level.
The architecture separates perception, environmental inference, and aggregation modules through a fixed interface, allowing components to be replaced without redesigning the aggregation logic.
Introduces a framework that aggregates perception outputs and environmental signals into a global reliability score with attribution to likely degradation factors, representing reliability as a decomposition over semantically interpretable scene-level components, unlike per-prediction uncertainty methods.
Implemented as a real-time system producing structured outputs and visualizations to support failure analysis, safety monitoring, and debugging.

Vision Transformer (ViT-B/16) Architecture Implementation

Independent Research | December 2025 – January 2026

Python, PyTorch, Torchvision, Torchinfo, NumPy, Matplotlib, PIL, Kagglehub, Jupyter Notebook, Google Colab, Git, GitHub

GitHub Repository with Code and Documentation

Implemented the Vision Transformer (ViT-B/16) architecture from scratch in PyTorch, following the paper "An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale." Manually built all core components, including convolutional patch embeddings, class and positional embeddings, Multi-Head Self-Attention (MSA) and MLP blocks with Layer Normalization (LN) and residual connections, as well as the final classification head.
Used the equations and architectural definitions from the original paper to reason about data flow and tensor transformations throughout the model, explicitly tracking tensor shapes step-by-step from input images to output classification in order to ensure correctness and deepen understanding of the model structure.
Validated the implementation end-to-end by training the model from scratch on a 5-class weather image classification dataset sourced from Kaggle. Documented training simplifications relative to the paper and compared the custom implementation with PyTorch's built-in ViT.
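
The step-by-step shape tracking described above can be sketched with plain arithmetic, without building a model. The helper name `vit_b16_shapes` is hypothetical, and the 224×224 RGB input is an assumption (the paper's standard resolution), not a detail stated for this implementation; the hidden size of 768 is the ViT-Base value from the paper, and the 5 output classes match the weather dataset.

```python
def vit_b16_shapes(image_size=224, patch_size=16, channels=3,
                   hidden_dim=768, num_classes=5, batch=1):
    """List (stage, shape) pairs for a ViT-B/16 forward pass; no tensors are created."""
    if image_size % patch_size != 0:
        raise ValueError("image size must be divisible by patch size")
    grid = image_size // patch_size   # patches per side
    num_patches = grid * grid         # N = H * W / P^2
    seq_len = num_patches + 1         # one extra position for the class token
    return [
        ("input image", (batch, channels, image_size, image_size)),
        ("conv patch embedding", (batch, hidden_dim, grid, grid)),
        ("flatten and transpose", (batch, num_patches, hidden_dim)),
        ("prepend class token", (batch, seq_len, hidden_dim)),
        ("add positional embedding", (batch, seq_len, hidden_dim)),
        ("encoder output", (batch, seq_len, hidden_dim)),
        ("class token slice", (batch, hidden_dim)),
        ("classification head", (batch, num_classes)),
    ]


for stage, shape in vit_b16_shapes():
    print(f"{stage:<26} {shape}")
```

Under these assumptions a 16×16 patch over a 224×224 image gives a 14×14 grid, so the encoder sees 196 patch tokens plus the class token.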

Autonomous Vehicle Path Planning, Deep Learning & Ethics

International Baccalaureate Programme | IB Extended Essay | November 2023 – February 2025

Python, TensorFlow, PyTorch

IB Extended Essay Abstract

Produced a 4000-word research paper evaluating the societal, ethical, and regulatory impacts of autonomous vehicles (AVs) through empirical research and academic literature. Analyzed deep learning applications in AV perception and path-planning systems, assessing both technical capabilities and ethical limitations.
Designed and conducted a primary survey on public perceptions of AV safety and adoption, generating quantitative insights through data analysis. Synthesized primary and secondary sources to develop evidence-based predictions on future AV regulation and adoption trends.

Certifications

My professional certifications across AI/ML, data science, data engineering, cloud platforms, and investment.

45 certifications

Professional Certifications
