Sean (Yusheng) Han

Boston, MA | 737-600-1907 | sean.yusheng.han@gmail.com | LinkedIn | GitHub | Portfolio

Education

Northeastern University

Sep 2023 - Dec 2025

MS, Information Systems

  • Coursework: Prompt Engineering & AI, Program Structure & Algorithms, Application Engineering & Development, Data Science Engineering Methods, Data Management & Database Design

Texas Tech University

Jan 2018 - May 2022

BS, Computer Science and Technology

  • Coursework: Data Structures, Theory of Automata, Computer Networks, Design/Analysis of Algorithms, Operating Systems

Work Experience

Phicil-itate Change | Data and AI Development Intern

Jan 2025 - Aug 2025

  • Engineered a HIPAA-compliant, AI-powered voice agent capable of conducting complex phone surveys in multiple languages with ElevenLabs; successfully handled real-world call interruptions with a 99% survey completion rate.
  • Designed a multi-agent AI workflow with LangGraph that automatically extracted and structured data from thousands of unstructured patient surveys; accelerated data processing by over 500% and generated insights that led to a 15% improvement in consultant services.
  • Developed a secure full-stack web portal with FastAPI and React.js for patients and consultees using JWT authentication; architected to handle 1,000+ concurrent users with RabbitMQ, delivering sub-2-second page load times with zero security breaches.
  • Implemented a complete CI/CD pipeline from scratch using GitHub Actions, automating build, testing, and deployment processes; cut manual deployment effort by 95%.
  • Architected a scalable HIPAA-compliant database solution on AWS to securely manage sensitive health information for over 1,000 patients while ensuring 100% data integrity and availability.

Shenzhen Clou Electronics Co., Ltd | Backend Software Engineer

Jul 2022 - Dec 2022

  • Built a Spring Boot backend supporting real-time monitoring for 200+ facilities and engineered a JUnit test suite that cut post-deployment defects by 50%.
  • Optimized MySQL and ETL pipelines, reducing database latency by 40% and ETL runtimes by 65% while maintaining 99.9% uptime.

Personal Projects

Distributed Large-Scale LLM Training & HPC Optimization

Sep 2025 - Dec 2025

  • Fine-tuned a 1.9B-parameter GPT-style model on an 8x NVIDIA H100 cluster using BF16 mixed precision and fused optimizers, achieving approximately 108K tokens/sec throughput.
  • Implemented custom ZeRO-style state and gradient sharding that reduced GPU memory overhead by approximately 7x compared to standard Distributed Data Parallel training, enabling larger batch sizes and better utilization.
  • Architected a high-throughput distributed data loader with manual sharding, masked loss computation, and task-aware batching to support multi-task training across terabyte-scale datasets.
  • Deployed a hybrid optimizer workflow combining Muon and AdamW with a GRPO-based reinforcement alignment strategy, improving reasoning accuracy in coding-focused benchmarks.
  • Identified and resolved NCCL communication bottlenecks and data loader inefficiencies, achieving near-linear scaling efficiency across nodes.
  • Built an automated evaluation harness for LeetCode and HumanEval benchmarking with sandboxed execution, enabling secure Pass@1 scoring and reproducible model comparisons.

OPT-imize: AI-Powered Website to Help Students with OPT

Sep 2024 - Jan 2025

Northeastern University

  • Developed a LangChain-powered conversational AI chatbot deployed via Streamlit that answered complex OPT immigration questions and improved information accessibility.
  • Engineered an AI-powered RAG system using Crawl4AI to ingest 200,000+ USCIS documents into MongoDB, enabling real-time access to critical immigration information.

Skills

  • Applied AI: LangChain/LangGraph, FastMCP, ElevenLabs, CrewAI, Hugging Face, RAG, Unsloth, PyTorch
  • Backend: Python, FastAPI, Node.js, C/C++, Go, Git, Bash, Java, RabbitMQ, Flask, Redis, vim/neovim, GNU/Linux
  • Database: PostgreSQL, SQLite, Pinecone, MongoDB, ChromaDB, MySQL
  • DevOps: GitHub Actions, AWS, Docker, Argo CD, Nginx, Cloudflare, Kubernetes