Weiming Chen

Sydney, Australia

Personal Profile

A software engineer, an innovator, a knowledge sharer.

Strong commercial experience leading teams to build high-profile enterprise systems with quality built in, and building complex data processing and machine learning production systems on AWS cloud infrastructure and mature open source frameworks.

My current interest is turning the latest Machine Learning (especially Deep Learning) research results into smart products and services.

Work Experience

Senior Software Engineer, WiseTech Global

November 2019 - Present

Highlighted project: On-Premises Big Data Platform
  • Talked to internal stakeholders to understand the business requirements for an on-premises big data platform
  • Introduced, set up and maintained a Spark 2.x cluster (later upgraded to Spark 3.x) on the in-house Kubernetes cluster, using Ceph as the backend data store
  • Presented and demoed how to run interactive Spark apps using Scala in Zeppelin and Python in Jupyter Notebooks
  • Set up Jupyter Hub + SparkMagic + Livy for Data Analysts to run PySpark apps
  • Introduced, deployed and managed Airflow, and implemented and performance-tuned various batch ETL jobs
  • Evaluated the Delta Lake storage layer against plain Parquet files for storing raw ETL data
  • Set up Kafka connectivity with Spark Streaming for more real-time analytics
  • Key techs:
    • Spark, Kubernetes, Airflow, Ceph S3, Docker, Jupyter Hub, Livy, Zeppelin, Delta Lake, Kafka
Highlighted project: Recommender Engine for Software Features
  • Talked to internal stakeholders to understand customer activities on the system, and helped to identify business opportunities with the data available
  • Implemented a prototype to use Collaborative Filtering with Spark MLlib for making recommendations to users
  • Conducted and optimised long-running random search and heuristics-based global optimisation of hyperparameters
  • Key techs:
    • Spark, Spark MLlib, PySpark, feature engineering, collaborative filtering, global optimisation

Senior Software Engineer, Hyper Anna

September 2017 - October 2019

  • developed a client-facing chat bot using Rasa with spaCy / BERT / TensorFlow components
  • designed the architecture and led the development of a data ingestion pipeline and feature, which enabled the business to serve the SMB market
  • implemented analytics logic and row-based data access control using Scala / Spark
  • improved the team's development & release process to shorten the feedback loop and boost quality:
    • presented test automation and continuous integration
    • introduced and set up automated deployment and testing in an integration environment using Ansible/Docker and CircleCI, to surface integration issues and regressions early
    • implemented an automated smoke & integration testing suite using Cucumber/JavaScript
  • prototyped a semantic search engine for sales FAQs using Google's BERT NLP model
  • implemented a chat bot (based on Hubot) in the team's messaging software for common DevOps and deployment commands (and for fun)
  • introduced Deep Learning (with emphasis on recurrent neural networks) to the company

Data Engineer, Invoice2go

November 2016 - September 2017

  • managed and maintained the company's data pipeline using Python / Apache Airflow
  • made significant improvements to the performance and reliability of the data pipeline
  • proposed and designed the new data pipeline architecture using Databricks / Apache Spark
  • promoted and implemented automated smoke testing on the new microservice architecture of the company's V2 project

Classical Chinese Poetry Generator, Personal Project

August 2015 - October 2016

  • a chat bot to demonstrate the power of recurrent neural networks (RNNs) for natural language generation (NLG)
  • backend based on open-source Deep Learning frameworks: Python / TensorFlow and (superseded) Lua / Torch7 (code: open source implementation using TensorFlow)
  • compared and A/B tested GRU vs. LSTM cells
  • used IPython / Pandas for data analysis
  • managed everything by myself: product, marketing, analytics, development, devops
  • with minimal marketing, tens of thousands of users have used the bot via WeChat to generate hundreds of thousands of poems
  • (in Chinese) Twitter accounts

Senior Software Engineer, Fairfax Media

September 2012 - August 2015

Highlighted project: News article recommender system - ReWire
  • built an item-based recommendation engine using Collaborative Filtering algorithms, with Java / Apache Mahout, processing billions of records per run
  • deployment leveraged on-demand Amazon Elastic MapReduce (EMR) Hadoop clusters, EC2 and S3
  • incorporated business rules into the engine with techniques including boosting with metadata and soft recency filtering
  • proven in A/B testing to be 36% better than a third-party provider in terms of email click-through rates
  • recommendations have continued serving the massive SMH and The Age newsletter audience base
Project: digital mastheads paywall
  • worked on the high-profile paywall project for the Mastheads, using Ruby on Rails and Scrum
  • implemented the online activation flow for print subscribers, covered the whole flow with RSpec tests, and completed the backend system integration
Project: traffic monitoring system
  • summarised and visualised the daily and weekly site traffic using D3.js graphs and tables
  • built the batch processing backend using Apache Spark / Scala (with RDD) on Amazon EC2
  • web application backend was Meteor.js, with D3 on the frontend for interactive graphs
  • meshed public geographic data with private demographic data to create strategic insights
Project: Ignition
  • started as an idea for the company's innovation day, then became the official innovation day platform
  • a Kickstarter clone to host and keep track of projects/ideas for internal innovation days; it encourages collaboration and participation in the form of pledging time/resources to a project/idea
  • site built using Ruby on Rails

Development Team Lead, SAI Global

July 2011 - September 2012

  • led a development team of 10 developers to complete, on time, a joint high-profile software project with one of the big 4 banks in Australia
  • promoted and enforced the following coding practices: coding standards, peer code review, unit testing with NUnit, continuous integration
  • promoted Behaviour Driven Development, writing automated integration tests with SpecFlow / Cucumber that represented existing and new business processes and workflows, which were communicated to and signed off by the stakeholders
  • promoted NHibernate and Fluent NHibernate as the object-relational mapping (ORM) tool
  • recruited and trained new/junior developers

Previous Experience

2005 - 2011

  • Software Engineer at OzForex, Avaya and Appen
  • Application development using .NET / C# / ASP.NET / NHibernate
  • team player, always going the extra mile to promote and share best software practices
  • also used Python for pre- and post-processing of natural language transcription

Education

Master of Technology Management, University of New South Wales

2006 - 2009

  • achieved a Weighted Average Mark (WAM) of 73.9%
  • chose this degree because it aims to bridge the gaps between the management, commerce and engineering disciplines
  • studied computing subjects such as Machine Learning and Human Computer Interaction (HCI)

Bachelor of Software Engineering, University of New South Wales

2002 - 2005

  • achieved a Weighted Average Mark (WAM) of 81.7% - Honours Class 1
  • final-year group thesis project involved programming artificially intelligent robots to play soccer; the team participated in the RoboCup 2005 world competition in Osaka, Japan, and achieved 3rd place in the Four-Legged League (see: undergraduate thesis report)
  • president of the faculty student society ACM@UNSW, 2004 - 2005
  • Faculty of Engineering Dean’s Award, and inaugural Computer Science and Engineering Undergraduate Performance Award, 2003

Key Skills

Programming

  • strongest languages: Python, Ruby
  • also proficient in: C#, JavaScript, Java

Technologies and frameworks

  • Big Data Processing:
    • Scala / Apache Spark with RDD, SparkSQL and Streaming
    • Java / Apache Hadoop MapReduce
    • Redshift
  • Deep Learning Frameworks:
    • Python / TensorFlow
    • Lua / Torch7
  • Web:
    • Ruby on Rails with MySQL, JavaScript / HTML / CSS
    • Meteor / Node with MongoDB
    • C# / .NET / ASP.NET with SQL Server, NHibernate, Fluent NHibernate
    • Angular 1/2, React
    • D3.js
  • Desktop:
    • Java with AWT

Spoken languages

  • Mandarin Chinese - proficient
  • Cantonese Chinese - native

Hobbies

  • jogging / swimming
  • play and design board games
  • work on my own hobby programming projects

Personal programming projects

SimUniversity

  • a variant of the board game Settlers of Catan, with a university/campus theme
  • implemented AI using ExpectiMaxN tree search
  • written in C# with a simple console UI; can be run using Mono

TCG game simulator with AI

  • written in Java and Lua
  • AI implemented using Monte Carlo tree search