Marc Andre Chen marcchen2

Hello! I'm Marc Chen.

I'm currently a Machine Learning Engineer at Columbia University's Emerging Tech group. We develop ML workflows for researchers, and serve as a general ML resource to the Columbia community at large.
Since graduating from my MMath program, I've also been a visiting ML researcher at the University of Waterloo, where I support the computational finance research of Profs. Yuying Li and Peter Forsyth.
I recently graduated from uWaterloo with a Master of Mathematics in Data Science, and completed my thesis, which develops a robust RNN model to solve for optimal strategies in portfolio and risk management. I wrote a short article that summarizes this research. I am unable to share the code of the entire RNN framework, but a piece of it is highlighted below.
My co-authored ML finance research has been submitted to the Journal of Computational Finance and is available as a preprint.
On this personal GitHub, I share select data science projects from outside my professional work.

Highlighted Side Projects:

Yahtzee Q-Learning Agent

This project implements a Deep Q-Network (DQN) agent to test whether model-free learning can approach optimal Yahtzee play. My implementation uses a dueling network to decompose the Q-value into a state value and per-action advantages, which is especially helpful in Yahtzee since it can learn the values in the large state space independently of the randomness of roll outcomes. My best model so far achieves a median score of 211 over 1000 games. The repo includes a Gradio UI to watch the agent play, inspect Q-values, and calculate performance statistics.
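As a rough illustration of the dueling decomposition, here is a minimal sketch of how a state value and per-action advantages are commonly recombined into Q-values. It assumes the mean-subtracted aggregation often used with dueling networks; the function name `dueling_q_values` and the example numbers are hypothetical, and the network in the repo may combine the two streams differently.

```python
def dueling_q_values(state_value, advantages):
    """Combine a scalar state value V(s) with per-action advantages A(s, a).

    Assumed aggregation: Q(s, a) = V(s) + A(s, a) - mean over a' of A(s, a').
    Subtracting the mean advantage keeps V and A identifiable.
    """
    if not advantages:
        raise ValueError("advantages must contain at least one action")
    mean_advantage = sum(advantages) / len(advantages)
    return [state_value + a - mean_advantage for a in advantages]


# Hypothetical outputs of the two network streams for a single state.
q_values = dueling_q_values(10.0, [1.0, 2.0, 3.0])
best_action = max(range(len(q_values)), key=q_values.__getitem__)
```

In this form the state value carries how good the current scorecard is overall, while the advantages only need to rank the available actions against each other.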

Gear Gleaner

Gear Gleaner is a Django webapp that leverages LLMs to aggregate posts from Reddit buy/sell groups and parse them into a standardized database, allowing users to easily browse and search items.

Market Data Bootstrapper

A common problem in ML finance research is the limited availability of asset return data. It is often necessary to generate synthetic time series with statistical properties similar to those of the historical market for training and rigorous strategy testing. One of the methods most widely regarded as the gold standard by finance practitioners is stationary block bootstrapping (Patton et al., 2009) of historical return data. I built this tool as part of a larger ML framework that creates optimal portfolio management strategies for my Master's research.

Diagram credit to El Anbari, Abeer, and Ptitsyn (2015)
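The stationary block bootstrap described above can be sketched in a few lines. This is a simplified, single-series illustration and not the code from the repo: the function name `stationary_bootstrap`, its parameters, and the sample returns are all hypothetical. It assumes blocks with geometrically distributed lengths and circular wrap-around at the end of the history.

```python
import random


def stationary_bootstrap(returns, n_samples, expected_block_size, seed=None):
    """Resample a return series in random-length blocks.

    Block lengths are geometric with mean `expected_block_size`; a block that
    reaches the end of the history wraps around to the beginning.
    """
    if not returns:
        raise ValueError("returns must be non-empty")
    if expected_block_size < 1:
        raise ValueError("expected_block_size must be at least 1")

    rng = random.Random(seed)
    n = len(returns)
    p_new_block = 1.0 / expected_block_size  # chance of ending the current block

    idx = rng.randrange(n)  # start of the first block
    path = []
    for _ in range(n_samples):
        path.append(returns[idx])
        if rng.random() < p_new_block:
            idx = rng.randrange(n)  # jump to a new random block start
        else:
            idx = (idx + 1) % n  # continue the block, wrapping if needed
    return path


# Made-up monthly returns, for illustration only.
history = [0.012, -0.008, 0.021, 0.003, -0.015, 0.009]
synthetic_path = stationary_bootstrap(history, n_samples=12, expected_block_size=3, seed=42)
```

Sampling in blocks rather than one observation at a time preserves short-range serial dependence in the returns, and randomizing the block length keeps the resampled series stationary.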

Mapping transit accessibility for the Nashville Metro Planning Authority

For my undergraduate economics thesis, I consulted for the Nashville MPO to help them understand the impact of transit access on labor market participation in the Nashville region. I leveraged the Bing Maps API to create an index of transit accessibility in urban areas. I then developed a segmented regression to analyze this index's impact on labor force participation rates within disparate income groups. Thesis. Vanderbilt University News Article.
