marcchen2/README.md

Hello! I'm Marc Chen.

  • I'm currently a Machine Learning Engineer at Columbia University's Emerging Tech group. We develop ML workflows for researchers, and serve as a general ML resource to the Columbia community at large.
  • Since graduating from my MMath program, I've also been a visiting ML researcher at the University of Waterloo, where I support the computational finance research of Profs. Yuying Li and Peter Forsyth.
  • I recently graduated from uWaterloo with a Master of Mathematics in Data Science. My thesis develops a robust RNN model that solves for optimal strategies in portfolio and risk management, and I wrote a short article that summarizes this research. I am unable to share the code for the entire RNN framework, but a piece of it is highlighted below.
  • My co-authored ML finance research has been submitted to the Journal of Computational Finance, and is available as a pre-print.
  • On this personal GitHub, I share select data science projects from outside my professional work.

Highlighted Side Projects:

This project implements a Deep Q-Network (DQN) agent to test whether a model-free method can approach optimal Yahtzee play. My implementation uses a dueling network, which decomposes the Q-value into a state value and per-action advantages. This is especially helpful in Yahtzee because the network can learn state values across the large state space independently of the randomness of roll outcomes. My best model so far achieves a median score of 211 over 1000 games. The repo includes a Gradio UI to watch the agent play, inspect Q-values, and compute performance statistics.
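As a rough illustration of the dueling decomposition described above, here is a minimal sketch of how a state value and per-action advantages can be recombined into Q-values. It assumes the common mean-centred aggregation, Q(s, a) = V(s) + A(s, a) - mean(A); the project may use a different variant. The function name and the numbers are hypothetical and do not come from the repo.

```python
from statistics import fmean


def dueling_q_values(state_value, advantages):
    """Combine V(s) and A(s, a) into Q(s, a) with mean-centred advantages.

    Assumption: mean-centred aggregation; not necessarily what the repo uses.
    """
    mean_advantage = fmean(advantages)
    # Subtracting the mean keeps V and A identifiable: the Q-values
    # average back to the state value.
    return [state_value + a - mean_advantage for a in advantages]


# Hypothetical outputs of the two network streams for a single state.
q_values = dueling_q_values(10.0, [1.0, 2.0, 3.0])
best_action = max(range(len(q_values)), key=q_values.__getitem__)
```

In a real network the two inputs would be the outputs of separate value and advantage heads that share a common feature extractor.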

Gear Gleaner is a Django webapp that leverages LLMs to aggregate posts from Reddit buy/sell groups and parse them into a standardized database, allowing users to easily browse and search items.

A common problem in ML finance research is the limited availability of asset return data. It is often necessary to generate synthetic time series with statistical properties similar to those of the historical market, both for training and for rigorous strategy testing. One of the methods that finance practitioners most widely regard as the gold standard is stationary block bootstrapping (Patton et al., 2009) of historical return data. I built this tool as part of a larger ML framework that I developed to create optimal portfolio management strategies for my Master's research.

Diagram credit to El Anbari, Abeer, and Ptitsyn (2015)
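To make the resampling idea concrete, here is a minimal sketch of a stationary block bootstrap in plain Python. It is not the code from the market_data_bootstrap repo. The function names, the `expected_block_size` parameter, and the sample returns are assumptions for illustration. Blocks start at random positions, each block ends with probability 1 / expected_block_size at every step (so block lengths are geometrically distributed), and indices wrap around the end of the series.

```python
import random


def stationary_bootstrap_indices(n_obs, n_samples, expected_block_size, rng):
    """Draw resampling indices with geometrically distributed block lengths."""
    if n_obs < 1 or expected_block_size < 1:
        raise ValueError("n_obs and expected_block_size must be at least 1")
    restart_prob = 1.0 / expected_block_size
    indices = []
    idx = rng.randrange(n_obs)  # random start of the first block
    for _ in range(n_samples):
        indices.append(idx)
        if rng.random() < restart_prob:
            idx = rng.randrange(n_obs)  # end this block, start a new one
        else:
            idx = (idx + 1) % n_obs  # continue the block, wrapping around
    return indices


def stationary_bootstrap(returns, n_samples, expected_block_size, seed=None):
    """Resample a return series into a synthetic path of length n_samples."""
    rng = random.Random(seed)
    indices = stationary_bootstrap_indices(
        len(returns), n_samples, expected_block_size, rng
    )
    return [returns[i] for i in indices]


# Hypothetical monthly returns, used only to exercise the function.
historical = [0.010, -0.020, 0.015, 0.003, -0.007, 0.021, -0.012, 0.008]
synthetic = stationary_bootstrap(historical, n_samples=24, expected_block_size=4, seed=0)
```

The expected block size is passed in by hand here; in practice it would be chosen from the data, which this sketch does not attempt.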

Mapping transit accessibility for the Nashville Metro Planning Authority

For my undergraduate economics thesis, I consulted for the Nashville MPO to help them understand the impact of transit access on labor market participation in the Nashville region. I leveraged the Bing Maps API to create an index of transit accessibility in urban areas. I then developed a segmented regression to analyze this index's impact on labor force participation rates within disparate income groups. Thesis. Vanderbilt University News Article.

Pinned

  1. market_data_bootstrap (Public)

    Tool to generate customized market simulation data with the stationary block bootstrap methodology, as per Patton et al., 2009.

    Python