Personal Statement

My decision to pursue an MS degree grew out of my experience in both computer engineering and music. As a content creator on Bilibili, I understand well what songwriters long for:

how they dream of turning simple hums and mouth beats into beautifully arranged tracks,
how they struggle to design a timbre that matches a sample or their sole imagination,
how they strive to create ear-catching rhythms from gestures (since nothing is more intuitive than fingers tapping along the beats),
and so on.

Therefore, I dream about a large model that (1) takes multimodal inputs, (2) respects the copyright of training data, and (3) grants the content creator complete control over every fine-grained aspect of the final product.

Blessed with the opportunity to research the two areas I love and am skilled in, I cannot help but reflect on how my interests and experiences can make a social impact.

“In many cases, academia steers changes to enhance people’s well-being.”

For example, my 2021 “oral mesh” model was meant to be adopted by hospitals and clinics to assist in dental defect detection.
My 2022 time series analysis model, on the other hand, identified electricity theft and revealed underdeveloped areas in need of electricity, thus promoting infrastructure construction and social equality.

However, since the era of AIGC (AI-generated content) began in 2022, marked by the release of large generative models like Stable Diffusion, Midjourney, and ChatGPT, “the impacts of technological breakthroughs are increasingly being driven by the mass public using the tools.”

The DISSCO digital instrument I helped develop in the summer of 2023 counts as such a tool, but navigating the software still requires considerable expertise in music theory.

A tool must be intuitive enough to gain popularity in the current UGC (user-generated content) ecosystem and to fulfill the universal need for free creation and self-expression. Once we take into account the differences in AI knowledge ("proficiency") and access to computing power ("flops"), a new inequality problem emerges, one the research and development community must shed light on.

And I sympathize with those affected by such inequality.

During my social practice in Jingdong, Yunnan in June 2021, I led my team in researching the local food industry. Thanks to the refreshing environment in the mountains, agricultural products like tea and mushrooms were of extraordinary quality, but the locals had no experience in advertising. Fortunately, two years later, large models such as Ernie (by Baidu Inc.) that automate video generation are coming to the rescue, empowering individual businesses to produce compelling advertisements and boost sales.

It amazes me how much my attitude toward large models has changed in the last year.

I once regarded them as a threat to humanity’s monopoly on zero-to-one creativity, but eventually realized their artistic value in automating the mundane parts and providing a tweak just outside a composer’s comfort zone to boost freshness. More importantly, an intuitive, human-oriented, and publicly available generative model might be a solution to the inequality in digital proficiency.

I will continue to fight for social equality in my own way, and I hope my research during my graduate years will empower everyone to create, express, and thrive.

Ziyuan Chen

December 2023

Let it be code or music, do it like you never did before.


On Large Generative Models and Social Equality