Personal Statement

On Large Generative Models and Social Equality

My decision to pursue an MS degree grew out of my experience in both computer engineering and music. As a content creator on Bilibili, I understand well what songwriters need.

Therefore, I dream about a large model that (1) takes multimodal inputs, (2) respects the copyright of training data, and (3) grants the content creator complete control over every fine-grained aspect of the final product.

Blessed with the opportunity to research the two areas I love and am skilled in, I cannot help but reflect on how my interests and experience can make a social impact.

“In many cases, academia steers changes to enhance people’s well-being.”

However, since the era of AIGC (AI-generated content) began in 2022, marked by the release of large generative models like Stable Diffusion, Midjourney, and ChatGPT,

“The impacts of technological breakthroughs are increasingly being driven by the mass public using the tools.”

A tool must be intuitive enough to gain popularity in the current UGC (user-generated content) ecosystem and to fulfill the universal need for free creation and self-expression. Once we account for differences in AI knowledge ("proficiency") and access to computing power ("flops"), a new inequality problem emerges, one that the research and development community must shed light on.

And I sympathize with those who face such inequality.

During my social practice in Jingdong, Yunnan, in June 2021, I led my team in researching the local food industry. Thanks to the refreshing mountain environment, agricultural products like tea and mushrooms were of extraordinary quality, but the locals had no experience in advertising. Two years later, large models such as Ernie (by Baidu Inc.) that automate video generation are coming to the rescue, empowering individual businesses to produce compelling advertisements and boost sales.

It amazes me how my attitude toward large models has transformed in the last year.

I once regarded them as a threat to humanity’s monopoly on zero-to-one creativity, but eventually realized their artistic value in automating the mundane parts and offering a tweak just outside a composer’s comfort zone to keep the work fresh. More importantly, an intuitive, human-oriented, and publicly available generative model might be a solution to the inequality in digital proficiency.

I will continue to fight for social equality in my own way, and I hope my research in the graduate years will empower everyone to create, express, and thrive.

Ziyuan Chen

December 2023

Be it code or music, do it like you never did before.