How to Migrate to a Monorepo

@Hannes Egler

Introduction

There are many articles that analyze the pros and cons of using a monorepo. However, if you have already decided to migrate from a polyrepo to a monorepo, it can be hard to find tutorials or discussions about the process. I want to share my recent experience and discuss the challenges encountered during the migration. This will provide insights into the steps I took and the considerations made in achieving this goal.

Current Status and Motivation

Our team uses .NET for developing the company's business applications. We have adopted a polyrepo management approach to handle services such as web APIs and MQ. These services often need to communicate with each other.

The first challenge is maintaining the external contracts of these services. Whenever we modify a contract, we have to check every endpoint that depends on it, across all repositories.

Secondly, a lot of similar or identical business logic is scattered across the repositories, so every change to that logic has to be repeated in several places, which is costly. Moreover, the polyrepo approach works against cohesive domain logic, especially since these services collectively serve one product.

Challenges

Our top priority is to keep the product as available as possible, so the actual switch to the new approach has to happen in a very short window. Unfortunately, our team currently uses git-flow with a complex feature branch structure. Moreover, new requirements keep emerging, so we can't afford to pause development during the transition. In short, we need a plan that lets us migrate smoothly. It may take some time, but it should be predictable and easy to roll back if we get into trouble.

Migration Plan

We will break down the migration procedure into several steps. I will provide a brief introduction to these steps, followed by a more in-depth discussion in the subsequent sections.

Git

Git itself is a great help in this procedure. To manage several repositories from one place, we can use submodules or subtrees. Submodules are easy to set up. However, a submodule doesn't really incorporate code into the superrepo: the superrepo only records a pointer to a commit in the subrepo, and one submodule can't reference another. Our goal is to reuse code across the repos, so submodules are not suitable for this purpose. Subtrees, on the other hand, copy the subrepo's content into the superrepo as a regular directory. The drawback is that syncing later changes from the subrepo takes extra effort. For the migration itself, I believe a subtree is the suitable choice.
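As a rough illustration of the subtree approach, here is a minimal sketch of importing one existing repository into the monorepo and later pulling in commits that landed in the old repository during the transition. The remote name, URL, branch, and the services/ prefix are hypothetical placeholders, not our actual layout. Setting DRY_RUN=1 prints the commands instead of running them, which is handy for reviewing the plan first.

```shell
#!/bin/sh
# Sketch only: repo names, URLs, branch names and the services/ prefix
# are hypothetical placeholders.
set -eu

# run: prints the command when DRY_RUN=1, otherwise executes it.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# import_repo <name> <url> <branch>
# Adds the old repo as a remote and imports it under services/<name>.
import_repo() {
  run git remote add "$1" "$2"
  run git fetch "$1"
  # Full history is kept; add --squash to collapse it into one commit.
  run git subtree add --prefix="services/$1" "$1" "$3"
}

# sync_repo <name> <branch>
# Pulls commits made in the old repo after the initial import.
sync_repo() {
  run git subtree pull --prefix="services/$1" "$1" "$2"
}

# Example (run from the root of the monorepo):
#   DRY_RUN=1 import_repo orders git@example.com:team/orders.git main
#   sync_repo orders main
```

Because the import is an ordinary merge commit in the monorepo, rolling back one imported repository amounts to reverting that commit.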

About CI/CD

When modifying the GitLab CI script, I encountered two problems. The first is understandable: each repository has its own settings, and after merging everything into one repo, it takes time to integrate all the scripts. The second problem arises because we recently changed our workflow. We used to trigger pipelines manually to build and deploy projects. Recently, we changed the setting so that pipelines are triggered automatically when a merge request is merged. After migrating to a monorepo, this means every project gets built and deployed whenever a merge happens, which is not the behavior we want. The solution to the second problem is to use GitLab rules with the changes keyword, so that a job only runs when files in its own project folder have changed.
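GitLab's rules with the changes keyword express this path filter declaratively in the CI file. To make the underlying idea concrete, here is a minimal shell sketch of the same logic: map the changed file paths to project folders and check whether a given project is among them. The services/ layout, the project names, and the BASE_REF and HEAD_REF variables are assumptions for illustration.

```shell
# changed_projects: reads changed file paths on stdin and prints the
# unique project folders under services/ that they belong to.
# Paths outside services/ are ignored.
changed_projects() {
  sed -n 's|^services/\([^/]*\)/.*|\1|p' | sort -u
}

# project_changed <project>
# Succeeds if <project> has changes between two refs.
# BASE_REF and HEAD_REF are hypothetical variables; in a GitLab job they
# could be set from predefined variables such as CI_COMMIT_BEFORE_SHA
# and CI_COMMIT_SHA.
project_changed() {
  git diff --name-only "${BASE_REF:-HEAD~1}" "${HEAD_REF:-HEAD}" \
    | changed_projects \
    | grep -qx "$1"
}

# Example: only build the (hypothetical) orders service when it changed.
#   if project_changed orders; then dotnet build services/orders; fi
```

In practice the declarative rules are preferable because GitLab then skips the job entirely; a script like this is mainly useful when one job has to decide between several projects.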

About Monorepo Tools

Currently, I don't plan to use a monorepo tool, as our project isn't large enough to need one. Build and deploy remain challenges, but we can solve them by modifying the CI script. Nevertheless, I've been spending some time learning about NX.

One characteristic of a monorepo is that not every change affects every project in the repository. As the repository grows, I doubt we can manage this effectively through ever more CI script settings alone. NX seems to be a solution. It's easy to set up: create a workspace and install the NX/dotnet plugin. NX can then automatically detect the affected projects under the workspace folder and run builds or deployments only for them.

However, as a .NET developer, I find that the developer experience of NX is more familiar to JavaScript developers. Moreover, if we decide to use NX as our CI tool, it would be more efficient to have it handle all CI/CD as NX tasks. This transition could be challenging as it requires abandoning the GitLab CI script template we are currently using. Currently, there's no urgent need to ask our team members to adopt NX.