Mono-repo vs Multi-repo: different strategies for organizing repositories
- Short version
  - Mono-repo and multi-repo are different strategies for storing code.
  - Neither is better than the other; which one we use depends directly on the context we find ourselves in.
- Longer version
Introduction
As software consultants, in the early stages of a new project we commonly ask: what is the most appropriate strategy for managing our architecture at the code-repository level? The aim is to solve today's problem while looking ahead to a future in which things will need to be scaled, integrated, and/or modified efficiently.
No choice in software is definitive, because there is always an opportunity to adjust to change, whether internal (changing a database, choosing a new cloud provider) or external (updating dependencies, fixing outdated systems, etc.). But not being definitive does not make a choice free of cost (trade-offs are always present in software engineering), so it is very important to decide well in the early stages: a good decision leaves us better prepared for future challenges.
I have faced these decisions myself in the past. That is why I decided to reflect in this article on what I have learned in the different contexts I have worked in, and on when I would recommend one strategy or the other. But first, let's start with the basics...
What is a repository?
A repository is the place where all the files of a project are stored, organized, and maintained. It keeps the change history of each file and lets us manage different versions of them, among other features.
Repositories are commonly used with Git, through providers like GitHub, GitLab, and Bitbucket. They can store a wide variety of file types.
What is a Mono-Repo?
It is not a monolith. It is a software development strategy in which the code of an entire company, or of several of its projects, is stored in a single repository.
Benefits and challenges of using Mono-Repo
Benefits
- It offers an almost immediate way to start writing clean, maintainable code.
- Refactoring solutions and documenting flows is simpler, so a new member joining the team can easily pick up the team's practices and way of working.
- Automated tests, whether unit, integration, and/or end-to-end (e2e), are often much easier to run and maintain.
- If the code is organized in an adequate way, it does not descend into chaos. One example is using domain-driven design (DDD) to manage each business domain that lives in the code.
- At the team level, as there is centralized collaboration, communication is fluid because everyone knows the needs and priorities of the team.
- Because the dependencies (libraries) in use are all easy to reach, they are easier to manage.
- A common practice I have experienced, though it depends on each company's context and organizational culture, is that no team gets left behind: members know when someone is having difficulty, and support is relatively easy to organize.
Challenges
- Performance is one of the main disadvantages: keeping all the different scopes of code in the same repository can slow down operations such as pulling code.
- If the repository holds several sub-projects, continuous integration (CI) and continuous delivery (CD) can become a challenge. It is important to know in advance whether we have a DevOps team, or people on the team, who can handle this.
- Security can be an immediate difficulty: unlike with other repository strategies, here you typically grant access to everything to a person who may only work on part of the project, which is an important disadvantage to weigh against the organization's standards and culture. Ideally, both the independence between the mono-repo's projects and the versioning of each one should be established from the beginning, to deal with issues such as integrations and team coordination when maintaining different codebases.
- Evaluate scalability in light of the project's business needs, because this helps us establish early paths for meeting, or starting to work on, a need of this type. A pattern that can be useful here is hexagonal architecture at the code level.
- If you or the business later decide to migrate from a mono-repo to, for example, a multi-repo, this will imply a high cost in time and effort, which depends directly on the size of the project(s) contained in the mono-repo.
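The CI/CD challenge above is often eased by building and testing only the projects a change can affect. The following is a minimal sketch of that idea, not the algorithm of any particular tool: the `apps/` and `libs/` layout, the project names, and the `DEPENDS_ON` map are all hypothetical and would normally be derived from the repository itself.

```python
from __future__ import annotations

from pathlib import PurePosixPath

# Hypothetical mono-repo layout: every project lives two levels deep
# (for example "apps/web" or "libs/ui"). Each project maps to the
# projects it depends on. All names are illustrative only.
DEPENDS_ON = {
    "apps/web": {"libs/ui", "libs/auth"},
    "apps/mobile": {"libs/ui"},
    "libs/ui": set(),
    "libs/auth": set(),
}


def owning_project(changed_file: str) -> str | None:
    """Return the project containing the file, or None if it belongs to no project."""
    parts = PurePosixPath(changed_file).parts
    if len(parts) < 2:
        return None
    candidate = f"{parts[0]}/{parts[1]}"
    return candidate if candidate in DEPENDS_ON else None


def affected_projects(changed_files: list[str]) -> set[str]:
    """Projects changed directly, plus every project that depends on them, transitively."""
    affected = {p for f in changed_files if (p := owning_project(f)) is not None}
    grew = True
    while grew:  # keep expanding until no new dependent project is found
        grew = False
        for project, deps in DEPENDS_ON.items():
            if project not in affected and deps & affected:
                affected.add(project)
                grew = True
    return affected
```

With this sketch, a change under `libs/auth` would trigger the pipeline for `libs/auth` and `apps/web` only, while a change to a shared library such as `libs/ui` would reach every application that uses it.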
What is a Multi-Repo?
It is a software development strategy in which the code of a company's different projects is stored and maintained independently, each in its own repository. Ideally, a multi-repo approach forces teams to communicate and reach agreements in order to accommodate each other.
Benefits and challenges of using Multi-Repo
Benefits
- Security is one of the main benefits: unlike with mono-repos, this strategy makes it possible to grant only the specific permissions needed for a concrete scope of a project.
- Scalability is feasible and manageable from the beginning, without much effort, because the projects are separated early.
- Versioning is also almost immediate: with separate environments in multiple repositories, it doesn't require much coordination.
- Ownership is much easier to apply since, for example, each repository can serve a specific business aspect of the project.
- The independence and performance that multi-repos provide are usually important reasons why organizations use them, because it is much easier to distribute resources and evaluate results.
- Migrating from multi-repos to another way of managing the codebase is much easier than migrating from a mono-repo.
- CI/CD is much easier to achieve, because the basic separation between repositories makes it easier to allocate resources between teams.
Challenges
- Applying clean code and refactoring across repositories is difficult. Although early agreements can help align how each team does things, code duplication is highly likely.
- Running unit and integration tests is viable, but end-to-end (e2e) tests are a challenge: with much of the code separated, they require output from several repositories. This can be mitigated by having Quality Assurance (QA) people on the team who are dedicated to testing the full flow of an application.
- Managing dependencies, such as libraries, can cause friction between teams if there is no strategy for periodically obtaining and adopting the latest versions across them.
- Having separate teams has its advantages, as we saw before, but also its disadvantages. Among them:
  - Isolation between teams can affect good engineering practices and good communication between people.
  - Without a good onboarding culture, it will be much more difficult to bring new members into the team in a successful and healthy way.
  - Each team will inevitably set its own priorities, which can be complex to manage because each team wants to achieve its own objectives. As so often in the software industry, problems show up as bugs or differences of opinion, and good priority management makes the difference between chaos and rivalry on one side and co-creation and empathy on the other.
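The dependency friction described above can be made visible with a small periodic check. The following is a minimal sketch, assuming each repository's pinned versions have already been read into a plain dictionary; the repository and library names are hypothetical, and reading real manifest files is left out.

```python
from __future__ import annotations

from collections import defaultdict


def find_version_drift(
    manifests: dict[str, dict[str, str]],
) -> dict[str, dict[str, list[str]]]:
    """Map every dependency pinned to more than one version to {version: [repos]}."""
    usage: dict[str, dict[str, list[str]]] = defaultdict(lambda: defaultdict(list))
    for repo, deps in manifests.items():
        for name, version in deps.items():
            usage[name][version].append(repo)
    # Keep only the dependencies on which the repositories disagree.
    return {
        name: {version: sorted(repos) for version, repos in versions.items()}
        for name, versions in usage.items()
        if len(versions) > 1
    }


# Hypothetical input: repository name -> {dependency: pinned version}.
EXAMPLE_MANIFESTS = {
    "billing-api": {"design-system": "1.4.0", "http-client": "3.1.0"},
    "web-store": {"design-system": "2.0.0", "http-client": "3.1.0"},
    "mobile-app": {"design-system": "2.0.0"},
}
```

Calling `find_version_drift(EXAMPLE_MANIFESTS)` reports only `design-system`, because the three repositories agree on every other library. A report like this gives the teams something concrete to discuss when they plan upgrades.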
Summary comparing Mono-repo and Multi-repo
The first thing I want to say is that neither is better than the other, and that which one we use depends directly on the context we find ourselves in.
Some larger companies using Mono-repo include Google, Meta, Microsoft and Twitter. Among those that use a multi-repo approach we can find Netflix and Amazon.
Seen this way, it is clear that opting for one or the other does not necessarily depend on the size of the company; it is more a matter of the context, culture, and objectives of each one.
Still, the following is my own comparison of technical and cultural characteristics, which I think can help us decide on one or the other when we face this type of choice.
Technical aspect by concept | Mono-repo | Multi-repo |
---|---|---|
Clean Code | ✅ | ❌ |
Refactor code | ✅ | ❌ |
Security | ❌ | ✅ |
Scalability | ❌ | ✅ |
Tests (Unit, Int, e2e) | ✅ | ⚠️ |
Module versioning | ⚠️ | ✅ |
Clear Ownership (DDD) | ✅ | ✅ |
Integration (CI/CD) | ⚠️ | ✅ |
Dependency Administration | ✅ | ⚠️ |
Migrating to another type of repository management | ❌ | ✅ |
Cultural aspect by concept | Mono-repo | Multi-repo |
---|---|---|
Onboarding | ✅ | ⚠️ |
Independence for maintenance | ⚠️ | ✅ |
Good engineering practices | ✅ | ⚠️ |
Prioritization of needs (features, bugs, etc) | ✅ | ⚠️ |
Isolation and good communication between teams | ✅ | ⚠️ |
Symbology
- ✅ (Benefit) = Something obtainable without much effort.
- ⚠️ (Attention) = Something obtainable, but not easily; it needs deliberate effort.
- ❌ (Challenge) = Something difficult to obtain.
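One way to apply the tables to a concrete context is to weight only the aspects that matter for a given project and add up the marks. The following is a minimal sketch under loud assumptions: the numeric scale (Benefit = 2, Attention = 1, Challenge = 0) and the example weights are mine, not part of the comparison, and the totals are a conversation aid rather than a verdict.

```python
from __future__ import annotations

# Assumed numeric scale for the three symbols; not part of the original tables.
SCORE = {"benefit": 2, "attention": 1, "challenge": 0}

# (mono-repo, multi-repo) marks copied from the two comparison tables.
RATINGS = {
    "Clean Code": ("benefit", "challenge"),
    "Refactor code": ("benefit", "challenge"),
    "Security": ("challenge", "benefit"),
    "Scalability": ("challenge", "benefit"),
    "Tests (Unit, Int, e2e)": ("benefit", "attention"),
    "Module versioning": ("attention", "benefit"),
    "Clear Ownership (DDD)": ("benefit", "benefit"),
    "Integration (CI/CD)": ("attention", "benefit"),
    "Dependency Administration": ("benefit", "attention"),
    "Migrating to another type of repository management": ("challenge", "benefit"),
    "Onboarding": ("benefit", "attention"),
    "Independence for maintenance": ("attention", "benefit"),
    "Good engineering practices": ("benefit", "attention"),
    "Prioritization of needs (features, bugs, etc)": ("benefit", "attention"),
    "Isolation and good communication between teams": ("benefit", "attention"),
}


def weighted_totals(weights: dict[str, int]) -> tuple[int, int]:
    """Return (mono-repo total, multi-repo total) for the aspects given a weight."""
    mono = multi = 0
    for aspect, weight in weights.items():
        mono_mark, multi_mark = RATINGS[aspect]
        mono += weight * SCORE[mono_mark]
        multi += weight * SCORE[multi_mark]
    return mono, multi


# Hypothetical context: strict access control matters most, onboarding a little.
print(weighted_totals({"Security": 3, "Onboarding": 1}))  # prints (2, 7)
```

Changing the weights changes the outcome, which is the point: the same tables can favor either strategy depending on the context.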
Conclusions
As we have seen, the choice depends on the context and on the needs we must cover today, while also considering possible situations in the future.
My experience tells me that there are patterns we can identify to help us make a better decision.
Some of these patterns are generic, such as divide and conquer: by dividing the code from the start, it can be easier to reach our objectives, especially considering that it is easier to move from a multi-repo to a mono-repo than the other way around. But as I said before, this will always depend on several factors. At a corporate level, some questions that could be asked are:
- How many people do we have to manage different aspects of a project (Frontend, Backend, Design, DevOps, QA, etc) together or separately?
- How mature is our organization's culture of working collaboratively across different teams, on either a mono-repo or a multi-repo?
- What tools and technologies do the different teams in the organization know how to use?
- Do we have the knowledge to integrate our work?
- If we are forced to use a multi-repo approach because, for example, the product is a superapp:
  - How many packages (repositories/libraries) do we intend to manage?
  - How do we intend to manage the release frequency of each new package?
  - Do we have a defined strategy, suited to the maturity and workload of our teams, for handling different versions in parallel effectively?
- On the other hand, if the product is, for example, a mobile application built with React Native using Nx, and that leads us to opt for a mono-repo:
  - How mature is our team at producing quality code with a low duplication rate?
  - Do we have the DevOps knowledge necessary to run a CI/CD pipeline effectively?
Recommendations
Finally, I would like to share with you some recommendations for both scenarios, based on my experience:
Mono-repo
- Today there are tools, such as Nx and Lerna, that make it easier to manage generic web applications, micro front-ends, and/or mobile applications from a mono-repo.
- Establishing a culture of good documentation from the start will go a long way towards successful onboarding.
Multi-repo
- Establish standards and good practices from the beginning to reduce inconsistencies in how code is written and implemented. A wiki (for example, Confluence) is a good place to record the team's agreements.
- Implementing a consistent and dynamic design system early on for your projects can go a long way if you manage it well.
Thank you for reading 🙂