Recently I was working on optimizing cortex, part of the Prometheus ecosystem, and came across a rather interesting Go mod problem, which I’ll share here.
Why is this post titled "How to cheat Go mod"? That is the interesting part, so I’ll keep it a surprise for now, but the trick really does get around Go mod’s rules.
Before we start this topic, we need to briefly introduce the cortex and thanos projects.
Limitations of Prometheus
When it comes to business development, you can’t do without a monitoring system. Prometheus is a cloud-native favorite: with its excellent design and flexible usage, it graduated from CNCF with flying colors and is the first choice of many companies for monitoring.
However, Prometheus also has its own limitations, the most significant of which are its high-availability and clustering solutions. Monitoring is one of the most important parts of a business system, and alerts must not fail to fire on time because the monitoring system itself is down.
The Prometheus project proposed federation to solve the clustering problem, but this solution is extremely complex and still leaves many problems unsolved. That led to two other CNCF sandbox projects: cortex and thanos. Both projects aim to solve clustering and high availability for Prometheus.
Since the two projects pursue the same goal, many features can be reused between them, and that is where something interesting happens.
cortex
Back to the point: I had to change the thanos code due to some requirements, so I replaced cortex’s thanos dependency with my modified copy while debugging locally.
Then, when I compiled it, the build failed.
Don’t worry, let’s see who depends on this kuberesolver.
Let’s look at it before it gets replaced:
You can see that in the normal version, kuberesolver@v2.4.0 is required by thanos and kuberesolver@v2.1.0 is required by weaveworks.
After the replace:
Isn’t it amazing that kuberesolver@v2.4.0 has disappeared? Since v2.1.0 and v2.4.0 of kuberesolver are incompatible, cortex no longer compiles after the replace.
Gomod replace semantics
It’s not magic, it’s Go mod’s replace semantics at work, a feature that is easy to overlook.
replace directives: (https://golang.org/ref/mod#go-mod-file-replace)
In fact, it’s very simple: replace directives only take effect in the go.mod of the main module (i.e. your current project). This can be summarized as follows.
- a replace in the main module’s go.mod has no effect when that module is built as a dependency of another module
- a replace in a dependency’s go.mod has no effect when building the main module
So, after the replace, the kuberesolver replace in the go.mod of thanos, which is only a dependency of cortex, does not take effect. Let’s look at the dependency tree.
- main module cortex => require github.com/weaveworks/common v0.0.0-20210419092856-009d1eebd624
- weaveworks => require github.com/sercand/kuberesolver v2.1.0+incompatible
- So overall kuberesolver is now only v2.1.0
This is consistent with Go mod’s replace semantics; in other words, the build failure after the replace is the correct behavior.
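A toy model can make this rule concrete. The sketch below is not how the go command is implemented: the `modFile` type, the `selected` function, the shortened module names and the placeholder `v0.0.0` versions are all made up for illustration, and the thanos entry assumes that its go.mod pins kuberesolver only through replace and reaches weaveworks/common directly.

```go
package main

import "fmt"

// modFile is a toy stand-in for a go.mod file.
type modFile struct {
	require map[string]string // module -> required version
	replace map[string]string // module -> replacement version
}

// Hypothetical, heavily trimmed go.mod contents, for illustration only.
var mods = map[string]modFile{
	"cortex": {require: map[string]string{
		"thanos":            "v0.0.0",
		"weaveworks/common": "v0.0.0",
	}},
	"thanos": {
		require: map[string]string{"weaveworks/common": "v0.0.0"},
		// Assumed: thanos pins kuberesolver via replace, not require.
		replace: map[string]string{"kuberesolver": "v2.4.0"},
	},
	"weaveworks/common": {require: map[string]string{"kuberesolver": "v2.1.0"}},
}

// selected returns the version chosen for every module reachable from main.
// The toy graph never requires one module at two different versions, so no
// version comparison is needed here.
func selected(main string) map[string]string {
	out := map[string]string{}
	seen := map[string]bool{main: true}
	queue := []string{main}
	for len(queue) > 0 {
		m := queue[0]
		queue = queue[1:]
		for dep, ver := range mods[m].require {
			out[dep] = ver
			if !seen[dep] {
				seen[dep] = true
				queue = append(queue, dep)
			}
		}
	}
	// Only the main module's replace directives are applied; replace
	// directives found in any other go.mod are ignored.
	for dep, ver := range mods[main].replace {
		if _, ok := out[dep]; ok {
			out[dep] = ver
		}
	}
	return out
}

func main() {
	fmt.Println("main module thanos:", selected("thanos")["kuberesolver"])
	fmt.Println("main module cortex:", selected("cortex")["kuberesolver"])
}
```

With thanos as the main module the sketch selects kuberesolver v2.4.0; with cortex as the main module the same replace is ignored and only v2.1.0 remains.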
Spoofing gomod
What is even more surprising is that cortex compiles at all when it simply requires thanos, given that failing to compile is the correct outcome under Go mod’s replace semantics.
According to the documentation, replace only works in the main module and is ignored everywhere else, so the kuberesolver replace in thanos should never have applied to cortex in the first place.
I did an experiment at https://github.com/georgehao/gomodtestmain, for those interested, to verify that Go mod follows the replace semantics and the MVS (Minimal Version Selection) algorithm.
The problem is basically at an impasse, so how do we break it?
Let’s use the go mod graph command to look at the dependency tree between cortex and thanos.
Since this dependency tree is too long (700+ lines), I won’t post it, but you can see that cortex depends on N different versions of thanos, and we found an interesting thing in the go.mod of the last version.
That is, a very old version of thanos has kuberesolver@v2.4.0 in the require section of its go.mod, so Go mod ends up treating the thanos that cortex depends on as if it still required kuberesolver@v2.4.0. Although thanos has since been changed to use replace for kuberesolver, cortex still compiles without any problem.
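The effect of that old go.mod can be reproduced with a small sketch of minimal version selection. Everything here is illustrative rather than taken from the real projects: the `graph` type, the `demoGraph` and `buildList` helpers, the shortened module names and the `v0.x` version numbers are hypothetical, and the sketch ignores details of the real go command such as module graph pruning. The second scenario assumes the local replace of thanos has no version on its left-hand side, in which case it applies to every version of thanos, including the old one.

```go
package main

import (
	"fmt"
	"strings"
)

// graph maps a "module@version" node to the nodes its go.mod requires,
// loosely imitating the edges printed by `go mod graph`.
type graph map[string][]string

// newer reports whether version a is higher than b (vMAJOR.MINOR.PATCH).
func newer(a, b string) bool {
	var x, y [3]int
	fmt.Sscanf(a, "v%d.%d.%d", &x[0], &x[1], &x[2])
	fmt.Sscanf(b, "v%d.%d.%d", &y[0], &y[1], &y[2])
	for i := range x {
		if x[i] != y[i] {
			return x[i] > y[i]
		}
	}
	return false
}

// buildList visits every node reachable from root and keeps, for each
// module, the highest version that any reachable go.mod requires.
func buildList(g graph, root string) map[string]string {
	selected := map[string]string{}
	seen := map[string]bool{root: true}
	stack := []string{root}
	for len(stack) > 0 {
		n := stack[len(stack)-1]
		stack = stack[:len(stack)-1]
		for _, dep := range g[n] {
			parts := strings.SplitN(dep, "@", 2)
			mod, ver := parts[0], parts[1]
			// The main module itself is never downgraded to an old version.
			if mod != root {
				if cur, ok := selected[mod]; !ok || newer(ver, cur) {
					selected[mod] = ver
				}
			}
			if !seen[dep] {
				seen[dep] = true
				stack = append(stack, dep)
			}
		}
	}
	return selected
}

// demoGraph builds a hypothetical cortex/thanos cycle. With localReplace
// set, every thanos version uses the go.mod of the local copy, which has
// no kuberesolver requirement (assumed: a version-less replace).
func demoGraph(localReplace bool) graph {
	g := graph{
		"cortex":                   {"thanos@v0.2.0", "weaveworks/common@v0.1.0"},
		"thanos@v0.2.0":            {"cortex@v0.1.0"},
		"cortex@v0.1.0":            {"thanos@v0.1.0"},
		"thanos@v0.1.0":            {"kuberesolver@v2.4.0"}, // old go.mod with require
		"weaveworks/common@v0.1.0": {"kuberesolver@v2.1.0"},
	}
	if localReplace {
		g["thanos@v0.1.0"] = g["thanos@v0.2.0"]
	}
	return g
}

func main() {
	fmt.Println("require thanos:", buildList(demoGraph(false), "cortex")["kuberesolver"])
	fmt.Println("replace thanos:", buildList(demoGraph(true), "cortex")["kuberesolver"])
}
```

In the first scenario the old thanos node contributes kuberesolver v2.4.0 and wins the selection; in the second that edge is gone and only v2.1.0 from weaveworks remains, which matches the build failure seen after the replace.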
Is this a gomod bug?
Why does cortex depend on so many versions of thanos? This goes back to the opening question about the reuse of cortex and thanos functionality.
Currently, the two projects, cortex and thanos, basically depend on each other as follows:
The cross-references between cortex and thanos, like Russian nesting dolls, are a nightmare for Go mod. Surprisingly, these two nesting dolls managed to get around Go mod’s replace semantics.
How to solve
Once you know the root cause, making cortex work with a replaced thanos is simple. There are two ways to do it.
- Because of Go mod’s MVS algorithm, we can directly require kuberesolver v2.4.1 in the main project, cortex.
- Option 1 only works for dependencies that stay backward compatible; if a project does not guarantee that, it may cause problems. The more direct solution is to modify the thanos go.mod, moving the kuberesolver that thanos depends on from replace to require.
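As a rough sketch, the two options could look like the go.mod fragments below. The exact lines in the real cortex and thanos go.mod files are not reproduced here: the version numbers are the ones mentioned in this post, and the `+incompatible` suffix on them is an assumption made by analogy with the v2.1.0 requirement shown earlier.

```
// Option 1, in the cortex go.mod (main module):
// require kuberesolver directly so that MVS selects it.
require github.com/sercand/kuberesolver v2.4.1+incompatible

// Option 2, in the thanos go.mod:
// drop the kuberesolver replace directive and require the version instead.
require github.com/sercand/kuberesolver v2.4.0+incompatible
```

Option 2 makes the requirement visible to every module that depends on thanos, which is why it does not rely on an old thanos version staying in the dependency graph.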