First of all: do not be misled by the raw number of lines of code. Most of them are dependencies in the vendor folder (utilities, client libraries, gRPC, etcd, etc.), which are not part of the main logic.
Original LoC analysis with cloc
To put things in perspective, for Kubernetes:

    $ cloc kubernetes --exclude-dir=vendor,_vendor,build,examples,docs,Godeps,translations
        7072 text files.
        6728 unique files.
        1710 files ignored.

    github.com/AlDanial/cloc v 1.70  T=38.72 s (138.7 files/s, 39904.3 lines/s)
    ------------------------------------------------------------------
    Language                 files       blank     comment        code
    ------------------------------------------------------------------
    Go                        4485      115492      139041     1043546
    JSON                        94           5           0      118729
    HTML                         7         509           1       29358
    Bourne Shell               322        5887       10884       27492
    YAML                       244         374         508       10434
    JavaScript                  17        1550        2271        9910
    Markdown                    75        1468           0        5111
    Protocol Buffers            43        2715        8933        4346
    CSS                          3           0           5        1402
    make                        45         346         868         976
    Python                      11         202         305         958
    Bourne Again Shell          13         127         213         655
    sed                          6           5          41         152
    XML                          3           0           0          88
    Groovy                       1           2           0          16
    ------------------------------------------------------------------
    SUM:                      5369      128682      163070     1253173
    ------------------------------------------------------------------
For Docker (and not Swarm or Swarm mode, since this repository includes more features, such as volumes, networking, and plugins, that are not part of those repositories). Projects such as Machine, Compose, and libnetwork are not included, so in practice the whole Docker platform may amount to many more LoC:
    $ cloc docker --exclude-dir=vendor,_vendor,build,docs
        2165 text files.
        2144 unique files.
         255 files ignored.

    github.com/AlDanial/cloc v 1.70  T=8.96 s (213.8 files/s, 30254.0 lines/s)
    ------------------------------------------------------------------
    Language                 files       blank     comment        code
    ------------------------------------------------------------------
    Go                        1618       33538       21691      178383
    Markdown                   148        3167           0       11265
    YAML                         6         216         117        7851
    Bourne Again Shell          66         838         611        5702
    Bourne Shell                46         768         612        3795
    JSON                        10          24           0        1347
    PowerShell                   2          87         120         292
    make                         4          60          22         183
    C                            8          27          12         179
    Windows Resource File        3          10           3          32
    Windows Message File         1           7           0          32
    vim script                   2           9           5          18
    Assembly                     1           0           0           7
    ------------------------------------------------------------------
    SUM:                      1915       38751       23193      209086
    ------------------------------------------------------------------
Please note that these are very rough estimates made with cloc. A deeper analysis may be worthwhile.
Roughly speaking, the project itself seems to account for about half of the LoC (~1250K LoC) mentioned in the question (whether dependencies should count is subjective).
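To make the comparison concrete, here is a small Python sketch of the arithmetic on the two cloc runs. The figures are copied from the SUM and Go rows of the reports above; the variable and function names are only illustrative. By these numbers the Kubernetes repository is roughly six times the size of the Docker one, and Go makes up more than 80% of the code in both.

```python
# Code-line counts copied from the Go and SUM rows of the two cloc reports.
kubernetes = {"Go": 1043546, "total": 1253173}
docker = {"Go": 178383, "total": 209086}


def share(part, whole):
    """Return `part` as a percentage of `whole`."""
    return 100.0 * part / whole


# Share of each repository's code lines that is written in Go.
k8s_go_share = share(kubernetes["Go"], kubernetes["total"])
docker_go_share = share(docker["Go"], docker["total"])

# How many times larger Kubernetes is than Docker, overall and for Go only.
size_ratio = kubernetes["total"] / docker["total"]
go_ratio = kubernetes["Go"] / docker["Go"]

print(f"Go share of Kubernetes: {k8s_go_share:.1f}%")
print(f"Go share of Docker:     {docker_go_share:.1f}%")
print(f"Kubernetes / Docker (all code): {size_ratio:.2f}x")
print(f"Kubernetes / Docker (Go only):  {go_ratio:.2f}x")
```

These ratios inherit every caveat of the cloc runs themselves: different exclude lists were used for the two repositories, so treat them as orders of magnitude rather than precise measurements.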
What is included in Kubernetes that makes it so big?
Most of the bloat comes from libraries supporting various cloud providers, either to ease deployment on their platforms or to support specific features (volumes, etc.) through plugins. It also has a lot of examples that should be left out of the line count. A fair LoC assessment needs to exclude a lot of unnecessary documentation and example directories.
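To show how much the choice of excluded directories matters, here is a minimal Python sketch of a line counter that prunes directories by name. It is not how cloc works internally; the function name `count_lines` and the `EXCLUDED_DIRS` set are my own, with the set copied from the exclude list used for Kubernetes above. Unlike cloc, it counts every physical line, including blank and comment lines.

```python
import os

# Directory names to skip (assumption: same list as the cloc run for Kubernetes).
EXCLUDED_DIRS = {"vendor", "_vendor", "build", "examples", "docs",
                 "Godeps", "translations"}


def count_lines(root, suffix=".go", excluded=EXCLUDED_DIRS):
    """Count physical lines in files ending with `suffix` under `root`,
    skipping every directory whose name appears in `excluded`."""
    total = 0
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune in place so os.walk never descends into excluded directories.
        dirnames[:] = [d for d in dirnames if d not in excluded]
        for name in filenames:
            if name.endswith(suffix):
                path = os.path.join(dirpath, name)
                # Read as bytes so unusual encodings cannot raise decode errors.
                with open(path, "rb") as handle:
                    total += sum(1 for _ in handle)
    return total
```

Assuming a checkout in the current directory, comparing `count_lines("kubernetes")` with `count_lines("kubernetes", excluded=set())` gives a quick feel for how much of the tree is vendored code, examples, and documentation.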
It also has far more features than Docker Swarm, Nomad, or Dokku, to name a few. It supports advanced networking scenarios, has load balancing built in, and includes PetSets, Cluster Federation, volume plugins, and other features that other projects do not yet support.
It supports several container engines, so it is not limited to Docker containers and can run other engines (for example, rkt).
Much of the core logic involves interacting with other components (key-value stores, client libraries, plugins, and so on), which goes well beyond simple scripts.
Distributed systems are notoriously complex, and Kubernetes seems to support most of the tooling from the key players in the container industry without compromise (where other solutions do make such compromises). As a result, the project can look artificially bloated and too big for its core mission (deploying containers at scale). In reality, these statistics are not that surprising.
Main idea
Comparing Kubernetes with Docker or Dokku does not really work. The project is much larger and includes many more features, as it is not limited to the Docker family of tools.
While Docker has many features scattered across several libraries, Kubernetes has everything under its main repository (which significantly increases the number of lines, but also explains the popularity of the project).
Given this, LoC statistics are not surprising.