SVN performance after many changes

My project currently uses an SVN repository that receives several hundred new commits per day. The repository is hosted on a Win2k3 server and served through Apache/mod_dav_svn.

I'm now afraid that performance will degrade over time as the number of revisions grows.
Is this fear reasonable?
We are already planning to upgrade to 1.5, so having thousands of files in one directory will not be a problem in the long run.

Subversion stores the delta (differences) between two revisions, which saves a lot of space, especially if you only commit code (text) and have no binary files (images and documents).

Does this mean that to check out version 10 of the file foo.baz, SVN will take revision 1 and then apply deltas 2-10?

+49
performance repository svn diff delta
Sep 24 '08 at 15:00
9 answers

What type of repo do you have? FSFS or BDB?

(I'll assume FSFS from here on, since that's the default.)

With FSFS, each revision is stored as a diff against the previous one. So you'd think that yes, after many changes it would get very slow.

However, it isn't. FSFS uses so-called "skip deltas" to avoid needing too many hops back through previous revisions.

(So if you are using an FSFS repository, Brad Wilson's answer is wrong.)

With a BDB repository, the HEAD (latest) revision is stored as full-text, and earlier revisions are built as a series of diffs against HEAD. This means the deltas for previous revisions have to be recomputed after each commit.

For more information: http://svn.apache.org/repos/asf/subversion/trunk/notes/skip-deltas
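
To get a feel for why this stays fast, here is a toy model of the scheme described in the linked notes (my own sketch, not SVN code): the delta base for change n is n with its lowest set bit cleared, so reconstructing any change takes popcount(n) delta applications instead of n.

    def skip_delta_base(n):
        # Per the skip-deltas notes: delta change n against the number
        # obtained by clearing its lowest set bit (54 -> 52 -> 48 -> 32 -> 0).
        return n & (n - 1)

    def chain_length(n):
        """Delta applications needed to rebuild change n from the full-text at 0."""
        hops = 0
        while n > 0:
            n = skip_delta_base(n)
            hops += 1
        return hops

    # Plain one-after-another deltas would need n applications;
    # skip deltas need only O(log n):
    for n in (10, 1000, 100000):
        print(f"change {n}: linear {n}, skip {chain_length(n)}")
    # change 10: linear 10, skip 2
    # change 1000: linear 1000, skip 6
    # change 100000: linear 100000, skip 6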

P.S. Our repo is about 20 GB, with approximately 35,000 revisions, and we haven't noticed any performance degradation.

+58
Sep 25 '08 at 1:54

Subversion stores the latest version as full-text, with backward diffs. This means that operations against HEAD are always fast, and what you gradually pay for is looking further and further back into history.
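
As a rough illustration of that trade-off (my own toy cost model, not SVN code): with backward deltas, materializing a revision costs one delta application per step away from HEAD.

    def reconstruction_cost(rev, head):
        # Backward-delta model: HEAD is full-text; every older revision
        # adds one more delta application.
        return head - rev

    head = 35000
    print(reconstruction_cost(head, head))       # 0: checking out HEAD is cheap
    print(reconstruction_cost(head - 10, head))  # 10: recent history is cheap
    print(reconstruction_cost(1, head))          # 34999: deep history costs the most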

+16
Sep 24 '08 at 15:14

I personally have not dealt with Subversion repositories with codebases larger than 80K LOC for the actual project. The largest repository I've actually had was about 1.2 gigs, but that included all the libraries and utilities the project uses.

I don't think daily use will be affected much, but anything that needs to look across many revisions may slow down a little. It may not even be noticeable.

Now, from a sysadmin perspective, there are a few things that can help you minimize performance bottlenecks. Since Subversion is mostly a file-system-based system, you can:

  • Put the actual repositories on a separate drive
  • Make sure no file-locking applications other than svn are working on that drive
  • Make the drives at least 7,500 RPM. You could try to get 10,000 RPM, but that may be overkill
  • Upgrade the LAN to gigabit if everyone is in the same office

This may be overkill for your situation, but it is what I have usually done for other file-intensive applications.

If you ever outgrow Subversion, Perforce would be your next step. It is hands down the fastest source-control application for very large projects.

+5
Sep 24 '08 at 15:17

We run a Subversion server with gigabytes of code and binary files and more than twenty thousand revisions. No slowdowns yet.

+4
Sep 24 '08 at 15:20

Subversion only stores the delta (differences) between two revisions, so it saves a lot of space, especially if you only commit code (text) and have no binary files (images and documents).
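
As a quick illustration of the space savings for text (a self-contained sketch using Python's difflib, not SVN's actual delta format): a one-line change to a 1,000-line file produces a delta that is a tiny fraction of the full-text size.

    import difflib

    old = ["line %d\n" % i for i in range(1000)]
    new = list(old)
    new[500] = "changed line\n"

    delta = difflib.unified_diff(old, new, lineterm="\n")
    print(len("".join(old)), "bytes as full text")
    print(len("".join(delta)), "bytes as a delta")  # just the changed hunk plus context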

Besides that, I have seen many very large projects using svn and never heard complaints about performance.

Are you worried about checkout times? Then I think it would really be a network problem.

Oh, and I have worked on CVS repositories with 2 GB+ of stuff (code, images, docs) and never had a performance problem. Since svn is a big improvement over cvs, I don't think you should worry.

Hope this puts your mind at ease a little ;)

+3
Sep 24 '08 at 15:04

I don't think our Subversion has slowed down with age. We currently have several terabytes of data, mostly binary, and we check out/commit up to 50 GB of data daily. In total we now have 50,000 revisions. We use FSFS as the storage type and connect either directly via svn:// (Windows server) or through Apache mod_dav_svn (Gentoo Linux server).

I can't confirm that it slows down over time: we set up a clean server to compare performance against, and we were not able to measure any significant degradation.

However, I have to say that our Subversion is unusually slow by default, and apparently it is Subversion itself, as we have tried a different computer system.

For some unknown reason, Subversion seems to be completely limited by server CPU. Our checkout/commit rates are capped at 15-30 MB/s per client, because at that point one server CPU core is completely used up. This is the same for a nearly empty repository (1 GB, 5 revisions) as for our full server (~5 TB, 50,000 revisions). Tuning, such as setting compression to 0 (off), did not improve this.
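
For reference, a minimal sketch of how one might measure per-client checkout throughput (the repository URL and paths are placeholders; it just times a fresh svn checkout and divides by the bytes received):

    import os
    import shutil
    import subprocess
    import time

    REPO_URL = "svn://server/repo/trunk"  # placeholder URL
    WC_DIR = "wc-bench"

    # Start from scratch so we time a full checkout.
    if os.path.exists(WC_DIR):
        shutil.rmtree(WC_DIR)

    start = time.monotonic()
    subprocess.run(["svn", "checkout", "--quiet", REPO_URL, WC_DIR], check=True)
    elapsed = time.monotonic() - start

    # Sum the payload size, skipping the .svn administrative area.
    total = sum(
        os.path.getsize(os.path.join(root, name))
        for root, dirs, files in os.walk(WC_DIR)
        if ".svn" not in root
        for name in files
    )
    print(f"{total / elapsed / 1e6:.1f} MB/s over {elapsed:.1f} s")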

Our high-throughput FC array (it delivers ~1 GB/s) is idle, the remaining cores are idle, and the network (currently 1 Gbit/s for the clients, 10 Gbit/s for the server) is idle as well. Okay, not completely idle, but if only 2-3% of the available capacity is used, I call it idle.

It's no fun watching all these components sit idle while we wait for our working copies to check out or commit. Basically, I have no idea what the server process is doing while it completely consumes one CPU core throughout every checkout/commit.

For now, I'm still trying to find a way to tune Subversion. If that is not possible, we may need to switch to another system.

Therefore, my answer: SVN does not degrade in performance; it is slow to begin with.

Of course, if you do not need (high) performance, you will not have a problem. By the way, all of the above applies to the earlier stable version 1.7.

+3
Nov 18 '13 at 16:54

The only operations likely to slow down are things that read information from many revisions (for example, svn blame).

+2
Sep 24 '08 at 15:10

I'm not sure... I am using SVN with Apache on CentOS 5.2, and it worked fine. Then the revision number reached 8230 or so, and on all client machines commits became so slow that we had to wait at least 2 minutes for a 1 KB file. I'm talking about a single file of tiny size.

Then I created a new repository, starting at revision 1. Now it works fine. Fast. I used svnadmin create xxxxxx and did not check whether it is FSFS or BDB...
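
In case it helps: the backend is recorded in the repository's db/fs-type file, so you can check after the fact (a minimal sketch; "xxxxxx" stands for the repository path above):

    # db/fs-type holds the backend name: "fsfs" or "bdb".
    with open("xxxxxx/db/fs-type") as f:
        print(f.read().strip())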

-1
Apr 17 '09 at 18:52

Perhaps you should consider improving your workflow.

I don't know whether the repository will have performance problems under these conditions, but your ability to get back to a sane revision will.

In your case, you could set up a validation process: each team takes responsibility for its own repo and commits to its team manager's repository, who in turn commits to a clean, read-only company repository. At that stage you make a clean selection of which commits should go up.

That way, anyone can get back to a clean copy and browse the history easily. Merging is much easier, and developers can still make as much of a mess as they want.

-2
Feb 26 '10 at 9:07


