Subversion Network Protocol De-constructed
So I have been chewing on the FUD on the Subversion site lately. Now don't get me wrong, I think Subversion has many positives going for it (more on this later), but what beats me is the following: for an SCM whose sole objective was to be a replacement for CVS, how can it regress on things that CVS does well? I am not making this up, see http://subversion.tigris.org/faq.html#why :
Why does this project exist?
To take over the CVS user base. Specifically, we're writing a new
version control system that is very similar to CVS, but fixes many
things that are broken. See our front page.
For instance :
- CVS opens a single client-server connection for any operation. In contrast, Subversion is TCP connection hungry. Using the SVN RA protocol, a checkout can trigger 4-5 new TCP connections. On a WAN, if you have a Subversion repository being accessed from, say, China to the USA, this can be a killer: setting up each TCP connection can take 1.5 round trips. The SVN DeltaV HTTP protocol fares better, but it too can open multiple TCP connections per operation.
- Chatty protocol. And I thought CVS was chatty; wait till you sniff the Subversion network protocol. With the SVN RA protocol I noticed repeated invocations on new connections, and on each connection the authentication happens again (if you are using SSH, this really slows you down). Commands like get-latest-rev and set-path are sent multiple times. Sometimes the same command is sent multiple times on the same connection. Harmless? It's not an error, yeah, but why be so wasteful? I mean, it's not like this is legacy code that was written 20 years ago. Why not have a clean network protocol model?
- What's with this Lisp-like list syntax on the wire:
( 2 ( edit-pipeline ) 14:svn://zen/mod )
Why not use a more compact representation, like CVS does? I used to complain that the CVS protocol takes 4-5 round trips to execute a command; Subversion can literally take 14-15 round trips in a checkout. Agreed, you don't check out the whole repository often, but why the sloppiness in the initial design?
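To put rough numbers on the round-trip complaint, here is a back-of-the-envelope sketch. The connection counts, the 1.5 round trips per TCP setup, and the command round-trip counts are the figures quoted above; the 250 ms round-trip time is purely my assumption for a China-to-USA link, and I am also assuming that the command round trips are counted separately from connection setup. The function name `wan_latency_ms` is made up for illustration.

```python
def wan_latency_ms(connections, command_round_trips, rtt_ms, setup_round_trips=1.5):
    """Rough lower bound on time spent waiting on the network.

    Ignores payload transfer, authentication exchanges, and server think time.
    """
    total_round_trips = connections * setup_round_trips + command_round_trips
    return total_round_trips * rtt_ms


RTT_MS = 250  # assumed round-trip time, not a measurement

# CVS: one connection, about 5 round trips per command (figures from the post)
cvs_ms = wan_latency_ms(connections=1, command_round_trips=5, rtt_ms=RTT_MS)

# Subversion over SVN RA: about 5 connections and 15 round trips in a checkout
svn_ms = wan_latency_ms(connections=5, command_round_trips=15, rtt_ms=RTT_MS)

print(f"CVS: {cvs_ms:.0f} ms of pure latency")
print(f"SVN: {svn_ms:.0f} ms of pure latency")
```

Under those assumptions the latency floor is roughly 1.6 seconds for CVS against 5.6 seconds for Subversion, before a single byte of file content moves. Plug in your own measured round-trip time to see what it means for your link.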
( CRAM-MD5 ( ) )
38:rachel 6195af60930ad5367267948ddacf30ca
( get-latest-rev ( ) )
( update ( ( 19 ) 0: true ) )
( set-path ( 0: 16 false ( ) ) ) ( set-path ( 11:dir1/f1.txt 19 false ( ) ) ) ( set-path ( 9:dir1/foo2 19 false ( ) ) ) ( finish-report ( ) )
( success ( ) )
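For anyone who wants to poke at captured traffic like the trace above, here is a minimal sketch of a tokenizer for this syntax. It is based only on what the trace shows (parenthesized lists, bare words, numbers, and length-prefixed strings such as `11:dir1/f1.txt`), so treat it as an assumption-laden toy rather than an implementation of the real protocol. The name `parse_items` is hypothetical.

```python
def parse_items(data: bytes, pos: int = 0):
    """Parse items until a closing ')' or end of input.

    Returns (items, next_pos). Lists become Python lists, numbers become
    ints, length-prefixed strings become bytes, bare words become str.
    """
    items = []
    n = len(data)
    while pos < n:
        ch = data[pos:pos + 1]
        if ch.isspace():
            pos += 1
        elif ch == b"(":
            sub, pos = parse_items(data, pos + 1)
            items.append(sub)
        elif ch == b")":
            return items, pos + 1
        elif ch.isdigit():
            end = pos
            while end < n and data[end:end + 1].isdigit():
                end += 1
            number = int(data[pos:end])
            if data[end:end + 1] == b":":
                # "<len>:<bytes>" is a string; take exactly <len> bytes
                start = end + 1
                items.append(data[start:start + number])
                pos = start + number
            else:
                items.append(number)
                pos = end
        else:
            # bare word such as set-path, true, false
            end = pos
            while end < n and not data[end:end + 1].isspace() \
                    and data[end:end + 1] not in (b"(", b")"):
                end += 1
            items.append(data[pos:end].decode("ascii"))
            pos = end
    return items, pos


line = b"( set-path ( 11:dir1/f1.txt 19 false ( ) ) ) ( finish-report ( ) )"
commands, _ = parse_items(line)
for name, args in commands:
    print(name, args)
```

Counting how often each command name shows up per connection is then a one-liner over `commands`, which is a quick way to check the repeated get-latest-rev and set-path calls for yourself.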
If the goal was to build a better CVS, the Subversion team should have started by looking at how CVS deals with the network protocol and learned from it.
More ramblings on Subversion to follow soon.