Technology with opinion

Tuesday, May 15, 2007

Review of SCM & VCS

The need for both Version Control Systems (VCS) & Software Configuration Management (SCM) has existed since the beginning of programming. In recent years, however, more mature products have hit the market and are worth considering.

  1. CVS - old school basic VCS. Not much to delve into here, a quality product. It is a client-server system, which is a plus. It has no SCM capabilities, a basic security implementation & is known to have encoding issues across platforms. Overall a decent VCS.
  2. SVN (Subversion) - seen as the successor to CVS. Architecture is also client-server, can use HTTP through Apache. No SCM & basic security implementation. A quality VCS.
  3. VSS (Visual SourceSafe) - basic VCS, some basic security. A flawed design requires users to have read/write access to the repository directories, leaving the repository open to tampering. Architecture is client based. Probably the worst of all offerings; however, it has good integration into Visual Studio, is cheap & is better than nothing.
  4. StarTeam - my personal favorite, works with Eclipse, Visual Studio, Togethersoft & many other IDEs. Provides full VCS & SCM. This is a very mature product. Integration is seamless & is more intuitive than MS TFS (Team Foundation Server).
  5. TFS (Team Foundation Server) - I really wanted to like this one but I didn't. The integration into Visual Studio seems cumbersome, not as seamless as VSS or StarTeam, & the bug tracking was horrible. As for the other features, if I cannot get past the bug tracking then it cannot be seriously considered an SCM product. This is a version 1 Microsoft product; hopefully it will become more mature over the years. Our QA personnel at the time also despised the product: installing Visual Studio on a QA person's laptop seems backwards.
There are many other products out there, but I think I covered the main ones. I would be more than happy to review any others that are recommended.

Results
The winner here is Borland's StarTeam. It is a mature product, has good integration into a wide variety of IDEs and is both a quality SCM & VCS. Whether you are a heads-down coder, PM or QA guy, you cannot go wrong here. I also want to give a plug to another great product from Borland, Togethersoft. It integrates into the major IDEs & provides object modeling, data modeling as well as code refactoring and generation. I cannot say enough about this product.

For basic VCS needs CVS, SVN or VSS all fit the bill; each is better than nothing and costs little or nothing. Just make sure that you still have a backup plan regardless of which product you decide to deploy.

Friday, February 16, 2007

Regex vs string.replace() Python vs PHP

RegEx, or regular expressions, have become popular in nearly every programming language. A RegEx is a special string that describes a pattern for matching other strings. It's wonderful for matching sequences of characters, especially if the logic for the match is somewhat complex. RegEx can greatly increase the performance of an app, or slow it down & obfuscate it.

Before deciding to use a RegEx, first figure out what you want to do. Let's take two common & basic tasks: input validation of email & replacing a matching string.

First, since the rules for email can be somewhat complicated (for string matching) this would make a good candidate for RegEx, especially if the validation is taking place on the client (who cares about client CPU cycles?).
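To make the email case concrete, here is a minimal sketch in Python 3. The pattern is a hypothetical, deliberately simplified one (it is not the 60-char expression that was benchmarked, which isn't shown here), and the name is_valid_email is made up for illustration. Real address rules are looser than this.

```python
import re

# Hypothetical, simplified pattern: local part, "@", dotted domain, 2+ letter TLD.
# It is an assumption for illustration, not the expression used in the benchmark.
EMAIL_RE = re.compile(
    r"[A-Za-z0-9._%+-]+@[A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)*\.[A-Za-z]{2,}"
)

def is_valid_email(address):
    """Return True if the whole string matches the simplified pattern."""
    return EMAIL_RE.fullmatch(address) is not None

print(is_valid_email("user@example.com"))   # True
print(is_valid_email("user@@example.com"))  # False
```

Doing the same check with plain string methods would take a pile of find() and split() calls, which is why this kind of task leans toward RegEx.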

Benchmark results of 10K operations (smaller number is better)
Large String Length: 65181 chars
Small String Length: 97 chars
Simple String: 45 chars
Complex RegEx: email validation, a 60-char RegEx
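For anyone who wants to rerun this kind of comparison, a rough sketch of a harness in Python 3 using timeit follows. The test strings, the needle and the function names are all assumptions made up for the example; the actual inputs behind the numbers in this post are not shown, so don't expect the timings to line up.

```python
import re
import timeit

# Hypothetical inputs, roughly the sizes quoted above; the post does not
# show the actual strings or patterns that were measured.
SMALL_TEXT = "the quick brown fox jumps over the lazy dog. " * 2
LARGE_TEXT = SMALL_TEXT * 700
NEEDLE, REPLACEMENT = "lazy", "sleepy"

PATTERN = re.escape(NEEDLE)      # literal text, escaped for use as a RegEx
COMPILED = re.compile(PATTERN)   # compiled once, outside the timed loop

def with_replace(text):
    return text.replace(NEEDLE, REPLACEMENT)

def with_compiled(text):
    return COMPILED.sub(REPLACEMENT, text)

def with_uncompiled(text):
    # Module-level call: the pattern string is handed over on every call.
    return re.sub(PATTERN, REPLACEMENT, text)

def bench(func, text, number):
    """Return the total seconds taken by `number` calls of func(text)."""
    return timeit.timeit(lambda: func(text), number=number)

# Reduced loop counts keep the demo quick; raise them to 10_000 to mirror the post.
for label, text, number in (("small", SMALL_TEXT, 2000), ("large", LARGE_TEXT, 20)):
    for func in (with_replace, with_compiled, with_uncompiled):
        print(f"{label:5s} {func.__name__:16s} {bench(func, text, number):.6f}s")
```

All three functions return the same string, so the only thing being compared is the time each approach takes.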

Python:
Testing 100K Loop
0:00:00.025730

Small simple replacement
string.replace()
0:00:00.032380

RegEx (compiled)
0:00:00.055416

RegEx (uncompiled)
0:00:00.054967

Large simple replacement
string.replace()
0:00:04.781832

RegEx (compiled)
0:00:03.997719

RegEx (uncompiled)
0:00:03.141389

Complicated RegEx
compiled
0:00:00.034440

uncompiled
0:00:00.080525

PHP
Testing 100K Loop:
0.0390350818634

Small simple replacement
string replace()
0.0481541156769

RegEx
0.0470488071442

Large simple replacement
string replace()
1.68269109726

RegEx (uncompiled)
1.76776599884

Complicated RegEx
0.0417048931122

Conclusion
Overall
Comparing language to language, Python is faster in most categories. The exceptions are the small RegEx replacement, where PHP is slightly ahead, and the case where the replace expression is simple & the target string is large, where PHP won by a wide margin. As you can see there is no one-size-fits-all rule; I would expect every language to have assorted results. In preliminary results with C# (.Net 2.0) the RegEx seemed very inefficient; I hope to post results later.

As for precompiling RegEx, it only makes a difference if your expression is complicated; the more complicated your RegEx is, the bigger the performance boost you get. As these tests show, precompiling your RegEx is not always faster.
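What "compiled" versus "uncompiled" means in Python can be sketched as below (Python 3). The pattern is a hypothetical stand-in for a complicated expression and the function names are invented for the example. One thing worth knowing: the re module keeps an internal cache of recently used patterns, which may be one reason the uncompiled calls are not always slower. The third function calls re.purge() to empty that cache and force a fresh compile each time, as a worst case.

```python
import re
import timeit

# Hypothetical complicated pattern, standing in for the benchmarked email RegEx.
PATTERN = r"[A-Za-z0-9._%+-]+@[A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)*\.[A-Za-z]{2,}"
EMAIL_RE = re.compile(PATTERN)  # compiled once, reused on every call

def match_compiled(text):
    return EMAIL_RE.fullmatch(text) is not None

def match_uncompiled(text):
    # Module-level call: re looks the pattern string up in its cache each time.
    return re.fullmatch(PATTERN, text) is not None

def match_recompiled(text):
    # Worst case: clear the cache so the pattern has to be compiled again.
    re.purge()
    return re.fullmatch(PATTERN, text) is not None

for func in (match_compiled, match_uncompiled, match_recompiled):
    seconds = timeit.timeit(lambda: func("user@example.com"), number=1000)
    print(f"{func.__name__:18s} {seconds:.6f}s")
```

The absolute numbers will depend on the machine and the Python version, so treat the output as a way to see the relative cost and not as a rerun of the results above.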

Note: PHP doesn't support Unicode strings or precompiled RegEx. Python supports both, however Unicode RegEx results seemed fairly slow.

In Python
If the target string is small & your matching expression is simple, use string.replace(). If the target string is much larger, use RegEx uncompiled. With a complicated RegEx, use RegEx compiled.

In PHP
For small target strings with a simple matching expression RegEx is slightly faster, but the differences are negligible. For larger target strings with simple matching, string replace (str_replace()) is a bit faster.