Plone Metrics: November 2007

Thursday, November 29, 2007

Regression Coefficients

I shared the graph of Plone site development at Sandia with a coworker and she asked what the numbers in the upper right-hand corner meant. So I figured that might be a common question and it would be worthwhile to discuss it here in today's posting.

The R-squared is the coefficient of determination, which describes how much of the variability (the "wiggliness" among the data points) is explained by the linear equation. In this case it is over 98%, meaning that about 98% of the "noise" in the data can be explained by the linear trend. Note that R-squared is not itself a significance level; the usual thresholds of significant (>95%) and highly significant (>99%) belong to a separate test of whether the slope differs from zero.
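As a rough illustration of where an R-squared value comes from, here is a minimal sketch of an ordinary least-squares fit in plain Python. The data points are made up for the example (they are not the Sandia site data), and the function names are my own.

```python
def linear_fit(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def r_squared(xs, ys, slope, intercept):
    """Fraction of the variance in ys explained by the fitted line."""
    mean_y = sum(ys) / len(ys)
    ss_total = sum((y - mean_y) ** 2 for y in ys)
    ss_residual = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    return 1.0 - ss_residual / ss_total


# Made-up data: days since the first site vs. cumulative site count.
days = [0, 40, 75, 110, 150, 185, 220, 260]
sites = [1, 2, 3, 4, 5, 6, 7, 8]

slope, intercept = linear_fit(days, sites)
r2 = r_squared(days, sites, slope, intercept)
print(f"slope={slope:.4f} sites/day, intercept={intercept:.2f}, R^2={r2:.4f}")
```

An R-squared near 1 only says the points hug the line closely; it says nothing by itself about sample size or statistical significance.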

The linear equation y = 0.0269 x - 1031.5 simply defines the predicted best-fit straight line as having a slope of 0.0269 sites/day (about 1 site every 37 days). There is a non-zero x-intercept (where the line hits the x-axis, that is, has a y-value of 0) because we started our data with the first Plone site on day zero. That means that zero Plone sites is somewhere to the left of the origin.
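The arithmetic behind those two readings of the equation can be checked in a few lines. This is only a sketch using the rounded coefficients printed on the graph, so the results are approximate; the variable names are mine.

```python
# Coefficients as printed on the graph (rounded, so results are approximate).
SLOPE = 0.0269        # sites per day
INTERCEPT = -1031.5   # y-value of the line at x = 0

days_per_site = 1 / SLOPE            # reciprocal of the slope
x_intercept = -INTERCEPT / SLOPE     # x at which the line predicts zero sites

print(f"{days_per_site:.1f} days per new site")
print(f"line crosses y = 0 at x = {x_intercept:.0f}")
```

Because the slope is so small, the x-intercept is very sensitive to rounding in the printed coefficients.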

Tuesday, November 27, 2007

New Portals

Well, since I posted that graph of portal development since 2004, I've been pleasantly surprised to have some more Plone work drop out of the trees. I've set up a Foreign Currency Exchange Portal to let our outward-bound international travelers more easily link up with people who have unspent foreign currency from previous trips. Our Chemical Security folks need a new portal and the sensor group is meeting with us next week to discuss either an extension to the existing one or a separate one altogether for medium-range planning. Proposals are going in for a MENA (Middle East/North Africa) online community-building network and a follow-on to the WACSI workshop, this time in Aleppo. Other international agencies are looking favorably towards our capabilities.

It's difficult to put numbers on these kinds of success stories beyond simply counting portals against time as I did before. But there are significant differences in workflow customizations, security tweaks, volume of content, user/owner training, and archetype development for each of these. The bottom line is that our Corporate tool, SharePoint, does not provide the additional products, the ease of customization, or the webpage publishing capacities of Plone. We continue to see as much work as our team can handle. Before too long, we'll be hiring new staff to handle the increasing loads.

This sort of anecdotal evidence of CMS success may not have the rigor that I hoped for when I started this blog, but it certainly doesn't have me lying awake at night worrying about what projects we'll use to cover our time-charges next week.

Sunday, November 25, 2007

Measures of Effectiveness

Back in September I introduced Bullock's dissertation on measures of effectiveness. In it he argues that for an entity to reach a desired end-state, one must identify objectives that are described by values. These values will have attributes, which are in turn measurable.

I did not have to identify the exact Plone end-state, choosing instead to focus on two objectives: widely adopted and stable yet evolving. I believe these two objectives form a reasonable desired state for Plone, although the term "end-state" seems a little too final for something that we all hope is on-going.

The problem that Bullock recognized is that objectives and values are quite likely something that cannot be directly observed. We only have hard measurements for attributes of values. That's where his use of a Kalman filter is novel. A Kalman filter essentially uses observed attributes to predict unobservable values. It's a useful way of pooling apples and oranges.
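To give a feel for the idea, here is a minimal one-dimensional Kalman filter in Python. It is a textbook sketch, not Bullock's formulation: it tracks a single hidden value from noisy readings of one attribute, whereas pooling several different attributes would need a multi-dimensional version. The model, the noise variances, and the readings are all assumptions made up for illustration.

```python
def kalman_1d(observations, process_var, measurement_var,
              initial_estimate=0.0, initial_var=1.0):
    """Scalar Kalman filter for a hidden value assumed to drift slowly.

    Model (an assumption for this sketch):
        hidden[t]   = hidden[t-1] + process noise
        observed[t] = hidden[t]   + measurement noise
    """
    estimate, variance = initial_estimate, initial_var
    estimates = []
    for z in observations:
        # Predict: the hidden value persists, but our uncertainty grows.
        variance += process_var
        # Update: blend prediction and observation by their relative certainty.
        gain = variance / (variance + measurement_var)
        estimate += gain * (z - estimate)
        variance *= (1.0 - gain)
        estimates.append(estimate)
    return estimates


# Made-up noisy readings of one attribute scattered around a hidden level of 10.
readings = [9.2, 10.9, 10.3, 9.5, 10.4, 9.8, 10.6, 9.9]
smoothed = kalman_1d(readings, process_var=1e-4, measurement_var=0.5,
                     initial_estimate=readings[0])
print([round(v, 2) for v in smoothed])
```

The gain is the key design choice: a noisy measurement (large `measurement_var`) moves the estimate only a little, while a trusted one moves it a lot.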

Here are my slightly revised attributes, measurements as of 24 Nov. 2007, and the units.

Attribute	Observed	Actual Measure

Release Frequency	0.94	Years/major release
Bugs	704	Active tickets
Core mailing lists	1.875	Msgs/day
Support mailing lists	17.5	Msgs/day
Security vulnerabilities	4	Recent but resolved
Size of CDT	48	Oct & Nov 2007 from Core Dev forum
Involvement of CDT		__________
New features	95%	Percentage of CMS Matrix features
Downloads		__________
Installations		__________
Defectors		__________
Economic health of third-party companies		__________
Technical reviews	8.7	InfoWorld
High-profile installations	950	Plone.net/sites
Users of Plone portals		__________
Visitors to Plone portals		__________

As you can see, I still have some attributes to collect. As a proxy for development, use, and usage, I may look at measures of activity on Openia and Objectis. More to come.

More about Features

Yesterday I noted that Features were poorly defined in the InfoWorld CMS comparison. Their analysis was biased by a mysterious 25% weight on Features (and Ease-of-use) as well. Looking at the Features scores, it was Alfresco-10, everyone else-8. I also said that I doubt Alfresco is 100% feature-finished.

In trying to stay true to my "added rigor" position, I thought I'd cast about for a metric on Features. Most will agree with me that CMS Matrix has the most thorough listing, both of CMSs and of potential features. Out of 136 features listed (ignoring 9 items dealing with system requirements), Alfresco had a 'Yes' or a 'Free Add On' for 130. DotNetNuke had 108, Drupal 102, and Joomla 90. Plone had 129 of the 136 features. This translates into Features percentages of Alfresco-96%, DotNetNuke-79%, Drupal-75%, Joomla-66%, and Plone-95%.
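The percentages are easy to check from the counts. The sketch below just divides each count by 136 and also rescales the share to 0-10; the dictionary and variable names are my own.

```python
TOTAL_FEATURES = 136  # CMS Matrix features, excluding the 9 system-requirement items

# Features marked 'Yes' or 'Free Add On', as counted above.
feature_counts = {
    "Alfresco": 130,
    "DotNetNuke": 108,
    "Drupal": 102,
    "Joomla": 90,
    "Plone": 129,
}

# Share of features covered, rescaled to a 0-10 score with one decimal.
scores = {cms: round(count / TOTAL_FEATURES * 10, 1)
          for cms, count in feature_counts.items()}

for cms, count in feature_counts.items():
    print(f"{cms:<11} {count / TOTAL_FEATURES:6.1%}  ->  {scores[cms]:.1f} out of 10")
```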

Converting these into 0-10 scores, we have 9.6, 7.9, 7.5, 6.6, and 9.5. Substituting these new, more objective values for InfoWorld's Features score and rerunning their scoring algorithm (with and without their weights) we have:

Alfresco - 9.1 - 8.9
DotNetNuke - 8.4 - 8.3
Drupal - 8.1 - 8.3
Plone - 9.0 - 8.9
Joomla - 8.0 - 8.3

Suddenly we've taken a data set with a runaway leader, Alfresco, and turned it into a two-horse race. It also tightens up the grouping among the others.

That said, who cares if Plone has 129 features if the one critical feature you require is missing or poorly implemented as a free add-on? Base your decision on your requirements, your IT environment, your staffing strengths and weaknesses, and the job you need done.

Saturday, November 24, 2007

The Weighting Game

Spent the morning doing laundry and looking back at the recent InfoWorld CMS ratings. These ratings illustrate a couple of dangers and a couple of best practices.

My first points have nothing to do with InfoWorld, but rather with what people do with review data.

On the down side, Matt Asay picked up the ratings and declared, "The winner? Alfresco, and by a significant margin (over Plone, Drupal, DotNetNuke)." "Significance" is not a term to be lightly tossed around when dealing with statistics. It is usually accompanied by a significance level (often 5%, but occasionally "highly significant" at 1%) and denotes a strict statistical formula for distinguishing hypotheses. The InfoWorld data is not designed to provide significance sensu stricto and one is left to imagine what the difference of 8.6 vs 9.2 means. (Even then it's actually 9.15, but they rounded up.)
Also, Matt forgot to mention that Joomla was scored in the survey and came in third.
On the positive side, Matt does full disclosure--he's VP of Business Development for Americas for Alfresco.

However, InfoWorld doesn't get a free statistical ride today.

On the positive side, they provide a link to a methodology page and make an effort to justify their results. This is all too rare and should be emulated.
On the negative side, they don't explain their selection of categories, even though they state that some combination of the listed categories will be used. The absence of Availability is understandable (these are all open-source), but Performance, Reliability, Setup, and Support should have been addressed.
Also, their methodology does not describe what is considered under the Feature category. Many would say that Interoperability and Setup are features. Curiously, only Alfresco warranted a 10 for Features and I seriously doubt that anyone, especially the developers at Alfresco, would claim that their feature set is 100% complete.
They also never explain how they arrived at an 86-80-70-60-50 grading curve when one expects 90-80-70-etc. On top of that they use rating names to bin results, thus disguising the numerical results (still my favorite complaint against school grading of A-B-C-D-F).
Finally, InfoWorld never explains the rationale behind their weighting of categories (25-25-15-15-10-10). If one doesn't weight scores (or uses a uniform 16.7% across 6 categories), Alfresco scores 9.0 and Plone comes in at 8.7, which puts them both in the InfoWorld "Excellent" bin. The weighting clearly doubles the Alfresco "gap," making it appear a clear leader, and moves Plone just barely into the "Very Good" rating.

Armed with this critique we can play all sorts of statistical games. (Remember that Mark Twain said that there are lies, damned lies, and statistics.)

Using equal weights, applying a 90-80-70 scale, and giving Alfresco a more realistic 9.5 for Features, we find that they then are only "Very Good."
Flipping the weights over (10-10-15-15-25-25) puts Alfresco and Plone in a dead heat (8.85 vs 8.75).
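These games are easy to replay with a small weighted-average function. One loud caveat: the six category scores below are hypothetical. They are not InfoWorld's published category scores; I picked them only so that Features matches the 10 and 8 quoted above and the totals land on the figures in this post. The function and variable names are my own.

```python
def weighted_score(scores, weights):
    """Weighted average of category scores; weights need not sum to 1."""
    if len(scores) != len(weights):
        raise ValueError("scores and weights must have the same length")
    return sum(s * w for s, w in zip(scores, weights)) / sum(weights)


INFOWORLD_WEIGHTS = [25, 25, 15, 15, 10, 10]   # as quoted in this post
EQUAL_WEIGHTS = [1] * 6                        # uniform 16.7% each
FLIPPED_WEIGHTS = INFOWORLD_WEIGHTS[::-1]      # 10-10-15-15-25-25

# HYPOTHETICAL category scores, in weight order (first entry is Features).
hypothetical_alfresco = [10, 9, 9, 9, 8, 9]
hypothetical_plone = [8, 9, 9, 8, 9, 9]

for label, weights in [("InfoWorld", INFOWORLD_WEIGHTS),
                       ("equal", EQUAL_WEIGHTS),
                       ("flipped", FLIPPED_WEIGHTS)]:
    a = weighted_score(hypothetical_alfresco, weights)
    p = weighted_score(hypothetical_plone, weights)
    print(f"{label:<9} Alfresco={a:.2f}  Plone={p:.2f}  gap={a - p:+.2f}")
```

The point of the exercise is that the same six numbers yield three different gaps depending only on the choice of weights.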

Another trick is to play with graphs. Here's the InfoWorld data done as a default Excel bar chart. Alfresco looks far out in front.

But here's a more accurate bar chart with a properly scaled y-axis. What is significant now?

I'd like to wrap up this long posting with a tip o' the hat to the comments by Amy and Bryan at the bottom of the CMS Report posting on the InfoWorld article. Amy from OpenSource.org correctly raises the point that some CMS reviews seem to take a perverse joy in pitting one open-source CMS against another. Bryan rejoins that reviews are both popular and useful as long as everyone stays well behaved.

Here on PloneMetrics, I am an unabashed Plonista. But I am also trying to look at the world with a little more rigor. Tomorrow I'll post the latest on my work to fill in the matrix à la Bullock's method. Stay tuna'd.

Wednesday, November 21, 2007

Plone at Sandia National Laboratories

Here's an interesting bit I generated today ahead of an internal review meeting. It's a plot of the date of creation of all our Plone sites here in Sandia's Cooperative International Programs. As you can see, it's very much a linear trend. We can expect to need to build seven new portals in the remainder of the fiscal year. We should hit 40 sites by December 2008.
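A forecast like this can be sanity-checked from the slope alone. The sketch below assumes the fitted slope of 0.0269 sites/day reported in the 29 November post and assumes a fiscal year ending 30 September 2008 (the US federal convention); the helper names are mine, and the rounded slope makes the results approximate.

```python
from datetime import date, timedelta

SLOPE = 0.0269  # sites per day, from the fitted line (rounded)


def days_to_add(new_sites, slope=SLOPE):
    """Days the linear trend needs to add the given number of sites."""
    return new_sites / slope


def expected_new_sites(start, end, slope=SLOPE):
    """Sites the linear trend adds between two dates."""
    return (end - start).days * slope


today = date(2007, 11, 21)             # date of this post
fiscal_year_end = date(2008, 9, 30)    # assumption: US federal fiscal year

days_for_seven = days_to_add(7)
seventh_site_date = today + timedelta(days=round(days_for_seven))
by_year_end = expected_new_sites(today, fiscal_year_end)

print(f"7 new sites take about {days_for_seven:.0f} days, around {seventh_site_date}")
print(f"trend adds about {by_year_end:.1f} sites by {fiscal_year_end}")
```

Under these assumptions the trend delivers the seventh new portal a little before the fiscal year closes, so seven reads as a slightly conservative count.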

Tuesday, November 20, 2007

Something Old, Something New

In my digital dumpster diving this evening I turned up Brad Bollenbach's ample introduction at ONLamp.com from Sept. 2004. Although this means he was probably running 2.0.5, his remarks stand the test of time. Alas, I cannot locate any of Brad's follow-up articles. Anyone care to recast this kind of piece into Plone 3.0?

And while I'm looking through O'Reilly.com, I note the O'Reilly Open Source Convention 2008 is scheduled for July 21-25, 2008 in Portland, Oregon (good choice). No specifics available yet, but watch that link.

Also on the radar screen for conferences of a sort is Plone Foundation's Strategic Planning Summit 2008. Scheduled for 8-10 Feb. 2008 at the Googleplex, it shows that Limi is doing a great job of juggling his job at Google with his Plone world.

Yet another one to watch out for will be the Plone Symposium East -- "Rally in the Valley", a North American Plone conference hosted at Penn State. Scheduled for 10-14 March, this bumps right up against PyCon, but the organizers are looking at ways of minimizing collisions between the two. Sprints may start as early as 8 March to frontload the schedule and let people scoot to PyCon immediately after the symposium.

Another one to keep an eye on will be next summer's NA Plone Symposium in New Orleans, 4-6 June. Enfold will be hosting it and, as always, it should be a good one. Details forthcoming.

The Plonistas down at the City of Albuquerque have often asked me about what ABQ could be doing in the way of hosting a sprint or a symposium. With the 2008 conference calendar filling up, we need to start thinking about next fall. Maybe a "Hot Air Sprint" ahead of the annual Balloon Fiesta? I'll have to ask the gang about this at December's NM Plone Users Group meeting.

Finally, I'm pleased to see the results from the Federal Open Source Referendum Study. Tipping factors mentioned in the report are "Organizational reluctance to change the status quo" and "Lack of structured technical support." One of the big drivers for government use of OS is data center consolidation. Multi-level security capabilities also figure big in the OS decision process for gov't. Intelligence agencies seem to be taking a lead (look at cia.gov for a Plone public face). Wish I could get my hands on the raw data to see where DOE and NNSA are in the survey.

Saturday, November 17, 2007

Digital Arts CMS Review

I took the way-back machine six weeks into the past and found Digital Arts' piece on CMS. Right off the bat I was pleased with the author's rational comments about basing your decision on more than just functionality. User community, frequency of updates, and professional support all should figure highly in your evaluation.

I was upset that the author somehow thinks Plone only supports MySQL. In fact the ZODB is the core database and one that I find increasingly makes an RDBMS unnecessary for many web apps. Plone (and no doubt many other CMSs) easily interfaces with the Microsoft database world via ODBC adapters. Various adapters written in Python have been around since long before Plone.

The author of the Digital Arts review either is implying that Alfresco wizards are superior to UML-driven Archetypes or is unaware of ArchGenXML (and uml.joelburton.com). I find Plone's use of industry-standard modeling tools a great advantage. Now if I could only reverse engineer the UML from the Archetype...

Monday, November 12, 2007

Back from Amman

Dragged back in from Amman last Friday around 9:30 PM. The close-out Thursday and the fly-home Friday are one big blur.

Got started with the WACSI Plone training Thursday morning and that continued into close-out meetings afterwards with Gen. Shiyyab and the rest. Next came the final luncheon back at the Radisson and finally about 90 minutes to get packed. Then back to the Marriott for a celebratory dinner with Amir. That left a couple hours to get the feet up, shower and shave one last time, and check out around 11:00 PM.

The flight to CDG was on an old 737 (uncomfortable seats, crowded, poorly ventilated). Had 6 hours at the airport before the trans-Atlantic leg, which was much more comfortable. Zipped through Atlanta and home 3 hrs later. All told, 44 hours up and on the go. No wonder I feel more than just the jet-lag.

Now what has all that to do with Plone? Actually, very little, but all the hours of work and hard travel were worth it. Here are my anecdotal metrics on the success of the workshop:

First off, the software was very well received. Some NGOs were without a web host and the WACSI portal is the perfect place for them. Even our translators were adapting quickly to the Plone interface, often not needing us to answer questions when they'd already heard the answer.

Secondly, we got a round of applause for the system when the workshop was over. I've not heard of that for Drupal or SharePoint.

Thirdly, the participants from UN-DP POGAR (Program on Governance in the Arab Region) were pleased enough that they stated during the close-out that they would use Plone and make any enhancements and customizations available to WACSI (and by extension, to the entire Plone community).

Finally, there's the site activity since we left. Already there have been postings in the discussion area (both in English and Arabic), new material uploaded by participants without our coaching, and a fair amount of e-mail traffic. All in all, a very positive Plone experience.

Now the search is on for Plone trainers and consultants in the Middle East...

Wednesday, November 7, 2007

Plone for WACSI

Our "Web Access for Civil Society Initiatives" Plone workshop continues. The first day of technical training was counted very successful, especially as we had very little lecture and lots of hands-on work. It was a good load test both for the UNM server and the RSS training facility.

No significant glitches, although we had some odd moments with Unicode errors for cut-and-paste when using Arabic. Seems that the cut from 'actions' failed although it worked fine in the contents menu. Hmm??

We'll be looking for Plone consultants in the MENA region for training and development, since some NGOs are hosting their primary web presence on the UNM WACSI server.

Saturday, November 3, 2007

Plone Metrics in Amman

After the usual airline travails, I'm in Amman, Jordan. Not too bad for 8 timezones (soon to be 9 when the US kicks out of DST early tomorrow--it's still 7:00 PM Sat. in ABQ as I write this). The WACSI workshop held an informal mixer this evening where we got to meet our regional participants. Of course, the CMC-Amman staff were there. They've done a terrific job getting all the logistics in place, translating material, and helping out with direct connections to hard-to-reach people.

We've got attendance from a dozen Jordanians, over a dozen Syrians, two from Saudi Arabia, and a handful from Egypt. Some are policy heads of various NGOs; others are technical representatives. Whatever their background, they'll have three days of interaction and discussion about how best to form a robust online community to support the several missions of their organizations. Then we'll have two days of Plone how-to, which should take them to the point that they are able to make good use of our Plone framework.

The idea, of course, is to get us webmasters and admins out of the way. The user community should be able to organize and present the information they wish to disseminate as they see fit. And if we can get there in five days, I'll be very pleased.
