"Count what is countable. Measure what is measurable. What is not measurable, make measurable." -- Galileo

Showing posts with label Google trends. Show all posts

Sunday, January 12, 2014

Trends and Google Trends

I thought I'd take a shot at keeping my New Solar Orbit Resolution and get out the January edition today.  With this item I'm one twelfth of the way there.  I thought I'd discuss a curve I see when I look at the term "Plone" through the lens of Google Trends.
It's positively or right skewed, meaning that the mode (the high point) is far to the left of the mean. One might take this to be an ominous sign of a software product sliding gently towards oblivion.  However, this is not the case.  For example, the Plone curve has almost the exact same shape as that of another term.  Go ahead, take a guess before you read the next paragraph. 

The above is the Google Trend curve for Apache2.  No one is going to claim that Apache is sliding into oblivion.

What's interesting is that Google Trends now makes discovery of these correlations very simple.  The GT correlate page lets you enter a term and the system provides an ordered list of similar search terms.  Typically they have correlation coefficients above 95%, often close to 99%.  Enter "Plone" and this is one of the results:
This gives us a window into a number of other comparisons.  Here are Liferay and Drupal.




Have they peaked, and are they now entering their golden years as faded has-been CMSs?

Here's WordPress compared with Google Analytics.  Even with the huge spike in 2006 for the rollout of GA, the correlation coefficient is on the order of 98%, something statisticians would give their eye teeth to see in experimental results. 
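
For readers who want to see what such a coefficient measures, here is a minimal sketch of the Pearson correlation between two series. The numbers are made up for illustration and are not real Google Trends data; the function name `pearson` is my own, and I am assuming (not asserting) that this is the kind of coefficient the correlate page reports.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of co-deviations and sums of squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical search indices (0-100) for two terms; NOT real data.
term_a = [100, 92, 80, 71, 60, 55, 48, 41, 37, 33]
term_b = [98, 95, 78, 70, 63, 52, 50, 40, 35, 34]

r = pearson(term_a, term_b)
print(f"r = {r:.3f}")
```

Two curves that decline together at roughly the same pace score close to 1 even though the underlying topics have nothing to do with each other, which is worth keeping in mind when reading these comparisons.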
 
If one squints, it's possible to see that WP and GA have plateaued and may even have begun to decline in their Google Trends scores.  Quick, sell your WordPress stock!  ;-)

Back to our GT graphs, here's the plot for Unix:

The initial upswing and peak almost certainly took place years if not decades ago and are simply truncated by the lack of Google history.  Is this graph showing us Unix dying with a whimper instead of a bang?

What is going on here?  All these Google Trends are showing a positively skewed distribution.  Several probability distributions exhibit these characteristics.  Candidates include the gamma distribution (or perhaps its special case, the Erlang distribution) and the log-normal distribution.  Log-normal distributions are maximum entropy probability distributions (for a fixed mean and variance of the logarithm), and that hints at what the underlying phenomenon is and why Unix, Apache, and Plone aren't soon going extinct.
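
To make the skew concrete, here is a small sketch using the standard closed-form expressions for the mode, median, and mean of a log-normal distribution. The parameter values `mu` and `sigma` are arbitrary illustrations, not values fitted to any Google Trends curve.

```python
import math

def lognormal_summary(mu, sigma):
    """Return (mode, median, mean) of a log-normal with parameters mu, sigma."""
    mode = math.exp(mu - sigma ** 2)
    median = math.exp(mu)
    mean = math.exp(mu + sigma ** 2 / 2)
    return mode, median, mean

def lognormal_pdf(x, mu, sigma):
    """Density of the log-normal distribution at x > 0."""
    z = (math.log(x) - mu) / sigma
    return math.exp(-0.5 * z * z) / (x * sigma * math.sqrt(2 * math.pi))

# Illustrative parameters only; not fitted to any data.
mu, sigma = 3.0, 0.8
mode, median, mean = lognormal_summary(mu, sigma)
print(f"mode={mode:.1f}  median={median:.1f}  mean={mean:.1f}")
```

For any positive `sigma` the mode sits to the left of the median, which sits to the left of the mean: an early peak followed by a long right tail.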

Entropy when applied to Web phenomena is a measure of "buzz."  But here we are looking at the use of search terms, not website visits or software downloads or installations.  Once someone has found plone.org, there's rarely a need to search for it again.  Over time, everyone who is interested in a topic discovers that topic's key online resources.  Thereafter, one rarely needs to repeat a search.  The result is steadily diminishing search volume, which doesn't mean that interest is waning.  Rather, it means that the interested population is being fully reached.  The steady but low-volume long tail of the GT graphs probably represents the entry rate of newcomers to a particular community as they search and discover the key resources for that domain.  Everyone else has long ago bookmarked plone.org.
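
The argument above can be sketched as a toy model. Everything here is an assumption made for illustration: the function name, the pool size, the newcomer rate, and the idea that a fixed fraction of the not-yet-informed pool searches each month. It models only the decline and the tail, not the initial rise.

```python
def search_volume(pool, newcomers_per_month, discovery_rate, months):
    """Toy model: people search until they have found the key resources.

    pool                -- people who have not yet found the site
    newcomers_per_month -- steady inflow of newly interested people
    discovery_rate      -- fraction of the pool that searches (and then
                           bookmarks the site) in a given month
    """
    volumes = []
    for _ in range(months):
        pool += newcomers_per_month
        searches = pool * discovery_rate
        volumes.append(searches)
        pool -= searches  # these people no longer need to search
    return volumes

# Hypothetical numbers, chosen only to show the shape of the curve.
volumes = search_volume(pool=10_000, newcomers_per_month=50,
                        discovery_rate=0.10, months=120)
print(f"first month: {volumes[0]:.0f}, last month: {volumes[-1]:.0f}")
```

In this sketch the monthly volume falls steadily and then levels off at the newcomer rate, even though the number of people who know about the site never shrinks.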


Saturday, March 7, 2009

World Domination and CMS Evolution

I've been noticing that Drupal often has been sending out the message of "world domination" and that has made headlines over at CMSWire this week. I'm not sure what world domination means in terms of CMS or OSS, but it sounds like driving all the competitors out of business or into insignificant niche markets. That sounds like attempting to be all things to all people.

Let's talk ecology and evolution for a moment. There are indeed very good generalist organisms out there: think Homo sapiens and Blattella germanica. But if you actually add up biomass or count individuals or consider many other indicators of breadth of ecological success, you'll find that the winner is probably some species of dinoflagellate, a single-celled phytoplankton that numbers in the gazillions throughout the oceans of the world. You have to be very near the absolute bottom of the food chain to be successful by that metric. In software, these are operating systems and low-level communication protocols.

The only definition of "world domination" that fits a highly evolved system is the ability to actually critically impact the global ecosystem, much as mankind is currently doing in terms of climate change and species extinctions.

Can any CMS claim in its wildest dreams to actually impact the ICT "ecosystem" in such a fundamental way? I would argue that only operating systems, transfer protocols, and maybe some database management systems have this level of world domination. World domination for a CMS must be limited to running the other CMSs off the field.

Back to ecology and evolution and being all things to all environments (or all CMSs to all people). There are basically two ways to look at environmental penetration, and they have to do with r and K selection. r-selected species are good at maximizing their intrinsic rate of reproduction. These are invasive species. Think Taraxacum, the dandelion.

K-selected species are those that flourish near their carrying capacity, the limit at which needed resources are scarce. They have low intrinsic rates of reproduction but typically long life spans. These are community dominant organisms. Think Quercus, the oak tree.

r-selected organisms thrive in disturbed environments, the way dandelions spring up in the bare patch your dog dug up in your lawn. K-selected species are slow-growing and difficult to establish, but once they have a foothold, they hang on while hundreds of generations of dandelions come and go at their feet.
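
The letters r and K come from the logistic growth equation, dN/dt = rN(1 - N/K), where r is the intrinsic growth rate and K is the carrying capacity. Here is a minimal sketch that integrates it with simple Euler steps. The parameter values are invented for illustration and are not measurements of any real species (or CMS).

```python
def logistic_growth(n0, r, k, steps, dt=0.1):
    """Euler integration of dN/dt = r * N * (1 - N / K)."""
    n = n0
    history = [n]
    for _ in range(steps):
        n += r * n * (1 - n / k) * dt
        history.append(n)
    return history

# Illustrative parameters only: same carrying capacity, different r.
dandelion = logistic_growth(n0=1.0, r=2.0, k=1000.0, steps=200)  # high r
oak = logistic_growth(n0=1.0, r=0.1, k=1000.0, steps=200)        # low r

print(f"dandelion after 200 steps: {dandelion[-1]:.1f}")
print(f"oak after 200 steps:       {oak[-1]:.1f}")
```

Over the same stretch of time the high-r population fills the available space while the low-r population has barely started, which is the dandelion's advantage on freshly disturbed ground.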

So what is Drupal, a dandelion or an oak?

I'd suggest that spreadsheets and word processors are the trees of the temperate hardwood forest that is the 2009 software environment. But like a forest, it's not a monoculture, but a complex mix of oak, maple, yew, and a few dozen other aspect dominants. There is no world dominant deciduous hardwood species.

So what's going on in the forest today? -- r selection.

Google shows the following trends for Plone, Drupal, Joomla, and WordPress.



Note that when Joomla took off or even when WordPress (not really a CMS IMHO) started its rise, there was no corresponding decrease in Drupal and Plone. This is evidence that the CMS environment is not at carrying capacity. In short, CMS development and deployment isn't a zero-sum game. We're nowhere near the carrying capacity of CMSs, as is also evident from the hundreds of CMSs over at CMS Matrix.

What explains the slow decline in Google searches for Plone while everyone else is increasing like dandelions in your backyard? Matt Hamilton observed that searches within Plone.org are handled by Plone itself, while Joomla et al. use Google's site search. That automatically biases the results, because the Plone numbers are missing tons of internal searches that Google never tracks.

As Alexander Limi recently observed on the Plone msg boards, if these stats are all there is, we (Plone and Drupal) are already dead--Joomla and WordPress are far ahead. I contend that WordPress is not powerful enough to be a true CMS but can pass as a low-end one for those lacking more complex workflow and security requirements. And if I were looking for an enterprise CMS, events of the Mambo-Joomla rift back in 2005 would give me night terrors.

That leaves us back at Drupal and "world domination," trying to be all things to all people. Can one software stack be the perfect blog, wiki, community plumbing, document archive, and web publishing solution? Can one organism be the perfect cryptogam, fungus, gymnosperm, ungulate, and carnivore? There's a reason for diversity--for organisms it's the complex interactions among other organisms and the environment. For software, there are equally complex interactions and a rapidly evolving environment. World domination just won't happen for any one framework or application.

Now don't ask me about convergent evolution and CMS...