Monday, May 20, 2013

Scalability Redefined


Scalability matters. Obviously. But scalability seems to mean different things to different people. Most commonly, scalability is equated with the number of users - people - accessing a particular system. That's a good start, but we of course also have to consider the degree of concurrency. There is a difference between a solution that 1,000 people use once or twice a week and a solution that 1,000 people pound upon continuously.
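To put rough numbers on that difference, here is a back-of-the-envelope sketch. The session lengths and the 40-hour working week are my own illustrative assumptions, not measurements from any real system, and the function name is made up for this example. It simply spreads the total session time evenly over the week to estimate how many users are active at the same moment.

```python
# Rough estimate of average concurrency for the same head count of 1,000 users.
# All inputs below are illustrative assumptions, not measured values.

WORK_WEEK_MINUTES = 5 * 8 * 60  # assumed active window: five 8-hour days


def average_concurrent_users(users, sessions_per_week, session_minutes,
                             active_minutes_per_week=WORK_WEEK_MINUTES):
    """Average number of users active at once, assuming sessions are
    spread evenly across the active window."""
    total_session_minutes = users * sessions_per_week * session_minutes
    return total_session_minutes / active_minutes_per_week


# 1,000 people who use the system once or twice a week (assumed 1.5 sessions
# of 10 minutes each).
occasional = average_concurrent_users(1000, 1.5, 10)

# 1,000 people who pound on it continuously (assumed one 8-hour session on
# each of the five working days).
heavy = average_concurrent_users(1000, 5, 8 * 60)

print(f"occasional use: about {occasional:.2f} concurrent users")
print(f"continuous use: about {heavy:.0f} concurrent users")
```

Under these assumptions the same 1,000 named users translate into roughly 6 concurrent users in one case and all 1,000 in the other, which is why the head count alone says little about the load.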

In large deployments, there is also the question of how many systems are actually being used. Early in my career, I spent several years working at Novell. It was the heyday before Windows NT and the company enjoyed a commanding market share. Large and small companies used Novell networks. But we knew that an average (software) server only ever had about 100-200 users working on it. Sure, there were some companies with 100,000 employees that used Novell NetWare. Those were big deals, big deployments, and examples of high scalability. Right? Well, not really.

Those large companies were in reality running hundreds of separate Novell server instances and each one of them would still get just a couple hundred users. I see a similar pattern with today's deployments of Microsoft SharePoint and, to some extent, also with Exchange. It is quite a different kind of scalability when 100,000 users all share a single instance of an OpenText repository - even if that repository runs physically across multiple hardware servers.

There are more dimensions to scalability than just the number of users, though. The number of operations or transactions comes to mind - from MIPS to the number of credit card purchases. How about the number of objects under management? Examples could be the number of documents, number of customers, number of transactions, number of contracts, number of relationships, number of suppliers, etc.

To be scalable, enterprise software must do more than just hold the required number of objects in a database or repository. It also has to let users view and manipulate the data efficiently. If your web-based user interface shows the objects in blocks of 20 while you have millions of objects to sort through, your application might not be very scalable! By the way, having a single data container capable of holding millions of objects is another dimension of scalability.
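The blocks-of-20 problem is easy to quantify. The sketch below is only an illustration: the object count of five million and the two seconds per page are assumptions I picked for the example, and the function name is hypothetical.

```python
import math

# How many pages does a user face when a UI shows objects in fixed-size blocks?
# The object count and seconds-per-page figures are illustrative assumptions.


def pages_needed(object_count, page_size=20):
    """Number of pages required to show every object, page_size at a time."""
    return math.ceil(object_count / page_size)


objects = 5_000_000        # assumed repository size
seconds_per_page = 2       # assumed time to load and skim one page

pages = pages_needed(objects)
hours_to_browse = pages * seconds_per_page / 3600

print(f"{pages:,} pages of 20 objects")
print(f"about {hours_to_browse:.0f} hours to click through them all")
```

With these numbers the user is looking at 250,000 pages, so an interface that scales needs search, filtering, or sorting on the server side rather than paging alone.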

There is yet another factor that needs to be added to the definition of scalability: the metadata. This is the data that describes your data. Without metadata, your data only contains what is explicitly stated in it. Sometimes that may be useful by itself, but in most cases we want to add lots of enterprise-related context. We want to add information about people and teams who work with the data. Information about the organizational structure and approval hierarchies. Information about projects to which the data belongs. The deadlines, the cost centers, the retention requirements, etc. The metadata can often be richer (= bigger) than the data itself. But it is absolutely critical in the enterprise and your application needs to scale to accommodate it.

There are many factors that define scalability, and looking just at the number of users can often be insufficient or even misleading.
