Wednesday, June 30, 2010

Pictures from the G20 Summit

My post on the G20 Summit and its Technology Infrastructure was very popular and so I have uploaded an additional batch of pictures. Trying out the Flickr code was part of the fun!

Created with Admarket's flickrSLiDR.

I took some of the pictures myself, some were taken by others with my camera, and a few are courtesy of Open Text. The quality varies...

Tuesday, June 29, 2010

G20 Summit and its Technology Infrastructure

This weekend, I had the opportunity to take part in the Group of 20 Summit in Toronto. I should say right up front that I wasn’t rubbing shoulders with the heads of state. In fact, I came nowhere near the actual Summit delegates. But neither did most of the thousands of journalists covering the Summit – they were all relegated to the international media center, and that’s where I spent an interesting couple of days. Yes, this was the site of the “Fake Lake” – a feature that brought the atmosphere of the G8 Summit at Muskoka Lake to the G20 journalists and that was heavily criticized by the press prior to the Summit for its alleged lavishness. It wasn’t lavish and it was actually really well done, if you ask me.

Since my blog is about technology, I should perhaps explain what I was doing at the Summit. As it turns out, Open Text was selected to provide the technology infrastructure for the Summit. As members of the Canadian Digital Media Network, we’ve built a site that is basically a virtual rendering of the media center with the Fake Lake. This site is built using digital experience management (DEM) technology, which manifests our vision for an engaging user interface. Just check it out. I am sure I will write more about digital experience management and its syndication of tethered content in the near future.

The next part of the infrastructure was the secure social media site for all the journalists and other attendees, who were able to communicate with each other as well as with dozens of librarians at multiple universities who were happy to answer any question. And then there is the high-security social media environment used by the actual delegates. That was a very interesting experience – having to go through a series of security evaluations while keeping a high level of confidentiality about the project. As the actual work is done prior to the Summit, which is basically a photo-op, the application had been in use for months leading up to the Summit. And the entire time I wasn’t allowed to talk about it for security reasons, which is rather torturous for a marketer. In fact, it wasn’t by my choice that I didn’t provide any live Twitter updates from the Summit.

Well, it was an interesting weekend – we were showing our software to hundreds of journalists, including many on-camera demonstrations. This is the first time a highly secure social media application has been used for such a multilateral event involving senior diplomats from many countries, who used the environment actively prior to the event. And it was cool to show it to the journalists on an iPad and touch-screen monitors. And I even experienced up close and personal some of the protests and riots in downtown Toronto, which was not cool at all.

1. Top: Me at the Fake Lake in a moment of pure vanity
2. Middle: Honourable Peter Van Loan, Minister of International Trade of Canada, getting a demo from Tom Jenkins, Open Text's Chairman and Chief Strategy Officer
3. Bottom: Me with an iPad showing Open Text Everywhere accessing the G20 social community

Tuesday, June 22, 2010

Man versus Machine

A recent article in Wired Magazine titled “Clive Thompson on the Cyborg Advantage” described the result of a “freestyle” chess tournament in which teams of players competed with the help of any computerized aid. What was surprising was that the winner was not the team with a chess grandmaster or the team with the most powerful supercomputer. Instead, the team that won was the one best able to combine the power of the machine with the human way of thinking.

Years ago, I dabbled in the field of Artificial Intelligence (AI), which was the hype of the time. AI failed for a variety of reasons. Perhaps it was way ahead of its time, but perhaps it attempted to delegate too much decision power to the machines while human expertise and intuition have always proven superior in the end. And so AI vanished and I moved on to other things like content management.

The problem AI attempted to solve is more than relevant today. Faced with a staggering overabundance of information, we are trying to find ways to use computers to help us make sense of all the data. The first step was making the information retrievable via search. But as soon as we had halfway accomplished that task, we came to realize that this is not the solution. Virtually every search query produces too many results and the poor humans have to employ their expertise and intuition yet again to weed through the millions of hits.

The next step is to employ machines to automatically analyze and classify the content to reduce the volume of information humans have to deal with. But while such analytics and classification technologies have been around for years, they are still in their infancy. Outside specific applications that deal with limited content volume and scope, we don’t trust the machines yet. Usually, the final decision is up to the humans – just think of the e-Discovery reference model where we find all relevant content and then filter it to reduce the manual review cost. The goal today is to cull the volume of data that humans have to deal with. And that might remain the right approach for some time to come.

The right line of attack might be just like in the freestyle chess match. The solution is to facilitate the best possible interaction between machines and humans. That needs to be reflected in the software architecture and its user interfaces, but perhaps also in the skills required from us humans. In the near future, it might not be the smartest people who will be most effective but rather those who are best able to take advantage of the machines to augment their decision-making ability.

Tuesday, June 15, 2010

Redefining Scalability

Scalability matters to enterprise software. Enterprises usually have much more complex requirements than consumers or small businesses, so the expectations of enterprise software are different, and scalability is always right up there with security, reliability, and customizability. The idea of scalability is that the deployment needs to be able to grow as the enterprise grows. Surprisingly, that is not always the case for enterprise software, as scalability is sometimes an afterthought for the vendor.

Scalability has traditionally been defined as the ability to grow the deployment in size without any degradation in performance. The growth could be measured as:
  • number of users,
  • number of transactions, or
  • volume of data.
The growth in one or more of these metrics can put tremendous pressure on system performance, and only highly scalable systems can cope with such growth.

That said, scalability has to consider additional factors besides the performance metrics. No, I am not suggesting that performance doesn’t matter. Au contraire! With trends such as mobility, process automation, and relentless data growth, performance matters a lot. But other things matter too:

1. Distributed environments
Enterprises often operate in multiple locations, and working in a geographically distributed environment requires the ability to keep data close to its users for performance while maintaining data consistency and integrity. That’s easier said than done, and only scalable systems can do it well enough to support tens or even hundreds of locations.

2. Leverage
Deploying enterprise software is inherently expensive due to its complexity, integration with other enterprise systems, customization, and the above-mentioned performance requirements. Because of this cost, scalability also means benefiting from economies of scale. Basically, the deployment becomes more cost-effective when it can be used for more applications. Similar classes of applications can share a significant portion of their infrastructure across multiple use cases. This kind of leverage is a characteristic of a scalable system.

3. Downward Scaling
While most scalability concerns deal with performance growth, modern applications also need to be able to scale down easily when the performance is not required. High volumes of users, transactions, or data usually come in peaks, and provisioning a system for the peak is exceedingly expensive. The only economical solution is to have multiple systems share resources via a grid or virtual environment. Any software that hogs resources and does not share them well with other applications is simply not scalable.

4. Customization
No two enterprises are the same. There may be similarities across industries or geographies, but enterprises are like living organisms and every single one of them is different. Therefore, any software deployment – on premises or in the cloud – requires some degree of customization that adapts the software to the needs of the organization. This customization may include adaptation of the user experience, definition of the data model, integration with other systems, etc. As the organization grows in any dimension, the amount of customization often grows as well. The degree to which any software makes such customization possible and easy is also an aspect of scalability.
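
To make the downward scaling point (item 3 above) a bit more concrete, here is a small back-of-the-envelope sketch in Python. The two applications and their hourly load figures are entirely made up for illustration. It simply compares the capacity needed when each application is provisioned for its own peak with the capacity needed when both share one pool.

```python
# Hypothetical hourly load, expressed as "servers needed", for two
# applications whose peaks fall at different times. The numbers are invented.
payroll = [2, 2, 2, 10, 2, 2]    # peaks at hour 3
reporting = [1, 8, 1, 1, 1, 1]   # peaks at hour 1


def dedicated_capacity(*workloads):
    """Servers needed if every application is sized for its own peak."""
    return sum(max(load) for load in workloads)


def shared_capacity(*workloads):
    """Servers needed if the applications share one pool of resources."""
    # zip(*workloads) groups the loads hour by hour; the pool only has
    # to cover the busiest combined hour.
    return max(sum(hour) for hour in zip(*workloads))


print(dedicated_capacity(payroll, reporting))  # 18
print(shared_capacity(payroll, reporting))     # 11
```

The saving only materializes if each application actually gives the resources back when its peak is over, which is the point of scaling down.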

Enterprise software needs to be scalable, but scalability is not just about performance.

Tuesday, June 8, 2010

Google Search and the Tip of the Iceberg

The Web is often considered the known content universe. Indeed, it appears that today, everything that could be digital has become digital and can be found on the Web – articles, images, music, movies, and games. Since every search query on Google returns millions of hits, surely all digital content in the world is available through Google search. After all, Google indexed and cached the entire World Wide Web, didn’t they?

Not so. Google only exposes content that wants to be found. It is content on the public Web that is not protected by any security. The vast majority of content, however, resides behind a login challenge that prevents Google from indexing it. In fact, the 2003 study “How Much Information?” conducted by the University of California, Berkeley, estimates that less than 1% of content is exposed on the public Web (“Surface Web”) while the majority resides in the “Deep Web”. While the Internet has evolved since 2003, the percentage of content not indexed by Google and other search engines remains huge.

Searching secure content is not trivial, which is why Google has only a modest presence in the enterprise. Restricted access applies not only to the content itself but also to the index and the search results. Not only should you be prevented from accessing a document you are not authorized to see, but you should also not be able to see a link with the document’s name in the search results. And so Google takes the easy way out by searching only the unprotected content. Search with security – an absolute must in the enterprise – is a much harder nut to crack.
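
The requirement that unauthorized documents must not even show up by name can be sketched in a few lines of Python. This is only a toy illustration under my own assumptions: the `Document` class, the `allowed_groups` field, and the `secure_search` function are hypothetical names, and a real enterprise search engine would apply the permissions against an index rather than scan every document.

```python
from dataclasses import dataclass, field


@dataclass
class Document:
    title: str
    text: str
    # Hypothetical access control list: groups allowed to see the document.
    allowed_groups: set = field(default_factory=set)


def secure_search(query, documents, user_groups):
    """Return the titles of matching documents the user may see."""
    query = query.lower()
    hits = []
    for doc in documents:
        # Check permissions first, so an unauthorized document never
        # reaches the result list, not even as a title.
        if not (doc.allowed_groups & user_groups):
            continue
        if query in doc.text.lower():
            hits.append(doc.title)
    return hits


docs = [
    Document("Cafeteria menu", "Lunch budget for the week", {"everyone"}),
    Document("Merger plan", "Confidential budget for the acquisition", {"executives"}),
]

print(secure_search("budget", docs, {"everyone"}))
print(secure_search("budget", docs, {"everyone", "executives"}))
```

Both documents match the query, but a regular employee only ever learns about the cafeteria menu. The existence of the merger plan stays hidden.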

There is another trick that makes Google’s job easier. It mostly searches content that actively wants to be found – content that has been optimized for search. There is an entire industry called Search Engine Optimization (SEO) which spends millions of dollars on making sure the content employs all kinds of tricks to ensure that Google can find it easily and rank it as relevant. That’s of course not the case for deep Web content, not to mention documents and other types of content in the enterprise. And PageRank, Google’s famed relevance algorithm, is based on hyperlinks between Web pages, and such links don’t really exist in the world of enterprise content.
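
To see why link-based ranking has little to work with in the enterprise, here is a simplified, textbook-style sketch of the PageRank idea in Python. It is not Google’s actual implementation; the damping factor of 0.85, the iteration count, and the two tiny example collections are assumptions chosen for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Toy PageRank; links maps each page to the list of pages it links to."""
    pages = list(links)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Every page gets a small baseline share of rank.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page, outgoing in links.items():
            if outgoing:
                # A page passes its rank on to the pages it links to.
                share = damping * rank[page] / len(outgoing)
                for target in outgoing:
                    new_rank[target] += share
            else:
                # A page without outgoing links spreads its rank evenly.
                for target in pages:
                    new_rank[target] += damping * rank[page] / n
        rank = new_rank
    return rank


# A tiny public website: the other pages link back to "home".
web = {"home": ["news", "about"], "news": ["home"], "about": ["home"]}
# Hypothetical enterprise documents: no hyperlinks between them at all.
enterprise = {"contract.doc": [], "memo.doc": [], "report.pdf": []}

print(pagerank(web))
print(pagerank(enterprise))
```

On the small website, the links single out the home page as the most important one. Without any links, every enterprise document ends up with the same score, so the algorithm says nothing about relevance.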

The public Web searchable via Google only represents the tip of the iceberg, and searching secure content on the Web or content in the enterprise is much more difficult.