Wednesday, June 30, 2010
Pictures from the G20 Summit
I took some of the pictures myself, some were taken by others with my camera, and a few are courtesy of Open Text. The quality varies...
Tuesday, June 29, 2010
G20 Summit and its Technology Infrastructure
Since my blog is about technology, I should perhaps explain what I was doing at the Summit. As it turns out, Open Text was selected to provide the technology infrastructure for the Summit. As members of the Canadian Digital Media

The next part of the infrastructure was the secure social media site for the journalists and other attendees, who could communicate with each other and with dozens of librarians at multiple universities standing by to answer any question. And then there was the high-security social media environment used by the actual delegates. That was a very interesting experience: going through a series of security evaluations while maintaining a high level of confidentiality.

Well, it was an interesting weekend. We showed our software to hundreds of journalists, including many on-camera demonstrations. This is the first time a highly secure social media application has been used for such a multilateral event; senior diplomats from many countries used the environment actively prior to the event. It was also cool to show it to the journalists on an iPad and on touch-screen monitors. And I experienced up close and personal some of the protests and riots in downtown Toronto, which was not cool at all.
Images:
1. Top: Me at the Fake Lake in a moment of pure vanity
2. Middle: Honourable Peter Van Loan, Minister of International Trade of Canada getting a demo from Tom Jenkins, Open Text's Chairman and Chief Strategy Officer
3. Bottom: Me with an iPad showing Open Text Everywhere accessing the G20 social community
Tuesday, June 22, 2010
Man versus Machine
A recent article in Wired Magazine titled “Clive Thompson on the Cyborg Advantage” described the results of a “freestyle” chess tournament in which teams of players competed with the help of any computerized aid they chose. Surprisingly, the winner was neither the team with a chess grandmaster nor the team with the most powerful supercomputer. Instead, the winning team was the one best able to combine the power of the machine with the human way of thinking.
Years ago, I dabbled in the field of Artificial Intelligence (AI), which was the hype of its day. AI failed for a variety of reasons. Perhaps it was way ahead of its time, but perhaps it also attempted to delegate too much decision power to the machines, while human expertise and intuition have always proven superior in the end. And so AI faded away and I moved on to other things, like content management.
The problem AI attempted to solve is more relevant today than ever. Faced with a staggering over-abundance of information, we are trying to find ways to use computers to help us make sense of all the data. The first step was making the information retrievable via search. But as soon as we had halfway accomplished that task, we came to realize that search alone is not the solution. Virtually every query produces too many results, and the poor humans have to employ their expertise and intuition yet again to weed out the millions of hits.
The next step is to employ machines to automatically analyze and classify the content to reduce the volume of information humans have to deal with. But while such analytics and classification technologies have been around for years, they are still in their infancy. Outside specific applications that deal with limited content volume and scope, we don’t trust the machines yet. Usually, the final decision is up to the humans – just think of the e-Discovery reference model where we find all relevant content and then filter it to reduce the manual review cost. The goal today is to cull the volume of data that humans have to deal with. And that might remain the right approach for some time to come.
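This culling step can be made concrete with a small sketch. The idea, assuming a hypothetical relevance scorer, is that the machine decides only the clear-cut cases and defers the ambiguous middle to human reviewers, which is exactly how the review volume gets reduced. All function and document names here are invented for illustration.

```python
def cull_for_review(documents, score, auto_keep=0.9, auto_discard=0.1):
    """Split documents into machine-decided and human-review buckets.

    `score` is any callable returning a relevance estimate in [0, 1].
    Only confident scores are decided automatically; the ambiguous
    middle goes to human reviewers, cutting manual review volume.
    """
    kept, discarded, needs_review = [], [], []
    for doc in documents:
        s = score(doc)
        if s >= auto_keep:
            kept.append(doc)           # machine is confident: relevant
        elif s <= auto_discard:
            discarded.append(doc)      # machine is confident: irrelevant
        else:
            needs_review.append(doc)   # ambiguous: defer to humans
    return kept, discarded, needs_review

# Toy scorer: fraction of query terms present in the document text.
def keyword_score(doc, terms=("contract", "merger")):
    words = doc.lower().split()
    return sum(t in words for t in terms) / len(terms)

docs = ["Merger contract draft", "Lunch menu", "Notes on the merger"]
kept, discarded, review = cull_for_review(docs, keyword_score)
```

A real classifier would replace the toy keyword scorer, but the division of labor between machine confidence and human judgment stays the same.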
The right line of attack might be just like in the freestyle chess match. The solution is to facilitate the best possible interaction between machines and humans. That needs to be reflected in the software architecture and its user interfaces, but perhaps also in the skills required of us humans. In the near future, it might not be the smartest people who are most effective, but rather those best able to take advantage of machines to augment their decision-making ability.
Tuesday, June 15, 2010
Redefining Scalability
Scalability has been traditionally defined as the ability to grow the deployment in size without any degradation in performance. The growth could be measured as:
- number of users,
- amount of transactions, or
- volume of data.
That said, scalability has to consider additional factors besides the performance metrics. No, I am not suggesting that performance doesn’t matter. Au contraire! With trends such as mobility, process automation, and relentless data growth, performance matters a lot. But other things matter too:
1. Distributed environments
Enterprises often operate in multiple locations, and working in a geographically distributed environment requires preserving user proximity to data for performance while maintaining data consistency and integrity. That's easier said than done, and only truly scalable systems can do it well enough to support tens or even hundreds of locations.
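To make the proximity-versus-consistency tension concrete, here is a deliberately simplified sketch (all class and site names are invented): writes funnel through a primary site and carry a version number, while reads are served from the replica nearest the user.

```python
class GeoStore:
    """Toy multi-site store: writes go through a primary site and are
    replicated everywhere with a version number; reads are served from
    the site nearest the user for low latency. The version lets a real
    system detect stale replicas, which is the consistency problem."""

    def __init__(self, primary, replicas):
        self.primary = primary
        self.data = {site: {} for site in [primary] + replicas}
        self.version = 0

    def write(self, key, value):
        # All writes funnel through the primary, then replicate out.
        self.version += 1
        for site in self.data:
            self.data[site][key] = (value, self.version)

    def read(self, key, user_site):
        # Serve from the user's nearest site; fall back to the primary.
        site = user_site if user_site in self.data else self.primary
        return self.data[site].get(key)

store = GeoStore("toronto", ["london", "tokyo"])
store.write("doc1", "G20 agenda")
value, version = store.read("doc1", "tokyo")
```

Real systems replicate asynchronously and must reconcile conflicting versions; this sketch only shows why the topology matters.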
2. Leverage
Deploying enterprise software is inherently expensive due to its complexity, its integration with other enterprise systems, customization, and the above-mentioned performance requirements. Because of this cost, economies of scale matter: a deployment becomes more cost-effective when it can be used for more applications. Similar classes of applications can share a significant portion of their infrastructure across multiple use cases. This kind of leverage is a characteristic of a scalable system.
3. Downward Scaling
While most scalability concerns deal with growth, modern applications also need to be able to scale down easily when the capacity is not required. High volumes of users, transactions, or data usually come in peaks, and provisioning a system for the peak is exceedingly expensive. The only economical solution is to have multiple systems share resources via a grid or virtualized environment. Any software that hogs resources and does not share them well with other applications is simply not scalable.
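A minimal capacity-planning sketch illustrates the point (the numbers and function name are made up): a well-behaved application computes how many nodes the current load actually needs and releases the rest back to the shared pool.

```python
def plan_capacity(load, capacity_per_node, min_nodes=1, max_nodes=20):
    """Return how many nodes to run for the current load.

    Downward scaling matters as much as upward: when load falls, the
    application should shrink to the minimum footprint instead of
    holding on to peak-sized resources.
    """
    needed = -(-load // capacity_per_node)  # ceiling division
    return max(min_nodes, min(max_nodes, needed))

# Peak traffic needs many nodes; overnight the same app shrinks.
peak = plan_capacity(load=9500, capacity_per_node=1000)   # 10 nodes
quiet = plan_capacity(load=300, capacity_per_node=1000)   # 1 node
```

In a grid or virtualized environment, the freed nodes immediately become available to whichever application is peaking instead.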
4. Customization
No two enterprises are the same. There may be similarities across industries or geographies, but enterprises are like living organisms and every single one of them is different. Therefore, any software deployment, on premises or in the cloud, requires some degree of customization to adapt the software to the needs of the organization. This customization may include adaptation of the user experience, definition of the data model, integration with other systems, etc. As the organization grows in any dimension, the amount of customization often grows as well. The degree to which software makes such customization possible and easy is also an aspect of scalability.
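One common way to achieve this, sketched below with entirely invented field and system names, is a configuration layer: the core product stays unchanged while per-customer overrides adapt the user experience, the data model, and the integrations.

```python
# Hypothetical customization layer: product defaults plus per-customer
# overrides. Every name here is illustrative, not a real product API.
DEFAULTS = {
    "ui": {"brand": "Generic ECM", "locale": "en"},
    "fields": ["title", "author", "created"],
    "integrations": [],
}

def customize(base, overrides):
    """Merge customer-specific overrides onto the product defaults."""
    return {
        "ui": {**base["ui"], **overrides.get("ui", {})},
        "fields": base["fields"] + overrides.get("extra_fields", []),
        "integrations": overrides.get("integrations", []),
    }

acme = customize(DEFAULTS, {
    "ui": {"brand": "Acme Corp"},          # adapt the user experience
    "extra_fields": ["case_number"],       # extend the data model
    "integrations": ["crm", "erp"],        # connect to other systems
})
```

The scalability question is how far such a layer can stretch before customers need to modify the product itself.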
Enterprise software needs to be scalable, but scalability is not just about performance.
Tuesday, June 8, 2010
Google Search and the Tip of the Iceberg
Not so. Google only exposes content that wants to be found: content on the public Web that is not protected by any security. The vast majority of content, however, resides behind a login challenge that prevents Google from indexing it. In fact, the 2003 study “How Much Information?”, conducted by the University of California, Berkeley, estimated that less than 1% of content is exposed on the public Web (the “Surface Web”) while the majority resides in the “Deep Web”. While the Internet has evolved since 2003, the percentage of content not indexed by Google and other search engines remains huge.
Searching secure content is not trivial, which is why Google has only a modest presence in the enterprise. Access restrictions apply not only to the content itself but also to the index and the search results. Not only should you be prevented from opening a document you are not authorized to see, you should not even see a link with the document's name in the search results. So Google takes the easy way out by searching only unprotected content. Search with security, an absolute must in the enterprise, is a much harder nut to crack.
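The requirement that unauthorized users never even see a hit can be sketched as result trimming against an access control list. This is a toy model, not how any particular engine implements it, and the document names and ACLs are invented.

```python
def secure_search(query, index, acl, user):
    """Security-trimmed search sketch: a matching document appears in
    the results only if the user is on its access control list, so an
    unauthorized user never even sees the document's name."""
    hits = [doc_id for doc_id, text in index.items()
            if query.lower() in text.lower()]
    return [doc_id for doc_id in hits if user in acl.get(doc_id, set())]

index = {
    "memo-1": "Q3 merger plans",
    "press-1": "Public statement on the merger",
}
acl = {"memo-1": {"alice"}, "press-1": {"alice", "bob"}}

# Bob's query matches both documents, but he only sees the public one.
print(secure_search("merger", index, acl, "bob"))   # ['press-1']
```

Trimming each result against per-document permissions is what makes enterprise search so much more expensive than public Web search, where every indexed page is readable by everyone.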
There is another trick that makes Google's job easier: it searches content that actively wants to be found, content that has been optimized for search. An entire industry called Search Engine Optimization (SEO) spends millions of dollars making sure content employs every trick to ensure that Google can find it easily and rank it as relevant. That is, of course, not the case for Deep Web content, not to mention documents and other types of content in the enterprise. And PageRank, Google's famed relevance algorithm, is based on hyperlinks between Web pages, which barely exist in the world of enterprise content.
The public Web searchable via Google only represents the tip of the iceberg and searching secure content on the web or content in the enterprise is much more difficult.
Thursday, May 27, 2010
Metadata - The Secret Value of Social Media
Metadata is essential for any content. Without good metadata, content is difficult to find and impossible to process. But good metadata is very difficult to get. Traditionally, users were expected to add it at the time of content creation and throughout processing, but users are notoriously unmotivated to do so, and they won't. Automatic approaches based on content analytics are promising but still nascent (see my article The Problems Waiting to Be Solved). Yet here comes a third way to create metadata: social media.
Just think about what’s happening on Facebook. Millions of users are socially interacting. They are telling each other their opinions about what they like and don’t like, what they do, what they need. This is the online version of an Italian piazza or an American sports bar on a football night. And they do it of their own accord, without corporate policies, and without having to adhere to any taxonomy. Mining these interactions is relatively easy and the result is information about people, processes, and content. And that’s how we define metadata.
Unfortunately, Facebook owns this metadata and will be selling it to advertisers a bit at a time. And that really worries Google. If you want to find a good book or a good store, you can either search for it or get a recommendation. You can trust a search algorithm that relies on keyword prominence and link frequency, or you can trust your friends, whom you know and with whom you share interests and values. That's why the metadata generated from social media is more valuable than any search index, and why ads based on social media are much more relevant than ads based on search. And that's why Google fears Facebook.
The same applies to private social media deployments on your intranet or web site. The high-quality metadata generated by social media in an enterprise is just as valuable. It can drive use cases such as recommendations, expertise location, knowledge discovery, or suggestive records filing. Yes, the value of social media lies not only in user engagement and interactions; it is also a great source of metadata. And that's the secret value of social media.
Sunday, May 23, 2010
EMC Content Management Family Tree

My view is not meant to outdo Alan's; it simply provides a different perspective. Alan did a great job on the Open Text family tree, by the way.
Ping me if you have any further details to add.
Friday, May 21, 2010
What’s Apple’s Next Move?
So, what's next for Apple? What other types of content can Steve Jobs realistically make us want to pay for? Well, multiple content types have been digitized without an apparent monetization model: newspapers, magazines, radio, maps, classified ads, etc. These are the candidates for Apple's next move. However, Apple requires a couple of ingredients for the model to be successful: the content has to be valuable, protected by copyright, and threatened by piracy.
To monetize copyrighted digital content that is too easy to copy illegally, Apple has built a complete vertical stack based on a closed system. You want the cool device and you need content to use it. The content comes from Apple’s own iTunes store, it is managed by Apple’s iTunes application, and it is delivered in formats controlled by Apple (and not by Adobe). This is the only effective way to fight content piracy.
Many of the not-yet-monetized content types fit the bill. Newspapers and magazines have been fighting the commoditization of their content for years with no success. The radio industry has also been in continued decline, although the temporary success of Sirius radio suggests that people are willing to pay for good-quality streamed content. And yet, the world will hardly become a place with no news, no magazine articles, and no radio.
We can expect that as more newspapers, magazines, and radio stations go bust, the remaining, strongest content providers will accumulate enough negotiating power to change the rules. They will likely embrace the closed vertical stack and find a way to monetize their content the same way Hollywood did (or was forced to do). In other words, Apple is likely to team up with outfits such as Bloomberg, Reuters, Associated Press, Newswire, Newscorp, Turner, and others to start selling their content, likely repackaged, personalized, and with a dose of social interaction. High-quality, personalized news and editorials delivered to your iPad for a fee, and you will pay it and think it's awesome.
This will not be the same as the Wall Street Journal trying to charge you for a subscription today. That's a losing battle: offering the same content package to everyone when all that information is available for free anyway. Reading news on a laptop screen is boring compared to the pre-packaged apps available for the iPad. It will also not be the same as what AOL tried to do by purchasing Time Warner back in 2001; that was an attempt to control content creation, which is not what Apple does. Apple will resell the content via content apps tailored to its devices and let the creators compete among themselves.
My prediction is that Apple will continue to eat away at Google's attempt to make all information available to everyone for free (well, in exchange for your privacy). Apple's strategy is to lock the information in a vertical stack and charge you for every bit. News, editorial content, and streamed media are the most likely next move.
Tuesday, May 18, 2010
8 Business Use Cases for Twitter
I have been using Twitter for about 18 months: at first just curious, then tentative, until eventually I “got it”, or at least some of “it”. I find that I use Twitter more and more for work, following a community of like-minded individuals ranging from industry leaders, analysts, bloggers, and gurus to competitors. I keep finding great applications for the conversations that occur on Twitter, and I appreciate its open, public nature and its brevity. And so I have captured some of the best use cases in my guest article on Digital Landfill:
Friday, May 7, 2010
Don’t Use Strong Passwords
Using strong passwords is a great idea. That is, until you have to do it for 20 to 30 different systems. You all have passwords for your work environment, ideally just one, since your organization uses single sign-on, right? Yeah, right... And we all have numerous passwords for online services such as banks, brokerages, retirement plans, frequent-flyer programs, social networking sites, utilities, retail stores, etc. I like the security provided by strong passwords, but how am I supposed to remember them all?
Users deal with this in different ways, such as using the same password for every site or keeping a list of passwords. This, however, introduces a much greater security vulnerability than the off-chance that someone will guess a weaker password. You may use specialized encryption software to manage your passwords, but that represents a potential single point of failure and requires discipline to keep the password list up to date.
The problem is that many sites and systems require a strong password because of a policy set by an overzealous administrator. Sure, I want to protect my bank account really well, since all my savings could vanish with a single mouse click (or finger touch), but the same is not true for my utility bills. There is nothing anyone can do on the utility's site except view and pay my bills, and any intruder is welcome to pay them for me. I don't need that strong a password there. Similarly, many shopping sites don't need a strong password as long as I don't save my credit card info on the site.
The utility bill could be a privacy issue for some; if I were a politician or a celebrity, I might not want the world to see what an energy hog I am. Well, in that case I still have the choice to use a strong password. The bottom line is that administrators should think twice before imposing strong passwords on their users. They should give them the choice. Password strength needs to be adequate to the value of the information it protects and the risk a breach would represent. Weaker, easy-to-remember passwords are perfectly appropriate for many applications, especially if they keep users from maintaining password lists. The result can be a more secure environment overall.
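The tiered policy argued for above could be sketched like this. The tiers, lengths, and function names are all hypothetical, just an illustration of matching password requirements to account value rather than applying one blanket rule.

```python
def required_strength(account_value):
    """Hypothetical tiered policy: requirements grow with the value of
    what the account protects, instead of one blanket rule."""
    if account_value == "high":      # bank, brokerage
        return {"min_length": 14, "require_symbols": True}
    if account_value == "medium":    # shopping site with stored card
        return {"min_length": 10, "require_symbols": False}
    return {"min_length": 6, "require_symbols": False}   # utility bill

def acceptable(password, policy):
    """Check a password against a tier's policy."""
    has_symbol = any(not c.isalnum() for c in password)
    return (len(password) >= policy["min_length"]
            and (has_symbol or not policy["require_symbols"]))

# A short, memorable password passes for a low-value account...
print(acceptable("maple42", required_strength("low")))    # True
# ...but the same password fails the bank's tier.
print(acceptable("maple42", required_strength("high")))   # False
```

The point is not these particular numbers but that the policy, not the user, should encode how much the account is worth protecting.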