Showing posts with label documents. Show all posts

Saturday, November 4, 2023

The Lost Art of Content Management

After leaving the ECM space a few years ago, I had the chance to see how companies, those that don't sell content management products, manage their content. What I discovered is a disturbing mess.

In most companies, content like documents, spreadsheets, images, and more is left unmanaged. Thanks to cloud-based office suites like Microsoft 365 and Google Workspace, sharing documents has become incredibly easy. You no longer need to send documents via email (which was a mess by itself), and you don't have to put documents in shared folders either. Nowadays, you can share documents right from where they were created. Unfortunately, most people's default sharing location is their personal folder. 

That's right, most knowledge workers share their documents directly from their personal folder - with some disastrous consequences. By not using a shared, well-organized repository for their documents, many issues arise. As each worker controls their own sharing permissions, they tend to be either overly restrictive or overly open. Sharing with everyone by default creates obvious security vulnerabilities. Sharing only with a specific list of individuals is more secure, but it prevents new colleagues from leveraging the content later.  


The concept of a central repository is quite simple. It is a structure of containers (folders) from which documents inherit their properties, including access permissions. When organized logically, it is easy to navigate. If, for example, you need to find all Engineering projects from Q2 2022, you'd navigate through the folders Engineering -> Projects -> 2022 -> Q2 to find them. Since you might not know the project names or topics, relying on search would be a poor option, not to mention dealing with those pesky access permissions (you can't find what you can't access). This is where the hierarchical structure shines. A decent content management system handles access permissions while helping employees navigate the structure and locate relevant content without the need to search.
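To make the inheritance idea concrete, here is a minimal sketch in Python. It is not how any particular product works; the `Folder` class, the `readers` field, and the group and document names are all made up for illustration. It only shows that a document deep in the tree gets its access from the containers above it, and that a known path leads straight to the content.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Folder:
    """A container; everything below it inherits its read permissions."""
    name: str
    parent: Folder | None = field(default=None, repr=False, compare=False)
    # Groups granted read access on this folder itself (hypothetical model).
    readers: set[str] = field(default_factory=set)
    children: dict[str, Folder] = field(default_factory=dict)
    documents: list[str] = field(default_factory=list)

    def subfolder(self, name: str) -> Folder:
        """Return the named child folder, creating it if it does not exist."""
        if name not in self.children:
            self.children[name] = Folder(name, parent=self)
        return self.children[name]

    def effective_readers(self) -> set[str]:
        """Readers granted here plus those inherited from every ancestor."""
        inherited = self.parent.effective_readers() if self.parent else set()
        return inherited | self.readers

    def resolve(self, path: str) -> Folder:
        """Walk a path such as 'Engineering/Projects/2022/Q2'."""
        node = self
        for part in path.split("/"):
            node = node.children[part]
        return node


# Build the example structure from the text (names are illustrative).
root = Folder("Company")
engineering = root.subfolder("Engineering")
engineering.readers.add("engineering-staff")
q2 = engineering.subfolder("Projects").subfolder("2022").subfolder("Q2")
q2.documents.append("Example project plan.docx")

found = root.resolve("Engineering/Projects/2022/Q2")
print(found.documents)
print(found.effective_readers())
```

Permission is granted once, on the Engineering folder, and a new colleague added to that group can read every project below it without anyone re-sharing individual files.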


Another effective content management tool for organizing content is tagging or classification. It allows users to find content using filters as an alternative to hierarchical navigation and search. That is how good content libraries are organized. Unfortunately, few bother to tag content today. 


With no efficient way to find and organize documents, employees are left to manage links to documents on their own. I've seen some crazy approaches forced by necessity - extensive browser bookmark collections, documents full of links, and browsers with over 100 tabs open. The only alternative is search, assuming you know what you are searching for. Search will not tell you what else you should know when working on this project.


When an employee leaves, their documents are typically inherited by their supervisor. However, supervisors rarely have the time to review those documents, so they end up in a subfolder, never to be looked at again.


It's a nightmare. 


Of course, you can create a shared folder structure in Google or Microsoft; you can even use tags. In reality, though, few people do that because nobody told them to do so. The IT department doesn't pay any attention to this issue (although they should) because they are too busy supporting hundreds of cloud apps, from Anaplan to Zuora. 


Knowledge workers don't worry about content management because they don't know any better. They were promised consumerization, where using software at work should be as easy as using Facebook at home. Nobody thought they needed training. But it turns out, they do. Without proper training, you end up with a mess. 


This chaos has opened the door for a range of new companies. Project Management software like Asana or Wrike is essentially a bunch of shared folders with a dashboard on top. Brand Management software like Brandfolder and Bynder is just a simple digital asset management system in a shared container. Intralinks Deal Rooms are shared folders marketed to investment bankers. Seismic is just a shared folder with better tagging. All of these could be created in a regular content management system like Box. The reason these companies exist is NOT because Box is lacking some features.


Box understands the concept of content management and has built a fantastic product. However, Box, along with other content management vendors (sorry, I use Box and so I like to pick on it), has failed to create a market. Yes, there are companies in regulated industries like Life Sciences and Financial Services that are so heavily regulated that they have no choice but to use a content management system. But most companies don't see the point. Instead of using Box, they end up purchasing Asana, Brandfolder, and Seismic and then wonder why they have high IT costs and low productivity.


That is the problem that Box should be solving. It should establish a market by educating companies and knowledge workers on the importance of managing information. It should stop chasing the latest trends like generative AI because those won't result in a single new customer. The key to creating a market is explaining to people why they need to care about managing content.


Because right now, nobody cares. 


Sunday, September 21, 2014

It’s Not Just About the Unstructured Data

For well over a decade, the content management world has been laying claim to unstructured data. The argument usually goes something like this:

Structured data is the information that comes in the form of numbers, words, dates, percentages, and currency amounts that all fit neatly into the rows and columns of a database. Unstructured data, on the other hand, consists of documents, images, web pages, video files, CAD drawings, and PowerPoint files for which a database is ill suited and that thus require specialized technologies to ingest, analyze, manipulate, share, and archive them. This unstructured data, or content, represents over 80% of all the data in the enterprise. BTW, I’m pretty sure that Gartner made up that 80% number.

I admit that I was one of the early pioneers of this message and I carried it dutifully for years. The entire content management industry did that. But the more I learn about what customers really want, the more I realize that we have all been wrong.

Because customers don’t care about managing unstructured data.

What customers want are applications that address real business problems. Real business problems require real information, and that almost always comes in both structured and unstructured form. In fact, I can hardly think of an application that doesn’t need to combine both types of data.

Take Invoice Processing. There is the structured data like the name of the supplier, the date, the list of goods, the total, etc. But there are also the invoice itself, the bill of lading, the damage reports and pictures, and other unstructured data.

How about Employee File Management? You have the employee files such as the original job application, resume, contract, performance reviews, and training certificates. All of them are unstructured documents or scanned images. But you also have the reporting structure, salary data, bank account info, benefits, bonus attainment, and other structured data.

In most applications, the structured and unstructured data need to be used together. Sure, the data may need to be kept in different containers: structured data in a database and unstructured data in the repository of a content management system. But using one without the other doesn’t solve real business problems.

I think that the myopic focus on unstructured data has hurt the enterprise content management (ECM) industry. Sure, we need the specialized software that can manage the unstructured data, but ultimately, customers need applications that can handle both structured and unstructured data together in a single solution.

Tuesday, August 21, 2012

Records Management Is Easy, Disposal is Hard

Records management is one of the traditional disciplines in the vast field of enterprise content management. The purpose of records management is to satisfy the regulators and the court of law by ensuring that official records of transactions and activities are being preserved for future reference. The regulators typically prescribe a retention period - the length of time for which your organization needs to keep the record.

From the outset, records managers focus on making sure that all records are being retained, that they cannot be tampered with, that the retention period is being enforced, and that the records are properly classified so that they can be easily found when requested. The more sophisticated records management solutions also deal with advanced capabilities such as access control, storage optimization and legal holds to pause the retention clock in the case of a lawsuit.  

However, the most important part of records management is, in my opinion, the disposition. The idea of disposition is pretty straightforward - once the retention period expires, the best practice in records management is to dispose of the records that are no longer needed. There is nothing shady about this in the manner of Enron; it is a perfectly legal and recommended practice. Reliable records disposition, though, is very hard.
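The retention and disposition rule itself is the easy part, as a rough Python sketch shows. The `Record` class and its fields are hypothetical, and the legal hold is simplified to a flag that blocks disposal while it is set; a real system would track when the hold started and ended and would follow whatever the applicable regulation prescribes.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Record:
    """A declared record with a prescribed retention period (illustrative)."""
    title: str
    declared_on: date
    retention_years: int
    legal_hold: bool = False

    def retention_expires(self) -> date:
        """Date on which the retention period ends."""
        target_year = self.declared_on.year + self.retention_years
        try:
            return self.declared_on.replace(year=target_year)
        except ValueError:
            # Declared on Feb 29 and the target year is not a leap year.
            return self.declared_on.replace(year=target_year, day=28)

    def eligible_for_disposal(self, today: date) -> bool:
        """True once retention has expired and no legal hold applies."""
        if self.legal_hold:
            return False
        return today >= self.retention_expires()


invoice = Record("Supplier invoice", date(2012, 8, 21), retention_years=7)
print(invoice.retention_expires())
print(invoice.eligible_for_disposal(date(2019, 8, 21)))

invoice.legal_hold = True
print(invoice.eligible_for_disposal(date(2025, 1, 1)))
```

Note what the sketch does not cover: it decides when the official record may be disposed of, and says nothing about where the copies are.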


Photo by bartmaquire (Flickr)

Indeed, filing records and locking them up for the prescribed number of years is not trivial but it is a solved problem today. Disposing of the record in the official records repository is also relatively easy. The problem is to dispose of all copies of the record. That’s right, records disposition is pointless unless you can ensure that the record has been completely expunged. Gone. Forever. If not, you can rest assured that a copy of the record will be found by investigators or by a subpoena and it can and will be used against you.

But reliable and secure disposition of a record and all its copies is the tough part.

Chances are that a copy of your record exists in more than a dozen locations - on your co-workers’ desktops, on various servers, on SharePoint sites, and as an attachment in many email inboxes. Add all the iPads and other mobile devices to the mix and combine it with the popular cloud-based file sharing services such as Dropbox, Microsoft SkyDrive, Apple iCloud, Amazon Cloud Drive, or Google Drive and you have a very challenging scenario for records disposal. How can you ever be sure that you are expunging all copies of your records?

There are ways to solve this challenge. It starts with a common enterprise governance infrastructure that applies de-duplication across your email and all servers. That way, the record only exists in one instance while keeping the links to all the SharePoint sites and email inboxes. It also requires the ability to give employees a secure alternative to Dropbox that can be part of the same de-duplication infrastructure. In extreme cases where you know that your documents are regularly shared with external parties, the solution may need to involve rights management as well. While I usually try to stay away from blatantly promoting my employer’s products, we have some really good solutions for all of that.
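The single-instance idea can be sketched in a few lines of Python. This is an illustration only, not a description of any vendor's product: the `DedupStore` class and the location strings are invented, and it treats two files as the same record only when their bytes are identical (SHA-256 match). It will not catch an edited copy, and it can only see locations that are registered with it.

```python
import hashlib


class DedupStore:
    """Keeps one stored instance per unique content; locations are links."""

    def __init__(self) -> None:
        self._blobs: dict[str, bytes] = {}   # digest -> the single instance
        self._links: dict[str, str] = {}     # location -> digest

    def put(self, location: str, content: bytes) -> str:
        """Register content at a location, storing the bytes only once."""
        digest = hashlib.sha256(content).hexdigest()
        self._blobs.setdefault(digest, content)
        self._links[location] = digest
        return digest

    def locations_of(self, digest: str) -> list[str]:
        """Every known location that links to this content."""
        return sorted(loc for loc, d in self._links.items() if d == digest)

    def dispose(self, digest: str) -> list[str]:
        """Remove the single instance and all links; return what was removed."""
        removed = self.locations_of(digest)
        for loc in removed:
            del self._links[loc]
        self._blobs.pop(digest, None)
        return removed

    def instance_count(self) -> int:
        return len(self._blobs)


store = DedupStore()
contract = b"Signed supplier contract, final version"
record_id = store.put("sharepoint://legal/contracts/supplier.pdf", contract)
store.put("mail://alice/inbox/attachment-1", contract)
store.put("mail://bob/inbox/attachment-7", contract)

print(store.instance_count())
print(store.dispose(record_id))
print(store.instance_count())
```

Because every inbox and site holds a link rather than its own copy, disposing of the one instance disposes of the record everywhere the store knows about. Copies on an iPad or in a personal Dropbox account remain out of reach, which is why the secure file-sharing alternative and rights management matter.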

Don’t get fooled into believing that you have solved your records management problems by applying retention rules to your documents. While that may satisfy the regulators, it won’t address your need to reduce unnecessary liability. Reliable records disposal is difficult but very important. Because you can be sure that if a copy of that smoking gun document exists on someone’s iPad or in Dropbox, it will be found when you least expect it.