In today’s world, Ask.com, Google, Yahoo and others have changed the way we search, and have freed most people from the SELECT, FROM, WHERE and all the other components of a SQL statement.
Several other companies have also approached the search/retrieve process from a non-SQL perspective. One example is MetaCarta, with generalized geographic searches. Another is Bentley, with the ability to search down to the component level in infrastructure. In addition, there are those companies in the competitive intelligence space that offer targeted searches on unorganized data. Finally, note the recent announcement by SRC that the company now offers search capabilities for articles and news within a defined geography.
The obvious reason for the unstructured data search is that most data don’t live in a table or a database. They are found in documents, drawings, graphics and sometimes just as a blob of unstructured text or, worse yet, handwritten text. Imagine that on your tablet PC you have a map, a table, some handwritten notes and a photo on the screen. How do you respond when asked “Save As…”?
In this newsletter I report on a company called QL2 that makes software for Web searches. The company creates agents that go out and find what you have tasked them to find. You get to see the results in real or near-real time. After you see what comes back, you can adjust the agent using a series of tools to “tune” the result. The search results can be a combination of text, tables, graphics and links. You could put the results in a Word document, maybe Notepad, and even Paint, so they stay unformatted and easily readable. Or you could put them in a database. How do you efficiently search for them next time? Do you need a new agent to get them out of the unstructured environment, or if they are in a database, do you put the graphics and random stuff in as a blob?
A lot of effort has gone into creating search engines that find almost anything in a multi-format, multi-medium world. It seems like we have the “find” problem pretty well solved. The problem now seems to be, once we find it, how do we keep all this disparate stuff in a generally organized form?
If data for the most part are unorganized, can you make the case that the 20th-century database is way too restrictive, not just in storage, but in its organizational constructs? Is the table structure really a case of square pegs and round holes? Perhaps we are moving toward a storage system that is more like our brain, in which whim, fantasy and indiscriminate ponderings can rule.
A database, with its tables, rows, columns and cells, can work, but it is a compromise that only works really well for things that fit in a cell. Even cells have to be adjustable in order to fit wide items. You can stuff graphics, video or sound in a cell, but if you store a lot of them, it really impacts storage and performance. I am looking forward to seeing how these challenges will be addressed by all those brainiacs out there…
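
For readers who want to see the cell compromise in concrete terms, here is a minimal sketch using Python's built-in sqlite3 module. The table name, column names and sample values are all hypothetical, chosen only for illustration, and the "photo" is just a run of zero bytes standing in for a real image. It puts a short text note and a blob in the same table and compares how much each one occupies.

```python
import sqlite3

# Hypothetical schema for illustration only; not taken from any product named above.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE items (id INTEGER PRIMARY KEY, kind TEXT, note TEXT, payload BLOB)"
)

# A short text note fits in a cell naturally.
conn.execute(
    "INSERT INTO items (kind, note) VALUES (?, ?)",
    ("note", "Survey crew on site Tuesday"),
)

# Stand-in for a photo: 2 MB of zero bytes, not a real image.
fake_photo = bytes(2 * 1024 * 1024)
conn.execute(
    "INSERT INTO items (kind, payload) VALUES (?, ?)",
    ("photo", fake_photo),
)

# LENGTH() counts characters for text and bytes for blobs.
# Sum the size of each kind of item to compare what the cells hold.
sizes = dict(
    conn.execute(
        "SELECT kind, SUM(COALESCE(LENGTH(note), 0) + COALESCE(LENGTH(payload), 0)) "
        "FROM items GROUP BY kind"
    ).fetchall()
)
print(sizes)
```

Both items sit in the same table, but the one stand-in photo outweighs the note many thousands of times over, which is the storage side of the compromise described above.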