The magic of data visualization for everyone

Every day I am amazed afresh by the transformative power of the Web. Today I discovered Many Eyes, a site hosted by IBM’s AlphaWorks that combines open participation with a wonderful set of visualization tools. Anyone can upload data sets and then create sophisticated visual representations of them, including scatterplots, tree maps, histograms, bubble diagrams, network maps, and more. Anyone can then reuse the data sets, create new visualizations, add comments, or blog about the visualizations. To try it out, I created in around one minute a bubble diagram of the frequency of words in the English language (see below for the non-interactive diagram – I won’t link directly, as I think generating the diagram is rather resource-intensive, but have a look at the visualization gallery that includes it).

In the first edition of my book Developing Knowledge-Based Client Relationships, published in 2000, I wrote about both data visualization and concept visualization (which uses visual representations to convey concepts rather than information). Both will be fundamental in a world in which we are swamped with information. While I haven’t spent as much time on visualization in recent years, I am shifting back towards this space, not least in helping clients easily understand and respond to strategic issues.

[Image: wordfrequency.jpg – bubble diagram of word frequencies in English]
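
For anyone who wants to play with this idea outside Many Eyes, here is a minimal sketch of a word-frequency bubble diagram in Python using matplotlib. The frequency figures are illustrative placeholders, not the actual data set I uploaded.

```python
# A minimal sketch of a word-frequency bubble diagram, using matplotlib
# rather than Many Eyes. The frequency counts below are illustrative only.
import random
import matplotlib.pyplot as plt

# Placeholder frequency counts for common English words (not real data).
frequencies = {
    "the": 56000, "of": 34000, "and": 30000, "a": 26000, "in": 17000,
    "to": 17000, "it": 14000, "is": 10000, "was": 9800, "I": 8900,
}

random.seed(42)  # fixed layout so the chart is reproducible
xs = [random.random() for _ in frequencies]
ys = [random.random() for _ in frequencies]
sizes = [count / 10 for count in frequencies.values()]  # scale bubble area

fig, ax = plt.subplots(figsize=(8, 6))
ax.scatter(xs, ys, s=sizes, alpha=0.4)
for x, y, word in zip(xs, ys, frequencies):
    ax.annotate(word, (x, y), ha="center", va="center")
ax.set_axis_off()
ax.set_title("Word frequency bubble diagram (toy data)")
plt.show()
```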

In Tim O’Reilly’s very interesting post about the site, he asked Martin Wattenberg and Fernanda Viegas, the people who conceived Many Eyes, about their inspiration. Fernanda calls it “social data analysis,” in which “playful, social exploration of data leads to serious analysis”. Martin says that their goal is to “democratize” visualization. These are seriously valuable tools being provided for free to the community at large, where one person can use them for their own purposes, then have their ideas taken up and developed further by others. I’ve long used AlphaWorks as one of the best and earliest examples of open innovation. It’s great to see IBM both offering this kind of value to the community and having it fully integrated into its business models. Note that another social data visualization site, Swivel, launched before Many Eyes – it doesn’t appear to have as rich visual functionality, as Brian Dennis notes, but has far more data sets uploaded for people to play with.

Robots, Akihabara, and a baby…

Last week, wandering around Tokyo, I decided to check out what sort of consumer robots were available. I ended up initially finding the Kondo Robo-Spot in Akihabara, where Kondo, a manufacturer of radio controllers and kit robots, has a demonstration facility. The video below shows the robot kicking a ball into a goal, doing pushups and cartwheels, bowing, and waving its arms in celebration – a pretty impressive display. It also shows the reaction of my five-month-old daughter Leda to the robot.

The Kondo robots are available only in kit form, selling for a little less than $1,000 and taking five or so hours to assemble. It turns out that there is a large market of robot “otaku” in Japan, who prefer to assemble robots rather than buy them complete, with a magazine dedicated to kit robots. The Kondo robot and its brethren are humanoid, and are able to walk and perform basic functions by virtue of compressed air-powered “muscles”, directed by people through wireless controllers. While the controllers can be programmed to make the robots perform complex movements, such as cartwheels, the robots are not autonomous, so they are more mechanical wonders than self-directing robots.

There is no question that Japan’s future (and past) is deeply enmeshed with robots. It is arguably the country experimenting most seriously with what robots can do. I have written before about cuddly seal robots for therapy, lifelike doppelganger robots, household robots, and other explorations of the boundaries of robotics in Japan. The human race is on the verge of creating robots that become part of our everyday lives, yet we are still far from discovering precisely what roles robots will play in our future. I think it will be fun having robots around to help us.

Techmeme and finding the most interesting conversations

I have always said – particularly to those who don’t understand blogging – that blogs are not necessarily important individually, but in aggregate they are massively powerful. The “blogosphere” pulls together what millions of talented people around the world are discovering and thinking. Collectively, blogs enable us to collaborate to filter and uncover the most worthwhile news. As I wrote in my second book Living Networks, we are currently all participating in the birth of a global brain, and the world of blogs makes visible our collective stream of consciousness.

In that vein, Techmeme is one of the top three sites I refer to – often several times a day – to discover the most interesting technology news of the moment. It is an automated site that tracks a continually evolving list of the top few thousand blogs, using a complex algorithm to pull out both the most discussed news items or blog posts at that moment and the current conversations among top bloggers that these have sparked. Because of its selective scope, you can be sure that the commentary is interesting and relevant. More importantly, the site uncovers conversations, discussions, points of difference, and disagreements, creating a view on the news that is far more than the sum of its parts. Techmeme’s sister sites – memeorandum for politics, WeSmirch for celebrity gossip, and Ballbug for baseball – fulfil the same role for other topics. Memeorandum in particular provides fascinating insights into the American political debate, and how topics are viewed by partisans of both left and right.
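
Techmeme’s actual algorithm is proprietary, so purely as an illustration of the underlying idea, here is a toy sketch that groups recent posts into “conversations” by the story they link to and ranks stories by how many distinct blogs are discussing them. All blog and story names are hypothetical.

```python
# Toy sketch of Techmeme-style clustering: group recent posts by the story
# URL they link to, then rank stories by how many distinct blogs discuss
# them. Illustration only -- Techmeme's real algorithm is proprietary.
from collections import defaultdict

# Hypothetical recent posts: (blog, post_url, linked_story_url)
posts = [
    ("alpha.example.com", "/p1", "http://news.example.com/story-a"),
    ("beta.example.com",  "/p2", "http://news.example.com/story-a"),
    ("gamma.example.com", "/p3", "http://news.example.com/story-b"),
    ("alpha.example.com", "/p4", "http://news.example.com/story-a"),
]

conversations = defaultdict(set)  # story -> set of blogs discussing it
for blog, post_url, story in posts:
    conversations[story].add(blog)

# Most-discussed stories first, mimicking a headline-plus-discussion view.
for story, blogs in sorted(conversations.items(), key=lambda kv: -len(kv[1])):
    print(f"{story}: discussed by {len(blogs)} blogs -> {sorted(blogs)}")
```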

Danny Sullivan of SearchEngineLand has just interviewed Techmeme founder Gabe Rivera, providing some fascinating insights into the site. I was very surprised to discover that Techmeme has only around 30,000 daily unique visitors, a tiny number compared with sites like Digg.com, which has around 1.5 million per day. Apparently Techmeme is still a tool used by the cognoscenti, including many journalists who use it to discover news stories. However, to my mind, Techmeme and its sister sites rank alongside Technorati as the most valuable tools for uncovering the power of the blogosphere. Joshua Jaffe of The Deal says he’s convinced that Techmeme will be acquired by Google for a stupendous sum. I certainly have no doubt that Techmeme or similar tools for tracking insightful online conversations will soon come to the fore.

Reinventing HTML and the evolution of standards

Tim Berners-Lee, the creator of the World Wide Web, has just announced that the World Wide Web Consortium (W3C) will establish a completely new working group for the development of HTML – in his words, “reinventing HTML”. This bold move has been prompted by the too-slow shift away from a mark-up system that works reasonably well but is still flawed. Attempts to evolve HTML to a “well-formed” structure that draws on the power of XML have been stymied by most people’s satisfaction with the current standard.

In my book Living Networks I created a chart showing what I described as the gradual progress towards open, accepted standards – see below. On one level, HTML is a poster child for an open, accepted standard, shown on the upper right of the diagram. There are no competitors for HTML – it is fully accepted as the standard for representing content on the web, and the W3C, for all the criticism it garners, genuinely attempts to represent and incorporate the views of all stakeholders. Yet, as anyone who has been involved in a standards committee knows, maintaining and developing an existing standard, particularly one with the impact of HTML, is no easy task, with ample scope for personalities and politics, which will certainly have to be addressed in this case. The developer community seems split between the positive and enthusiastic on one side and the skeptical on the other, with plenty of interesting analysis on both sides.

[Figure 2-2: The gradual shift to open, accepted standards]

Open innovation in collaborative filtering

Netflix has just announced a $1 million prize for whoever can improve the accuracy of its movie recommendation engine. To enable people to design an improved recommendation engine, it has released 100 million of its users’ movie ratings, an extremely valuable database. This harks back to Canadian gold mining company Goldcorp’s initiative, in which it publicly released the geological data on its properties and set up a competition with prizes for whoever could give it the best recommendations on where to dig for gold. Other open innovation initiatives such as InnoCentive match a whole series of people looking for innovation, again providing pre-specified rewards for meeting specific parameters. Some note that the prize will mean a lot of people work for free, and it’s arguable that if you can indeed do better than the other competitors, you’ll be able to make more than $1 million from it commercially anyway.

The size of the prize indicates the value of enhancing the accuracy of collaborative filtering, as I’ve written about many times before. The more accurately Netflix can recommend movies to its customers, the more likely those customers are to stay with Netflix. For companies with other business models, greater accuracy directly impacts sales and revenue, so more and more energy and resources will be going into this space. Netflix has chosen to combine two of my passions – open innovation and collaborative filtering – so I will be very interested to see the results. Details of the prize are at netflixprize.com, which will provide a progress chart on how the competing teams are doing.
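
Netflix hasn’t published how its Cinematch engine works, but to give a sense of what a basic collaborative filter looks like, here is a small user-based sketch using cosine similarity over a toy ratings matrix. The users, movies, and ratings are all invented.

```python
# Minimal user-based collaborative filtering sketch (not Netflix's
# Cinematch): predict a user's rating for an unseen movie as the
# similarity-weighted average of other users' ratings. Toy data only.
import math

ratings = {  # user -> {movie: rating on a 1-5 scale}
    "ann": {"Heat": 5, "Alien": 4, "Amelie": 1},
    "bob": {"Heat": 4, "Alien": 5, "Amelie": 2},
    "cat": {"Heat": 1, "Amelie": 5, "Brazil": 4},
    "dan": {"Heat": 5, "Alien": 5},
}

def cosine(u, v):
    """Cosine similarity over the union of rated movies (missing = 0)."""
    movies = set(u) | set(v)
    dot = sum(u.get(m, 0) * v.get(m, 0) for m in movies)
    nu = math.sqrt(sum(r * r for r in u.values()))
    nv = math.sqrt(sum(r * r for r in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

def predict(user, movie):
    """Similarity-weighted average of other users' ratings for the movie."""
    num = den = 0.0
    for other, theirs in ratings.items():
        if other == user or movie not in theirs:
            continue
        sim = cosine(ratings[user], theirs)
        num += sim * theirs[movie]
        den += sim
    return num / den if den else None

print(predict("dan", "Amelie"))  # low: dan's tastes track ann and bob
```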

Web 2.0 and user filtered content

Tomorrow I’m heading off to the Influence conference run by Phil Sim’s Mediaconnect. The event brings together media and other influencers (I believe I’m labelled a “new media influencer” there) and corporates, discussing current trends in key technology sectors. I’m on the Web 2.0 panel tomorrow, so I thought I’d briefly capture here my introductory comments, on my chosen topic of

User Filtered Content.

The user filtering landscape

+ The primary focus recently has been on the explosion of user generated content, with Wikipedia, MySpace, YouTube and many others just the vanguard of an immense wave of content creation, unleashed by accessible tools of production and sharing. We are moving towards a world of infinite content, further expanded by the vast scope of content remixing and mashups.

+ With massively more content available, we need the means to filter it, to make the gems visible in the vastness of the long tail. Fortunately, Web 2.0 is in fact just as much about user filtered content as about user generated content.

+ As far more people participate in the web, as technologies such as blogging, social networking, photo sharing and more become easier to use, the collective ability of the web to filter content is swiftly growing, and will more than keep pace with the growth in content.

User filtering mechanisms

Clicks indicate popularity of specific content within a site (with many caveats).

Links are stronger and more valid votes on the value of content.

Ratings provide explicit opinions on quality.

Tags describe content with words, locations, and more (see the scoring sketch below).
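
Purely as an illustration, these four signals could be blended into a single content score along the following lines; the weights are arbitrary placeholders rather than tested values.

```python
# Toy sketch: blending the four user-filtering signals above into one
# score. Weights are arbitrary placeholders, not a recommendation.
from dataclasses import dataclass

@dataclass
class ContentSignals:
    clicks: int         # weak popularity signal, easily gamed
    inbound_links: int  # stronger, costlier-to-fake vote of value
    avg_rating: float   # explicit quality opinion, 0-5 scale
    tag_matches: int    # tags matching the searcher's interests

def score(s: ContentSignals) -> float:
    # A real system would normalize each signal before weighting it.
    return (0.1 * s.clicks
            + 1.0 * s.inbound_links
            + 2.0 * s.avg_rating
            + 0.5 * s.tag_matches)

print(score(ContentSignals(clicks=120, inbound_links=8,
                           avg_rating=4.2, tag_matches=3)))
```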

Web-wide and site-specific filtering

There are two primary ways of implementing user filtering: taking data from across the web, and from within a single site.

+ Google’s PageRank is a seminal example of web-wide user filtering, where aggregated linking behavior across the web helps people find relevant content (see the sketch after this list). Technorati more explicitly shows how many blogs link to other blogs or blog posts, to indicate their authority. Techmeme draws on the timing and relationship of new links to uncover current conversations.

+ Amazon.com’s book recommendations kicked off site-specific user filtering, notably by identifying related titles. Slashdot was for several years the primary site that enabled communities to select stories and rate each other’s commentary.

+ In two years Digg.com has reached over 1 million daily visitors with its core model of user filtering of content. Copycat and similar sites such as Reddit, Meneame, and Shoutwire have abounded. Finally, AOL-owned Netscape launched a Digg copy, providing mainstream media endorsement of the model.

+ Content sites such as YouTube, Flickr, MySpace, and Odeo all embed user filtering as core features of their services.
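
As referenced above, here is a minimal PageRank sketch using power iteration. It illustrates the web-wide filtering idea rather than Google’s actual implementation, and the four-page “web” is hypothetical.

```python
# Minimal PageRank sketch via power iteration, illustrating web-wide user
# filtering (a simplified model, not Google's production system).

def pagerank(links, damping=0.85, iterations=50):
    """links: dict mapping each page to the list of pages it links to."""
    pages = list(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        new = {p: (1 - damping) / n for p in pages}
        for page, outlinks in links.items():
            if not outlinks:  # dangling page: spread its rank evenly
                for p in pages:
                    new[p] += damping * rank[page] / n
            else:
                for target in outlinks:
                    new[target] += damping * rank[page] / len(outlinks)
        rank = new
    return rank

# Hypothetical four-page web: pages linking to one another.
web = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}
for page, r in sorted(pagerank(web).items(), key=lambda kv: -kv[1]):
    print(f"{page}: {r:.3f}")
```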

What’s next for user filtering

+ Effective user filtering will have increasing value, and there will be more plays in this space. Network effects will apply strongly to site-specific filtering; however, this will not preclude new players with better models gaining traction quickly. The move by Netscape to hire active raters away from Digg is an attempt to accelerate such shifts.

+ Social search engines such as Eurekster and Yahoo!’s Search Builder indicate the next level of sophistication in search, enabling filtering to be aggregated within specific communities rather than across the web at large.

+ Tools such as Last.FM and Yahoo!’s Launchcast will, with permission, use extremely detailed personal taste profiles to provide content filtering for individuals.

+ New mechanisms will emerge that draw on people’s web activities, tagging, and specific communities, combining these perspectives in various ways to create more refined user filtering. This filtering will increasingly be designed to be relevant both to people with particular interest profiles and to individuals.

Microsoft’s Zune player enables social networks for music

News is just out that Microsoft’s Zune MP3 player, due out before Christmas to compete with Apple’s iPod, will have social networking capabilities in addition to its core features of a 30GB disk and a 3-inch screen. Zune users will have the option of using the device’s built-in WiFi to send and receive music, playlists, videos, and photos with up to four other players. They can either broadcast these to any Zune player within range, or only to those of their selected friends. If they have the broadcast feature switched on, anyone permitted within range will be able to listen simultaneously to what they’re listening to.

This social networking feature, together with the device’s WiFi capabilities, is the only really significant feature difference from the iPod, and it is one that could actually shift users to the Microsoft player. Particularly for young people, music is fundamental to identity and relationships, and sharing music is truly at the heart of their social networks. Sharing around musical preferences was the initial premise behind MySpace, and while it has gone quite a bit beyond this, it was the seed and is still at the centre of the largest human social network ever to exist.

However, I have a few concerns about how the feature is implemented on the Zune. One is that broadcasting to four people is not enough to really enable true social networks. So limited in scope, it is a little more of a gimmick than a feature, though it can still act as social glue for smaller groups. A related issue is that WiFi is very energy-hungry. I have not seen any figures on battery life with WiFi turned on, but I suspect the device won’t last very long while it is broadcasting music, making it far less mobile. It may take fuel cells or alternative wireless technologies, sometime down the track, for this kind of musical social networking to become a broadly used application. A final issue is that, given this is Microsoft, we know there will be solid Digital Rights Management (DRM) in place. In fact the release specifically says that people will be able to share “promotional copies” of songs, which will be just a fraction of what people have available on their players. Given these factors, the question remains whether this feature will prove to be a key differentiator for the Zune in an extremely competitive marketplace, but I don’t doubt that mobile musical social networking will get massive uptake at the right time, when it’s done right.

Microsoft kickstarts the Live brand

Microsoft has just launched Windows Live Spaces, replacing MSN Spaces, its social networking and blogging site with over 100 million visitors monthly, and offering a substantial facelift and new features. While Microsoft has had a wide suite of Live offerings in beta for many months now, this is the first Live product to be launched as a working product on a large scale. Live Messenger is out of beta, but it has not replaced MSN Messenger, and is for now running in parallel with it. Given the breadth of MSN Spaces’ usage, this launch is the most powerful way Microsoft can kickstart the Live brand and start gradually moving its array of Live products into the market. Ray Ozzie, now Chief Software Architect at Microsoft, spoke to financial analysts last week about Microsoft’s vision; his remarks are well worth a read, as they provide a coherent view of how committed Microsoft is to web services in the broadest sense. Windows Live is absolutely central to Microsoft’s shift to web services, and the launch of Windows Live Spaces is just the beginning of what will be a major blitz on the Windows Live brand and product suite over the next couple of years.

Mobile traffic data will pressure local radio

Google has just released maps for use on mobiles that indicate traffic congestion in four color levels from green to red, across 30 US cities. This is one of those applications that has been obvious forever, and it has only been a question of time until it is implemented well (which is not quite yet). When navigating traffic and choosing alternate routes, people have until now been guessing which way to go, with at best a trickle of information from the radio. In fact, traffic information is one of the main reasons people listen to local radio. Once you can get far superior traffic information from other sources, you might as well go to the radio that gives you your preferred music or talk, which is unlikely to be local radio.

Next steps include not just current traffic intensity but also predicted traffic intensity. As I wrote in my book Living Networks, UK company Applied Generics has a product called RoDIN24 that anonymously monitors the movement of mobile phones relative to cell towers in order to provide extremely detailed live views not just of where traffic is slow (mobile phones moving slowly), but also of where traffic is converging. Beyond that, computers will be able to predict reasonably accurately how long different routes will take, enabling drivers to make route choices without gazing at screens too much. Of course, these predictive devices will play off against each other – if every one made the same recommendation to its drivers, that route in turn would become congested. But in the long run, in congested urban traffic, we will see travel times across the different possible routes even out, so that the journey takes a similar time whichever of the major possibilities you choose.

Resource Shelf gives an overview of other traffic data resources. The US dominates, with some services also in the UK. As with good GPS mapping, there will be a lag of several years before effective mobile traffic services reach most other developed countries. As with many of these applications, the cost of mobile data is a key driver: cheap mobile data in the US is driving these kinds of applications, while in countries where mobile data is very expensive, including Australia, it will unfortunately take a long time for mobile applications such as traffic data to take off.
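
Applied Generics hasn’t published RoDIN24’s internals, so the following is only a toy illustration of the underlying idea: if an anonymous handset hands over between two towers whose positions are known, the elapsed time yields a rough speed for that road segment. All tower positions and handover times below are invented.

```python
# Toy illustration of inferring traffic speed from anonymous cell
# handovers (the idea behind products like RoDIN24; not their method).
import math

TOWERS = {  # hypothetical tower -> (x_km, y_km) positions
    "T1": (0.0, 0.0),
    "T2": (1.2, 0.5),
    "T3": (2.5, 0.4),
}

# Hypothetical handover log: (handset_id, tower, time_seconds)
handovers = [
    ("h1", "T1", 0), ("h1", "T2", 120), ("h1", "T3", 400),
    ("h2", "T1", 30), ("h2", "T2", 140),
]

def segment_speeds(log):
    """Estimate km/h per tower-to-tower segment from consecutive handovers."""
    by_handset = {}
    for handset, tower, t in log:
        by_handset.setdefault(handset, []).append((t, tower))
    speeds = {}
    for events in by_handset.values():
        events.sort()  # order each handset's handovers by time
        for (t0, a), (t1, b) in zip(events, events[1:]):
            dist = math.dist(TOWERS[a], TOWERS[b])  # straight-line km
            kmh = dist / ((t1 - t0) / 3600)
            speeds.setdefault((a, b), []).append(kmh)
    return {seg: sum(v) / len(v) for seg, v in speeds.items()}

for (a, b), kmh in segment_speeds(handovers).items():
    print(f"{a}->{b}: ~{kmh:.0f} km/h")  # slow segments suggest congestion
```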

Being in two places at the same time

I really like this. A Japanese researcher has created a lifelike doppelganger robot of himself, which he uses to deliver lectures at a university an hour’s drive from his home, saving himself the commute. He provides the live voice of the robot and can see through its eyes, while pickups on his mouth and lips control the movement of the robot’s mouth as it speaks. Apparently the robot looks very human – certainly in the photos it is hard to pick as a machine (though the machine does look less friendly than the man). While the cost of creating this kind of robot will never get very low, as each one must be custom-made, the implications are staggering. If the robot really is good, this is a big step beyond videoconferencing, and arguably even beyond tele-immersion. Robot duplicates could be put on airplanes in lieu of people to attend distant meetings, for example. I will definitely explore the possibility of using one of these for keynote speeches I’m asked to deliver in distant lands, though I suspect it will be a good while before I have a duplicate of myself, unfortunately. Who hasn’t dreamed of being in two places at the same time?