
Archive for the ‘Semantic Web’ Category

Nodalities and Facebook’s David Recordon

This is a podcast I recorded for Talis’ Nodalities series of talks. Because Facebook has recently made announcements about moving in a Semantic Web direction, I spoke with their Senior Open Programs Manager, David Recordon, about Facebook’s perspectives on many of the technologies they’re beginning to use. We ended up discussing Social Networking as a graph—that is: a network of related things. We also spoke about the Open Graph Protocol they’ve worked on and touched on privacy and walled gardens.

As you listen to the podcast, you can have a look at the source code for my site. (Just don’t run any validators on it and complain about what a poor developer I am: I already know ;) ). In the head, you’ll notice a few lines of metadata that are discussed in the podcast:

<meta property="og:title" content="Blogging Perspective" />
<meta property="og:type" content="blog" />
<meta property="og:email" content="contact@zachbeauvais.com" />
<meta property="og:url" content="http://www.zachbeauvais.com" />
<meta property="og:description" content="Zach Beauvais’ home on the web: his perspective, images and ideas." />
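If you'd like to see what a consuming service might do with those lines, here is a minimal sketch in Python that pulls every og: property out of a page's markup. It uses only the standard library; the class name, the function name and the sample markup are my own illustrative choices, and this is not a description of how Facebook actually reads the tags.

```python
from html.parser import HTMLParser


class OpenGraphParser(HTMLParser):
    """Collect <meta property="og:..."> tags into a dictionary."""

    def __init__(self):
        super().__init__()
        self.properties = {}

    def handle_starttag(self, tag, attrs):
        # Self-closing tags such as <meta ... /> are routed here as well.
        if tag != "meta":
            return
        attributes = dict(attrs)
        name = attributes.get("property") or ""
        if name.startswith("og:") and "content" in attributes:
            self.properties[name] = attributes["content"]


def extract_open_graph(html):
    """Return a dict of og: property names to their content values."""
    parser = OpenGraphParser()
    parser.feed(html)
    return parser.properties


# Illustrative sample based on the tags quoted above.
SAMPLE_HEAD = """
<head>
<meta property="og:title" content="Blogging Perspective" />
<meta property="og:type" content="blog" />
<meta property="og:url" content="http://www.zachbeauvais.com" />
</head>
"""

if __name__ == "__main__":
    for name, value in extract_open_graph(SAMPLE_HEAD).items():
        print(name, "=", value)
```

The point is simply that a few predictable attributes turn a page into something software can describe without guessing.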

For more information, you can also read the Nodalities Magazine article I wrote about Facebook’s announcements.

The Open Graph Protocol page has information about the protocol itself. Facebook’s f8 developers’ conference site also has links with more information for developers.

Many thanks to David Recordon for having this conversation with me for Nodalities, and to my employer Talis, who has made this podcast available under a Creative Commons Attribution 3 license.

Download podcast here or play it below.

 
 

Talis: We’re Excited

This post was originally published on Nodalities Blog.

The Talis offices, for the past few weeks, have been awash with geeky excitement: that kind of near-giddy excitement that comes with eager expectation. We’ve all been waiting for something important.

For some, this was no doubt augmented with the announcement of Steve’s new iPad; but that’s not what’s gotten us all worked up.

For months, we’ve been looking forward to the launch of data.gov.uk; and last week, the wraps finally came off. The official press release put it:

A major new website has been launched to the public which gives anyone who wants to use it unprecedented and free access to government data in one place.

This doesn’t quite capture the coolness of the launch, for me. Yes, it’s a major new website, and its point is to publish information. But the exciting thing is that this information is being published as data: data that can be used, reused, remixed and enriched. Sir Tim Berners-Lee’s perspective was more exciting:

Making public data available for re-use is about increasing accountability and transparency and letting people create new, innovative ways of using it. Government data should be a public resource. By releasing it, we can unlock new ideas for delivering public services, help communities and society work better, and let talented entrepreneurs and engineers create new businesses and services.

The point is that this public resource is finally getting a home on the web, and an infrastructure to make it not just available, but useful.

The exceptional team behind data.gov.uk have striven to adhere to web standards in its production, with Linked Data as a priority, as Professor Nigel Shadbolt explained:

We are also going to increase the use of ‘Linked Data’ standards, which allows people to provide data in a way that is as flexible and easy-to-use as possible.

Back in November, Leigh Dodds wrote a post explaining how we’ve been involved, and there’s an official Talis Platform press release too. Basically, we’ve been working with the data.gov.uk team to help with the Linked Data part of the site: hosting the SPARQL endpoints and providing consultancy and training, for example.

I can confidently say that we’re very proud of data.gov.uk, the team behind it, and our involvement with it. We’re excited by the prospect of this data being used as raw material for clever people to make interesting, useful, even world-changing things with it. We’ve seen the beginnings and proof-of-concept projects already.

Now comes the really exciting stuff. What are you going to build?

Image: “Yay for happy days!” by le vent le cri via flickr (CC: By)

 

Trends and Barriers

This article first appeared in Nodalities Magazine, CC By + SA

If you follow the Nodalities blog, you may have read some of my recent posts discussing the trends boiling up around Web 3.0 (other buzzwords are available). The Mobile Web and upgraded connectivity in general; the rise of ubiquitous computing from chips in every product imaginable; Linked Data and the “Semantic Web” as an organising platform for this rising tide of data: these are three very broad trends seeing a lot of media attention at present. From where I’m standing, I tend to see the next great turning point of the Web as a convergence of some of these trends, and as a rise in the importance of, and reliance upon, data itself and data tools generally.

The mobile web is bringing new sorts of information to people, and they can make use of this info wherever they happen to be because of advances in devices and connectivity. As phones and web-enabled devices get better, so too do the chips we seem to have embedded all over the place, and we can now begin to have a clearer picture of what we do through the information we gather from our heaters, cars, and pedometers. Also, as more objects become connected, the grunt-work of number-crunching and storage is becoming commoditised into big, efficient, utility-like cloud services, which host and work with our collected information much more effectively than the gadget in your hand could ever hope to do. Others, like us here at Talis, talk about the Semantic Web, which allows for an evolution from a bunch of connected documents to the explicit connections between bits of information.

Also fermenting in this mix is a strengthening trend of political transparency and a public, shared ownership of social data. Barack Obama’s new administration has clearly made this a priority with the launch and work around data.gov; and in the UK, Sir Tim Berners-Lee himself has been appointed to a parliamentary advisory role. There is growing pressure to be able to have access to public data, and to see it as belonging to the nation’s people rather than as something that can legitimately be filed away in the great, locked bureau of the capitols.

So, picking up two fairly obvious trends here (Social, Public Data and Linked Data), it would seem to follow that people would begin to have access to previously unavailable information in usable, linked forms. And it’s certainly beginning, as articles elsewhere in this magazine have illustrated. But what about other chunks of public data? What about when data comes from universities, institutions, scientific foundations and NGOs? What about charities monitoring crime, CO2 emissions and family histories? Wouldn’t these make a useful piece in the web of social data? What resources have the governments themselves got, if they want to make their public-owned data available in a useful format?

These questions form a major part of the thinking behind Talis’ Connected Commons initiative (talis.com/cc). Basically, Talis has made its Semantic Web platform (including data hosting and access tools) available free of charge for any datasets made available to the public. In doing so, we’re hoping to remove the barrier of cost entirely to publishing interesting data in a Linked Data way. One major reason for this is to promote reuse and mashups of this interesting data, and for people to be able to “follow their noses” to the data that completes their projects. But from a publisher’s perspective this is also important, because it removes a major reason not to bother with making data useful as well as public. So, with this, data can be made public and usable, and the developers and users get the benefit of public SPARQL endpoints and API access to interesting data.

To keep the data open and public, datasets need to make use of either the Public Domain Dedication and License (PDDL) or Creative Commons’ CC0 license. Ian Davis, in his article in this magazine, explains more about waivers and the Connected Commons, and there is a lot more about this particular initiative over on the Talis site (talis.com/platform/cc/faqs/).

In a recent interview with the BBC, Sir Tim said: “This is our data. This is our taxpayers’ money which has created this data, so I would like to be able to see it, please.” I wonder if initiatives such as Connected Commons will begin to remove excuses, hindrances, and obstacles? As public awareness of the importance of access grows, this might become a political issue, as well as a pragmatic one. I hope that in the rush to publish data, and in the ensuing discussion and debate, the users, hackers and developers don’t get sidelined. I think the world is ready for its data back.

 

What we’ve been working on…

Talis, my employer, has been a big promoter of Linked Data and open access to information, because we see that new ideas often arise when existing ideas come together. Innovation, if you like, occurs at the join between ideas when they connect. I see this as fundamental to the way ideas and their applications (technology) advance. I tend to believe that anything “novel” is actually effected when other ideas are connected together.

In the technological world, this seems like a strong analogy for Linked Data: information which can be connected by a web-like network of links. These Linked Data have become the foundation for what has come to be known as the “Semantic Web”, a web of connected information which breaks out of information silos and enables the discovery of new ideas from old, and innovation from existing information. We use the phrase “serendipitous reuse” for the idea that once an idea (or a piece of data) is published, it can be used and reused in novel ways and in the context of other data to produce unexpected and unforeseeable possibilities. These ideas (data, again) become increasingly useful when published in a format which allows them to be linked freely to ANY other piece of information. We’ve had the distribution method for this network for years (the good ol’ WWW itself), and it’s been about a decade since RDF was first published by the World Wide Web Consortium (W3C) to handle the data itself. The idea is basically to give every bit of data an address (a universal address, not one relative to a particular database, like a cell reference), and to predicate that bit of information very much like language does. If you think of it like a language, RDF lets bits of data (nouns) act upon, or be acted upon by (verbs), others (other nouns). This triple format enables a near-infinite recombination (theoretically) of any data, anywhere, with an address.
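To make the noun-verb-noun idea a little more concrete, here is a rough sketch in Python. It is not real RDF tooling: the triples are plain tuples, every address under example.org is made up for illustration, and the helper names are my own. It only shows how statements built from addresses can be chained together.

```python
from collections import namedtuple

# One statement: a subject (noun), a predicate (verb) and an object (noun).
Triple = namedtuple("Triple", ["subject", "predicate", "obj"])

# Every address below is a made-up example, not a published identifier.
EX = "http://example.org/"

triples = [
    Triple(EX + "people/zach", EX + "terms/worksFor", EX + "orgs/talis"),
    Triple(EX + "orgs/talis", EX + "terms/runs", EX + "projects/connected-commons"),
    Triple(EX + "people/zach", EX + "terms/wrote", EX + "posts/trends-and-barriers"),
]


def describe(subject, store):
    """Return every (predicate, object) pair recorded for a subject."""
    return [(t.predicate, t.obj) for t in store if t.subject == subject]


def follow(start, predicates, store):
    """Follow a chain of predicates from a starting address, hop by hop."""
    current = {start}
    for predicate in predicates:
        current = {
            t.obj
            for t in store
            if t.subject in current and t.predicate == predicate
        }
    return current


if __name__ == "__main__":
    path = [EX + "terms/worksFor", EX + "terms/runs"]
    print(follow(EX + "people/zach", path, triples))
```

Because each thing has its own address, a statement published by one person can be joined to a statement published by somebody else, which is where the recombination comes from.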

So, what’s the problem? Well, most of the world’s data are locked away in silos (prisoners of the cells their databases confine them to). Many organisations may wish to make use of their data in a semantic environment, and many might even embrace openness and make their data freely available to the world to recombine and use: there are always more innovations outside an organisation than within! In order to lower barriers to entering this linked data world, Talis has built a Platform with resources to host and utilise these connections, making use of semantic web standards (RDF and SPARQL, the query language of the semantic web) and a developer-friendly environment (a RESTful API, for example).

However, this innovation is only possible when data are accessible. In order to further lower the barriers, Talis is now offering free access to the Platform to host public domain data. We are calling this initiative the Talis Connected Commons, and the offer is not limited to free hosting: the data access services, including access to a public SPARQL endpoint, are also freely available. To keep this data open, you will need to use either the Open Data Commons Public Domain Dedication and License or the recently launched Creative Commons CC0 license to publish data. Anyone will then be able to freely access the stored data using the Platform services, without API keys and without usage limits.
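For the curious, here is a small sketch of what asking a public SPARQL endpoint a question can look like from Python. The SPARQL Protocol lets a query travel as the `query` parameter of an ordinary HTTP GET request, so the sketch only builds that URL. The endpoint address is a placeholder I have made up, not a real Platform address, and the query simply asks for any ten statements.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Placeholder address for illustration only; substitute a real endpoint.
ENDPOINT = "http://example.org/stores/my-dataset/services/sparql"

# Ask for any ten subject-predicate-object statements in the store.
QUERY = """
SELECT ?subject ?predicate ?object
WHERE { ?subject ?predicate ?object }
LIMIT 10
"""


def build_query_url(endpoint, query):
    """Encode a SPARQL query as the 'query' parameter of a GET URL."""
    return endpoint + "?" + urlencode({"query": query.strip()})


if __name__ == "__main__":
    url = build_query_url(ENDPOINT, QUERY)
    print(url)
    # Decoding the URL again recovers the original query text.
    print(parse_qs(urlsplit(url).query)["query"][0])
```

Fetching that URL with any HTTP client is then all it takes, which is rather the point of an endpoint with no API keys in the way.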

There is more information available at www.talis.com/cc, where you can find detailed technical information, FAQs and other resources.

Image: “Eggistentialism 1.5 or Three of a Perfect Pair” by bitzcelt (via flickr), CC Licensed

 

Twitter metadata—metaphor?

This post featured originally in Nodalities Magazine.

Snow near us.
Image by Zach_Beauvais via Flickr

I’m sure I’m introducing old friends; but Twitter is a “microblogging” platform, to give it its proper description. It gives users 140 characters to publish status updates, comments, gripes, complaints, praises, news and whatever comes to mind. It has outgrown its original purpose of answering the simple question “What are you doing?”, and users often tweet just about everything.

One interesting innovation is the integration of the hashtag: simply a hash symbol (#) followed by a tag describing the comment. This gives people the ability to follow particular threads of updates or participate in conversations around an interest. Hashtags are often used, for example, to report the goings-on at conferences (#FOWA, for example). People give their own content this little bit of information, and a search engine can find it. People can add additional information and follow conventions which allow for distributed trends that anyone can follow and interact with.

The recent snowfall in Britain gave rise to a flurry of tweets about road closures, amounts of snow falling, schools closing down and all the other chaos unleashed. When users followed a simple convention, however, this information got organised. People quickly adopted the #uksnow hashtag to track the topic; and eventually someone worked out a way to capture all the info needed to follow these tweets geographically. By tweeting the first half of a UK postcode plus a rating of the snowfall out of ten, users let anyone following the thread know exactly where it’s snowing. It’s like an instant weather polling station, distributed across the country. It can go a step further, however: once users turn their simple status updates into a mini line of code, services can actually mash up these tweets.

This little bit of information allows people to write software to track and automate the Twitter information. This interactive map from benmarsh.co.uk, for example, actually plots a visual graph of snowfall across Britain. Bigger snowflakes indicate larger numbers out of ten in the poll. It’s simple, really. Ingenious, possibly. But the fundamental distinction between this tracking ability and the noise of thousands of Twits shouting about the snow is that little bit of metadata.
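Here is a rough sketch of how software might read that convention, written in Python. I am assuming tweets shaped like "#uksnow CV32 4/10" (hashtag, first half of the postcode, rating out of ten); the pattern and function names are my own, and this is not how the benmarsh.co.uk map is actually built.

```python
import re

# Assumed tweet shape: "#uksnow <first half of postcode> <rating>/10".
UKSNOW_PATTERN = re.compile(
    r"#uksnow\s+([A-Z]{1,2}\d[A-Z\d]?)\s+(\d{1,2})/10",
    re.IGNORECASE,
)


def parse_snow_report(tweet):
    """Return (postcode_district, rating), or None if the convention is not followed."""
    match = UKSNOW_PATTERN.search(tweet)
    if match is None:
        return None
    rating = int(match.group(2))
    if rating > 10:
        return None
    return match.group(1).upper(), rating


def average_by_district(tweets):
    """Average the ratings reported for each postcode district."""
    ratings = {}
    for tweet in tweets:
        report = parse_snow_report(tweet)
        if report is None:
            continue  # Plain chatter without the metadata is skipped.
        district, rating = report
        ratings.setdefault(district, []).append(rating)
    return {d: sum(values) / len(values) for d, values in ratings.items()}


if __name__ == "__main__":
    sample = [
        "#uksnow CV32 4/10 and still falling",
        "Roads are chaos! #uksnow cv32 6/10",
        "Snow everywhere, no school today",
    ]
    print(average_by_district(sample))
```

The third sample tweet is ignored entirely, which is the distinction in miniature: without the metadata, software has nothing to hold on to.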

So, is this use of Twitter a metaphor for the Semantic Web? It’s certainly a picture of automating information flow using metadata via software. Sounds Semanticcy to me.

 

 

Data as metaphor

"Confetti" by mr_gonzales

I have talked a lot about metaphor, both here and, perhaps sadly, to my friends and family. Metaphor and the abstract are true passions of mine, and I can’t help but see them everywhere. I suppose, it’s the nature of metaphor to be everywhere, really.

The essence of metaphor is understanding and experiencing one thing in terms of another.

Lakoff and Johnson (1980)

So, seeing (or, “experiencing”, since “seeing” is really a metaphor) one idea or concept in terms of another is a kind of abstraction. You’re essentially changing your perspective on something by bringing in another concept. Metaphor, generally, is about comparison and noting the similarity, but I suppose there can be an element of dissimilarity which makes it work. So, if I use a literary metaphor (comparing two things without the use of a comparing word like “like” or “as”) and say: “this computer is rubbish”; I’m fundamentally making a comparison between the two notions: “this computer” and “rubbish”. It is the similarity which I am stressing; and, on the surface, I am using “rubbish” as a sort of modifier of the computer.

However, there is a whole plethora of meaning in this statement, if you pull yourself back from it a bit. What’s rubbish? Rubbish is stuff we throw away; it can smell bad; it’s collected from our houses and fills holes in the ground; we don’t want rubbish; we don’t like rubbish; it’s a generic term for things we don’t like or are unhappy with. With this simple statement, I’m ever-so-casually bashing together large quantities of information and notion, and letting the meanings fall where they will. Inside this somewhere is the idea of “propositionality”, meaning that I’m letting the hearer of this statement draw their own conclusions about what I’m saying (he’s not happy with his computer, his computer may not be very good, he wouldn’t recommend it, he’s having a bad day…), some of which are intended, some not (at least, not consciously). There are also cultural considerations, in that there is a sort of social consensus that this metaphor “works” and that we must not literally interpret this statement as an intention to physically dispose of an object (which is good when you consider any time you’ve ever heard a person “understood or experienced in terms of” rubbish ;) ).

This leads me to think that there are also elements of disassociation between the two concepts, so that some of the meaning is actually in the difference between “rubbish” and “this computer”. I’m probably not going to throw it away (at least, not immediately); it’s probably something I’ve bought and have no intention of burying in the ground; I expect to be happy or satisfied with it (whereas you wouldn’t expect that of a used tea-bag). So the two concepts modify each other: they’re like points in a perspective, making it possible to glean added meaning from the situation which is greater than just the two ideas themselves.

(If you’re still reading by here, email me, and we’ll have a pint!)

I mentioned in my previous post that data are used in abstraction. What I mean by this is that a bit of data is “used” in a process when it’s a point of reference for something. This number + that number = another number: the two numbers are reference points for the sum. When I say: “I’m busy on the 2nd” it means that I’ve referred to a bit of data (a number on a calendar application, an email, or whatever) that I’ve used as a point of comparison. I’ve essentially understood the projected state of my schedule in terms of what I’ve already planned to occur. And these bits of data are more and more powerful when the perspective you gain from them is more accurate.

When we get more reference points, and more interactions, our perspective becomes more flexible. We can abstract ourselves right out, and look at a very broad picture. Google’s PageRank does this by mining link information from a very, very large dataset using very simple reference points: the number of links to an item raises its position in the rank. Conversely, we can focus right in on a single notion or dataset using as many different references as possible to understand a limited set of transactions. Amazon’s book results page is full of this kind of perspective, with user ratings, purchase histories, browsing behaviour, and mathematical algorithms to give a very full picture (and options to accomplish a task) of a single notion.
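As a toy illustration of that very simple reference point, here is a sketch in Python that orders pages by how many other pages link to them. It is emphatically not Google's actual PageRank algorithm, which also weighs where each link comes from; the link data and names are invented for the example.

```python
from collections import Counter

# Invented link data: each pair is (linking page, linked page).
LINKS = [
    ("a", "c"),
    ("b", "c"),
    ("d", "c"),
    ("a", "b"),
    ("c", "b"),
    ("d", "a"),
]


def rank_by_inbound_links(links):
    """Order pages by how many distinct pages link to them, most linked first."""
    # Duplicate pairs and self-links are ignored so each linking page counts once.
    inbound = Counter(
        target for source, target in set(links) if source != target
    )
    # Sort by descending count, then alphabetically to break ties.
    return sorted(inbound.items(), key=lambda item: (-item[1], item[0]))


if __name__ == "__main__":
    for page, count in rank_by_inbound_links(LINKS):
        print(page, count)
```

Each link is a tiny reference point on its own; it is only in aggregate that they add up to a perspective on which pages matter.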

So, I think that data are very similar to metaphor in that they are used to understand one thing (or set of things) in terms of another (or others).

The upshot of this is that we can refer to this abstract concept of “data” in terms that help us to understand both their significance and their utility to us. When I say: “I want my data to do this”, it’s not that helpful unless you understand that I’m trying to get all my reference points to produce a perspective to help me accomplish something. Which leads me to my main point: the whole point of metaphor is to help—or possibly enable—our understanding. Data should do the same. Collecting all the bits and pieces of information you incur by being a person and doing things should bring you some form of understanding leading to a benefit.


 
© 2010 Zach Beauvais
some rights reserved