Jan 28, 2012

2012: The Year of the Semantic Web


In 1996, Tim Berners-Lee, director of the World Wide Web Consortium (W3C), articulated the vision behind modern Semantic Web technology:

If the interaction between person and hypertext could be so intuitive that the machine-readable information space gave an accurate representation of the state of people's thoughts, interactions, and work patterns, then machine analysis could become a very powerful management tool, seeing patterns in our work and facilitating our working together through the typical problems which beset the management of large organizations.

Fifteen years later, the Semantic Web is used in a variety of fields, from art museum informatics to breast cancer research. Although the global implementation of the Semantic Web vision may be years from becoming a reality, many sophisticated IT departments are increasingly adopting semantic standards and migrating to semantic technology-based products to achieve the same benefits in their enterprises that the Semantic Web delivers to the Web. This technological trend will continue to penetrate industries as diverse as finance, medical devices, telecommunications, life sciences, and the intelligence community. In fact, I believe that 2012 will be the year of the Semantic Web.

Here are three use cases from 2011 that illustrate the growing impact of semantic technology in commerce and culture today, and why society is migrating toward a data-driven world.

1. Telecommunications -- The Siri Use Case

Even if you don't know anyone whose Christmas list included an iPhone 4S, it's still true that Apple sold 35 million iPhones in its first fiscal quarter, which ended in December, and it's estimated that Apple will sell 125 million more in 2012. That's a lot of people talking to themselves. I mean, to their voice assistant, Siri. She's the virtual servant or concierge who can help you find a place to eat or stay, suggest activities, and provide you with directions (hopefully better than those dictated by my GPS). All you do is speak, click, or type, and your little helper collects information from a slew of websites to assist your decision-making. It can even secure a restaurant reservation for you, or an airplane ticket. All of this is why Siri and a few other features use TWICE as much data as the previous iPhone model. The iPhone 4S even uses more data than the iPad.

Tom Gruber, co-founder, CTO, and VP of Design at Siri, is a pioneer in the world of the Semantic Web. A forerunner in using the Web to collect and share information, he is credited with defining "ontology" in its technical sense for computer science, the first to describe "ontologies" as a technology for enabling knowledge sharing. Gruber established the DARPA Knowledge Sharing Library and was among the founding innovative thinkers who laid the groundwork for what we now call the Semantic Web.

2. Enterprises -- The Best Buy Use Case

In December of 2009, Jay Myers, the Lead Development Engineer at Best Buy, published a strategic formula for business data and semantics. It consisted of three circles, the first two added together to produce the third: Externally facing linked open data + Internal linked data = Insights. He explains:


The external data sphere represents human and machine readable data that you'd want everyone to access. One of the primary vehicles gaining popularity on the web is RDFa, a way of utilizing richly annotated HTML to deliver data to machines while retaining the rich visual web human users have become accustomed to... The great thing about "front-end" semantic markup techniques is with a little additional knowledge and tools, it allows countless numbers of HTML devs to create a very rich web of data by simply adding data annotations to their HTML, essentially making the entire web an open and queryable database or API for us to extract knowledge from.
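To make the markup technique Myers describes a bit more concrete, here is a minimal sketch (my own illustration, not Best Buy's actual markup or tooling) of how RDFa-style attributes let product facts ride along inside ordinary HTML, and how a small Python script can harvest them while browsers keep rendering the page for human readers:

# A minimal sketch of the idea behind RDFa-style annotations: product facts
# ride along inside ordinary HTML attributes, where a machine can pull them
# out while a browser still renders the human-readable page. The vocabulary
# and property names below are illustrative, not Best Buy's actual markup.
from html.parser import HTMLParser

SNIPPET = """
<div vocab="http://schema.org/" typeof="Product">
  <span property="name">Blu-ray Player BD-1234</span>
  <span property="price" content="99.99">$99.99</span>
  <span property="availability">In stock</span>
</div>
"""

class RDFaLikeExtractor(HTMLParser):
    """Collect (property, value) pairs from annotated HTML."""
    def __init__(self):
        super().__init__()
        self._pending = None          # property awaiting its text content
        self.facts = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        prop = attrs.get("property")
        if prop and "content" in attrs:   # value supplied directly in the attribute
            self.facts.append((prop, attrs["content"]))
        elif prop:                        # value is the element's text content
            self._pending = prop

    def handle_data(self, data):
        if self._pending and data.strip():
            self.facts.append((self._pending, data.strip()))
            self._pending = None

extractor = RDFaLikeExtractor()
extractor.feed(SNIPPET)
print(extractor.facts)
# [('name', 'Blu-ray Player BD-1234'), ('price', '99.99'), ('availability', 'In stock')]

Real RDFa processors do considerably more (prefixes, vocabularies, typed resources), but the principle is the same: the annotations turn the page itself into queryable data.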

Was the strategy successful? Judging by Jay's interview with Doc Sheldon of SearchNewsCentral.com ("RDFa: The Inside Story From Best Buy"), I would say yes. The Best Buy Lead Development Engineer said this:

Within just a couple of months, we began to see an increase in our organic search results. Before long, it had increased by 30 percent over historical rates. We also saw an increase in our click-through rate. Yahoo did a study a while back and found that people that had rich snippets on the results pages were seeing around a 15 percent increase in CTR, which has proven to be the case for us. And of course, it makes our web site "smarter" and more open to machines, which ultimately benefits customers.


3. Museum Informatics -- The Annapolis Historic Foundation Use Case

Finally, consider the recent collaboration between a museum collection in Annapolis, Maryland and my technology firm, Orbis Technologies, Inc. The bulk of our business concentrates on delivering semantic applications to the Department of Defense and commercial clients with near-Internet-scale data challenges. However, we were also able to use our technological capabilities to enhance the world of art exhibits. In a display that ran for seven months, we worked with the Annapolis Historic Foundation to showcase the work of a variety of craftsmen in Annapolis between the years 1700 and 1810, with special focus on portrait artists, silversmiths, and cabinetmakers.

Orbis essentially created an interactive knowledge application for the exhibit that facilitated cross-referencing of information on an artist or image. For example, by clicking on the name of silversmith William Faris or cabinetmaker John Shaw, a visitor can access all the other kinds of information related to that craftsman. As in millions of other use cases, semantic technology was used to create connections between different kinds of data available on points of interest -- in this instance, artisans and objects.
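As a rough illustration of that kind of cross-referencing (a toy sketch with made-up URIs and predicates, not the Orbis application itself), the exhibit data can be modeled as an RDF graph and queried with SPARQL using the Python rdflib library:

# Toy sketch: artisans and objects as nodes in an RDF graph, with one SPARQL
# query pulling everything connected to a chosen craftsman. All URIs and
# predicates are invented for illustration. Requires: pip install rdflib
from rdflib import Graph

TURTLE = """
@prefix ex: <http://example.org/annapolis/> .

ex:WilliamFaris a ex:Silversmith ;
    ex:name "William Faris" ;
    ex:made ex:TallClock, ex:SilverTankard ;
    ex:workedIn ex:Annapolis .

ex:TallClock ex:title "Tall case clock" ; ex:dated "1790" .
ex:SilverTankard ex:title "Silver tankard" ; ex:dated "1775" .
"""

g = Graph()
g.parse(data=TURTLE, format="turtle")

QUERY = """
PREFIX ex: <http://example.org/annapolis/>
SELECT ?title ?date WHERE {
    ex:WilliamFaris ex:made ?obj .
    ?obj ex:title ?title ;
         ex:dated ?date .
}
"""
for row in g.query(QUERY):
    print(f"{row.title} ({row.date})")   # e.g. "Tall case clock (1790)"; order may vary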

What Semantic Technology Can Do

These broad project applications of semantic technology share common components, of course. Successful implementations often have well-understood process workflows that support the generation of a defined product, and they typically require unique, domain- or industry-specific vocabularies to describe structured, semi-structured, and unstructured data. These project characteristics, combined with the correct products, can create successful semantic technology-driven projects that demonstrate real value and a measurable return on investment.

In other words, in the best of situations -- where optimal project characteristics are in place -- semantic technology can address common infrastructure problems associated with massive database integration efforts and data overload (i.e., too much data and not enough actionable information or real knowledge). The core semantic technology standards (e.g., RDF) provide machine-readable means for explicitly describing relationships in a form that models human cognition, thereby creating information that facilitates the human decision process.
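For readers who have not worked with RDF, here is a minimal sketch of what "explicitly describing relationships" looks like in practice, again using the Python rdflib library; every name and URI is illustrative only:

# Each fact is a subject-predicate-object triple, so relationships that would
# be implicit in table joins become first-class, machine-readable statements.
from rdflib import Graph, Literal, Namespace
from rdflib.namespace import RDF

EX = Namespace("http://example.org/enterprise/")

g = Graph()
g.bind("ex", EX)

g.add((EX.order42, RDF.type, EX.Order))
g.add((EX.order42, EX.placedBy, EX.customer7))   # an explicit relationship, not a join
g.add((EX.order42, EX.contains, EX.productX))
g.add((EX.customer7, EX.name, Literal("A. Customer")))

print(g.serialize(format="turtle"))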

The Semantic Web allows us to invest our brain power in responsibilities and tasks that require alert human cognition -- and gives the tedious line checking and data grabbing to a machine that doesn't talk back, get grumpy, or demand coffee.

That's why 2012 will be the year of the Semantic Web.

This article was originally posted at   http://ping.fm/YblCg

Give Me a Sign: What Do Things Mean on the Semantic Web?




Coca-Cola, Toucans and Charles Sanders Peirce

The crowning achievement of the semantic Web is the simple use of URIs to identify data. Further, if the URI identifier can resolve to a representation of that data, it becomes an integral part of the HTTP access protocol of the Web while providing a unique identifier for the data. These innovations provide the basis for distributed data at global scale, all accessible via Web devices such as browsers and smartphones that are now a ubiquitous part of our daily lives.
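A small sketch shows both roles in action. Using Python's requests library and DBpedia's well-known URI for Coca-Cola (chosen only as a familiar public example), the same identifier that names the topic can be dereferenced over HTTP, with content negotiation asking for a machine-readable representation:

# Requires: pip install requests (and network access). Status codes and
# formats shown in the comments are what is typically expected, not guaranteed.
import requests

uri = "http://dbpedia.org/resource/Coca-Cola"    # identifies the topic itself

resp = requests.get(uri, headers={"Accept": "text/turtle"})
print(resp.status_code)                  # e.g. 200 after redirects are followed
print(resp.headers.get("Content-Type"))  # a machine-readable representation
print(resp.text[:300])                   # the first few triples about the topic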

Yet, despite these profound and simple innovations, the semantic Web’s designers, early practitioners and advocates have been mired in a muddled, metaphysical argument, now at least a decade old, over what these URIs mean, what they reference, and what their actual true identity is. These muddles about naming and identity, it might be argued, are due to computer scientists and programmers trying to grapple with issues more properly the domain of philosophers and linguists. But that would be unfair. For philosophers and linguists themselves have for centuries also grappled with these same conundrums [1].

As I argue in this piece, part of the muddle results from attempting to do too much with URIs while another part results from not doing enough. I am also not trying to directly enter the fray of current standards deliberations. (Despite a decade of controversy, I optimistically believe that the messy process of argument and consensus building will work itself out [2].) What I am trying to do in this piece, however, is to look to one of America’s pre-eminent philosophers and logicians, Charles Sanders Peirce (pronounced “purse”), to inform how these controversies of naming, identity and meaning may be dissected and resolved.

‘Identity Crisis’, httpRange-14, and Issue 57

The Web began as a way to hyperlink between documents, generally Web pages expressed in the HTML markup language. These initial links were called URLs (uniform resource locators), and each pointed to various kinds of electronic resources (documents) that could be accessed and retrieved on the Web. These resources could be documents written in HTML or other encodings (PDFs, other electronic formats), images, streaming media like audio or videos, and the like [3].

All was well and good until the idea of the semantic Web, which postulated that information about the real world — concepts, people and things — could also be referenced and made available for reasoning and discussion on the Web. With this idea, the scope of the Web was massively expanded from electronic resources that could be downloaded and accessed via the Web to include virtually any topic of human discourse. The rub, of course, was that ideas such as abstract concepts or people or things could not be “dereferenced” or downloaded from the Web.

One of the first things that needed to change was to define a broader concept of a URI “identifier” beyond the more limited concept of a URL “locator”, since many of these new things that could be referenced on the Web went beyond electronic resources that could be accessed and viewed [3]. But, since what the referent of the URI actually might be became uncertain — was it a concept or a Web page that could be viewed or something else? — a number of commentators began to note this uncertainty as the “identity crisis” of the Web [4]. The topic took on much fervor and metaphysical argument, such that by 2003, Sandro Hawke, a staffer of the standards-setting W3C (World Wide Web Consortium), was able to say, “This is an old issue, and people are tired of it” [5].

Yet, for many of the reasons described more fully below, the issue refused to go away. The Technical Architecture Group (TAG) of the W3C took up the issue, under a rubric that came to be known as httpRange-14 [6]. The issue was first raised in March 2002 by Tim Berners-Lee, accepted for TAG deliberations in February 2003, with a resolution then offered in June 2005 [7]. (Refer to the original resolution and other information [6] to understand the nuances of this resolution, since particular commentary on that approach is not the focus of this article.) Suffice it to say here, however, that this resolution posited an entirely new distinction of Web content into “information resources” and “non-information resources”, and also recommended the use of the HTTP 303 redirect code for when agents requesting a URI should be directed to concepts versus viewable documents.
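In practice, the recommended pattern can be observed directly. The sketch below (again using a DBpedia URI purely as a familiar example, and Python's requests library) asks for a “thing” URI without following redirects, expecting the 303 response and a Location header that points to a describing document:

# Requires: pip install requests (and network access). The 303 behavior shown
# in the comments is the pattern the TAG resolution recommends; any given
# server may of course answer differently.
import requests

thing_uri = "http://dbpedia.org/resource/Coca-Cola"

resp = requests.get(thing_uri, allow_redirects=False)
print(resp.status_code)               # 303 expected for a "non-information resource"
print(resp.headers.get("Location"))   # URI of a document describing the thing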

This “resolution” has been anything but. Not only can no one clearly distinguish these de novo classes of “information resources” [19], but the whole approach felt arbitrary and kludgy.

Meanwhile, the confusions caused by the “identity crisis” and httpRange-14 continued to perpetuate themselves. In 2006, a major workshop on “Identity, Reference and the Web” (IRW 2006) was held in conjunction with the Web’s major WWW2006 conference in Edinburgh, Scotland, on May 23, 2006 [8]. The various presentations and its summary (by Harry Halpin) are very useful for understanding these issues. What was starting to jell at this time was the understanding that identity and meaning on the Web posed new questions, ones on which philosophers, logicians and linguists needed to be consulted.

The fiat of the TAG’s 2005 resolution has failed to take hold. Over the ensuing years, various eruptions have occurred on mailing lists and within the TAG itself (now expressed as Issue 57) to revisit these questions and bring the path forward into some coherent new understanding. Though linked data has been premised on best-practice implementation of these resolutions [9], and has been a qualified success, many (myself included) would claim that the extra steps and inefficiencies required by the TAG’s httpRange-14 guidance have been hindrances, not facilitators, of the uptake of linked data (or the semantic Web).

Today, despite the efforts of some to claim the issue closed, it is not. Issue 57 and the periodic bursts from notable semantic Web advocates such as Ian Davis [10], Pat Hayes and Harry Halpin [11], Ed Summers [12], Xiaoshu Wang [13], David Booth [14] and TAG members themselves, such as Larry Masinter [15] and Jonathan Rees [16], point to continued irresolution and discontent within the advocate community. Issue 57 currently remains open. Meanwhile, I think, all of us interested in such matters can express concern that linked data, the semantic Web and interoperable structured data have seen less uptake than any of us had hoped or wanted over the past decade. As I have stated elsewhere, unclear semantics and muddled guidelines help to undercut potential use.

As each of the eruptions over these identity issues has occurred, the competing camps have often been characterized as “talking past one another”; that is, not communicating in such a way as to help resolve to consensus. While it is hardly my position to do so, I try to encapsulate below the various positions and prejudices as I see them in this decades-long debate. I also try to share my own learning that may help inform some common ground. Forgive me if I overly simplify these vexing issues by returning to what I see as some first principles . . . .


What’s in a Name?


Original Coca-Cola bottle
One legacy of the initial document Web is the perception that Web addresses have meaning. We have all heard of the multi-million dollar purchasing of domains [17] and the adjudication that may occur when domains are hijacked from their known brands or trademark owners. This legacy has tended to imbue URIs with a perceived value. It is not by accident, I believe, that many within the semantic Web and linked data communities still refer to “minting” URIs. Some believe that ownership and control over URIs may be equivalent to grabbing up valuable real estate. It is also the case that many believe the “name” given to a URI acts to name the referent to which it refers.

This perception is partially true, partially false, but moreover incomplete in all cases. We can illustrate these points with the global icon, “Coca-Cola”.

As for the naming aspects, let’s dissect what we mean when we use the label “Coca-Cola” (in a URI or otherwise). Perhaps the first thing that comes to mind is “Coca-Cola,” the beverage (which has a description on Wikipedia, among other references). Because of its ubiquity, we may also recognize the image of the Coca-Cola bottle to the left as a symbol for this same beverage. (Though, in the hilarious movie The Gods Must Be Crazy, Kalahari Bushmen, who had no prior experience of Coca-Cola, took the bottle to be magical with evil powers [18].) Yet even as a reference to the beverage, the naming aspects are a bit cloudy, since we could also use the synonyms “Coke”, “Coca-cola” (small c), “Classic Coke” and the hundreds of language variants worldwide.

On the other hand, the label “Coca-Cola” could just as easily conjure The Coca-Cola Company itself. Indeed, the company web site is the location pointed to by the URI of http://www.thecoca-colacompany.com/. But, even that URI, which points to the home Web page of the company, does not do justice to conveying an understanding or description of the company. For that, additional URIs may need to be invoked, such as the description at Wikipedia, the company’s own company description page, plus perhaps the company’s similar heritage page.

Of course, even these links and references only begin to scratch the surface of what the company Coca-Cola actually is: headquarters, manufacturing facilities, 140,000 employees, shareholders, management, legal entities, patents and Coke recipe, and the like. Whether in human languages or URIs, in any attempt to signify something via symbols or words (themselves another form of symbol), we risk ambiguity and incompleteness.

URI shorteners also undercut the idea that a URI necessarily “names” something. Using the service bitly, we can shorten the link to the Wikipedia description of the Coke beverage to http://bit.ly/xnbA6 and we can shorten the link to The Coca-Cola Company Web site to http://bit.ly/9ojUpL. I think we can fairly say that neither of these shortened links “name” their referents. The most we can say about a URI is that it points to something. With the vagaries of meaning in human languages, we might also say that URIs refer to something, denote something or identify (but not in the sense of completely define) something.

From this discussion, we can assert with respect to the use of URIs as “names” that:

  1. In all cases, URIs are pointers to a particular referent

  2. In some cases, URIs do act to “name” some things

  3. Yet, even when used as “names,” there can be ambiguity as to what exactly the referent is that is denoted by the name

  4. Resolving what such “names” mean is a matter of context and reference to further information or links, and

  5. Because URIs may act as “names”, it is appropriate to consider social conventions and contracts (e.g., trademarks, brands, legal status) in adjudicating who can own the URI.



In summary, I think we can say that URIs may act as names, but not in all or even most cases, and when used as such they are often ambiguous. Treating URIs strictly as names places far too heavy a burden on them, and is incorrect in most cases.


What is a Resource?




The “name” discussion above masks that in some cases we are talking about a readable Web document or image (such as the Wikipedia description of the Coke beverage or its image) versus the “actual” thing in the real world (the Coke beverage itself or even the company). This distinction is what led to the so-called “identity crisis”, for which Ian Davis has used a toucan as his illustrative thing [10].

Keel-billed Toucan

As I note in the conclusion, I like Davis’ approach to the identity conundrum insofar as Web architecture and linked data guidance are concerned. But here my purpose is more subtle: I want to tease apart still further the apparent distinction between an electronic description of something on the Web and the “actual” something. Like Davis, let’s use the toucan.

In our strawman case, we too use a description of the toucan (on Wikipedia) to represent our “information resource” (the accessible, downloadable electronic document). We contrast to that a URI that we mean to convey the actual physical bird (a “non-information resource” in the jumbled jargon of httpRange-14), which we will designate via the URI of http://example.com/toucan.
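One simple way to keep the two apart, sketched below with the Python rdflib library, is to give the bird and the page about the bird separate URIs and state their relationship explicitly; the FOAF properties used are real, but this particular modeling choice is mine, not a ruling on the httpRange-14 debate:

# The example.com URI comes from the text; the Wikipedia URL stands in for a
# descriptive document about the bird. Requires: pip install rdflib
from rdflib import Graph, Namespace, URIRef

FOAF = Namespace("http://xmlns.com/foaf/0.1/")

bird = URIRef("http://example.com/toucan")              # the animal itself
doc = URIRef("https://en.wikipedia.org/wiki/Toucan")    # a document about it

g = Graph()
g.bind("foaf", FOAF)
g.add((bird, FOAF.isPrimaryTopicOf, doc))   # the thing is the primary topic of the page
g.add((doc, FOAF.primaryTopic, bird))       # and the page is about the thing

print(g.serialize(format="turtle"))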

Despite the tortured (and newly conjured) distinction between “information resource” and “non-information resource”, the first blush reaction is that, sure, there is a difference between an electronic representation that can be accessed and viewed on the Web and its true, “actual” thing. Of course people can not actually be rendered and downloaded on the Web, but their bios, descriptions and portrait images can be. While in the abstract such distinctions appear true and obvious, in the specifics that get presented to experts, there is surprising disagreement as to what is actually an “information resource” vs. a “non-information resource” [19]. Moreover, as we inspect the real toucan further, even that distinction is quite ambiguous.

When we inspect what might be a definitive description of “toucan” on Wikipedia, we see that the term more broadly represents the family Ramphastidae, which contains five genera and forty different species. The picture we are showing to the right is but one of those forty species, the keel-billed toucan (Ramphastos sulfuratus). Viewing the images of the full list of toucan species shows just how divergent these various “physical birds” are from one another. Across all species, average sizes vary by more than a factor of three, with great variation in bill size, coloration and range. Further, if I assert that the picture to the right is actually that of my pet keel-billed toucan, Pretty Bird, then we can also understand that this representation is of a specific individual bird, and not the physical keel-billed toucan species as a whole.

The point of this diversion is not a lecture on toucans, but an affirmation that distinctions between “resources” occur at multiple levels and dimensions. Just as there are no self-evident criteria as to what constitutes an “information resource”, there is also no self-evident and fully defining set of criteria as to what the physical “toucan” bird is. The meaning of what we call a “toucan” bird is not embodied in its label or even its name, but in the context and accompanying referential information that place the given referent into a context that can be communicated and understood. A URI points to (“refers to”) something that causes us to conjure up an understanding of that thing, be it a general description of a toucan, a picture of a toucan, an understanding of a species of toucan, or a specific toucan bird. Our understanding or interpretation results from the context and surrounding information accompanying the reference.

In other words, a “resource” may be anything, which is just the way the W3C has defined it. There is no single dimension which, magically, like “information” and “non-information,” can cleanly and definitively place a referent into some state of absolute understanding. To assert that such magic distinctions exist is a flaw of Cartesian logic, which can only be reconciled by looking to more defensible bases in logic [20].


Peirce and the Logic of Signs


The logic behind these distinctions and nuances leads us to Charles Sanders Peirce (1839 – 1914). Peirce (pronounced “purse”) was an American logician, philosopher and polymath of the first rank. Along with Frege, he is acknowledged as the father of predicate calculus and the notation system that formed the basis of first-order logic. His symbology and approach arguably provide the logical basis for description logics and other aspects underlying the semantic Web building blocks of the RDF data model and, eventually, the OWL language. Peirce is the acknowledged founder of pragmatism, the philosophy of linking practice and theory in a process akin to the scientific method. He was also the first formulator of existential graphs, an essential basis to the whole field now known as model theory. Though often overlooked in the 20th century, Peirce has lately been enjoying a renaissance with his voluminous writings still being deciphered and published.

The core of Peirce’s world view is based in semiotics, the study and logic of signs. In his seminal writing on this, “What is in a Sign?” [21], he wrote that “every intellectual operation involves a triad of symbols” and “all reasoning is an interpretation of signs of some kind”. Peirce had a predilection for expressing his ideas in “threes” throughout his writings.

Semiotics is often split into three branches: 1) syntactics – relations among signs in formal structures; 2) semantics – relations between signs and the things to which they refer; and 3) pragmatics – relations between signs and the effects they have on the people or agents who use them.

Peirce’s logic of signs in fact is a taxonomy of sign relations, in which signs get reified and expanded via still further signs, ultimately leading to communication, understanding and an approximation of “canonical” truth. Peirce saw the scientific method as itself an example of this process.

A given sign is a representation amongst the triad of the sign itself (which Peirce called a representamen, the actual signifying item that stands in a well-defined kind of relation to the two other things), its object and its interpretant. The object is the actual thing itself. The interpretant is how the agent or the perceiver of the sign understands and interprets the sign. Depending on the context and use, a sign (or representamen) may be either an icon (a likeness), an indicator or index (a pointer or physical linkage to the object) or a symbol (understood convention that represents the object, such as a word or other meaningful signifier).

An interpretant in its barest form is a sign’s meaning, implication, or ramification. For a sign to be effective, it must represent an object in such a way that it is understood and used again. This makes the assignment and use of signs a community process of understanding and acceptance [20], as well as a truth-verifying exercise of testing and confirming accepted associations.
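For the programmers in the audience, the triad can be made concrete with a toy data structure (nothing more than an illustration of the terminology, in Python); note how the interpretant of one sign can itself serve as a further sign of the same object, which is Peirce’s semiosis:

# A toy structure only: a sign ties together a representamen (the signifying
# item), an object (the thing signified) and an interpretant (the
# understanding produced in the interpreter).
from dataclasses import dataclass

@dataclass
class Sign:
    representamen: str   # the signifying item (word, icon, index, symbol)
    obj: str             # the actual thing being signified
    interpretant: str    # the understanding produced in the interpreter

first = Sign(representamen="the word 'Yojo'",
             obj="the cat itself",
             interpretant="the concept of a cat")

# Semiosis: the prior interpretant now acts as a sign of the same object,
# producing a further, richer interpretant.
second = Sign(representamen=first.interpretant,
              obj=first.obj,
              interpretant="a richer idea: a particular house cat named Yojo")

for s in (first, second):
    print(f"{s.representamen} -> {s.obj} -> {s.interpretant}")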

John Sowa has done much to help make some of Peirce’s obscure language and terminology more accessible to lay readers [22]. He has expressed Peirce’s basic triad of sign relations as follows, based around the Yojo animist cat figure used by the character Queequeg in Herman Melville’s Moby-Dick:


The Triangle of Meaning


In this figure, object and symbol are the same as the Peirce triad; concept is the interpretant in this case. The use of the word ‘Yojo’ conjures the concept of cat.

This basic triad representation has been used in many contexts, with various replacements or terms at the nodes. Its basic form is known as the Meaning Triangle, as was popularized by Ogden and Richards in 1923 [23].

The key aspect of signs for Peirce, though, is the ongoing process of interpretation and reference to further signs, a process he called semiosis. A sign of an object leads to interpretants, which, as signs, then lead to further interpretants. In the Sowa example below, we show how meaning triangles can be linked to one another, in this case by abstracting that the triangles themselves are concepts of representation; we can abstract the ideas of both concept and symbol:


Representing an Object by a Concept


We can apply this same cascade of interpretation to the idea of the sign (or representamen), which in this case shows that a name can be related to a word symbol, which in itself is a combination of characters in a string called ‘Yojo’:


Representing Signs of Signs of Signs

According to Sowa [22]:



“What is revolutionary about Peirce’s logic is the explicit recognition of multiple universes of discourse, contexts for enclosing statements about them, and metalanguage for talking about the contexts, how they relate to one another, and how they relate to the world and all its events, states, and inhabitants.


“The advantage of Peircean semiotics is that it firmly situates language and logic within the broader study of signs of all types. The highly disciplined patterns of mathematics and logic, important as they may be for science, lie on a continuum with the looser patterns of everyday speech and with the perceptual and motor patterns, which are organized on geometrical principles that are very different from the syntactic patterns of language or logic.”


Catherine Legg [20] notes that the semiotic process is really one of community involvement and consensus. Each understanding of a sign and each subsequent interpretation helps come to a consensus of what a sign means. It is a way of building a shared understanding that aids communication and effective interpretation. In Peirce’s own writings, the process of interpretation can lead to validation and an eventual “canonical” or normative interpretation. The scientific method itself is an extreme form of the semiotic process, leading ultimately to what might be called accepted “truths”.


Peircean Semiotics of URIs




So, how do Peircean semiotics help inform us about the role and use of URIs? Does this logic help provide guidance on the “identity crisis”?

The Peircean taxonomy of signs has three levels with three possible sign roles at each level, leading to a possible 27 combinations of sign representations. However, because not all sign roles are applicable at all levels, Peirce actually postulated only ten distinct sign representations.

Common to all roles, the URI “sign” is best seen as an index: the URI is a pointer to a representation of some form, be it electronic or otherwise. This representation bears a relation to the actual thing that this referent represents, as is true for all triadic sign relationships. However, in some contexts, again in keeping with additional signs interpreting signs in other roles, the URI “sign” may also play the role of a symbolic “name” or even as a signal that the resource can be downloaded or accessed in electronic form. In other words, by virtue of the conventions that we choose to assign to our signs, we can supply additional information that augments our understanding of what the URI is, what it means, and how it is accessed.

Of course, in these regards, a URI is no different than any other sign in the Peircean world view: it must reside in a triadic relationship to its actual object and an interpretation of that object, with further understanding only coming about by the addition of further signs and interpretations.

In shortened form, this means that a URI, acting alone, can at most play the role of a pointer between an object and its referent. A URI alone, without further signs (information), can not inform us well about names or even what type of resource may be at hand. For these interpretations to be reliable, more information must be layered on, either by accepted convention of the current signs or by the addition of still further signs and their interpretations. Since the attempt to settle the nature of a URI resource by fiat, as stipulated by httpRange-14, meets the standard of neither consensus nor empirical validity, it can not by definition become “canonical”. This does not mean that httpRange-14 and its recommended practices can not help in providing more information and aiding interpretation of what the nature of a resource may be. But it does mean that httpRange-14 acting alone is insufficient to resolve ambiguity.

Moreover, what we see in the general nature of Peirce’s logic of signs is the usefulness of adding more “triads” of representation as the process to increase understanding through further interpretation. Kind of sounds like adding on more RDF triples, does it not?
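Indeed it does. As a sketch of the analogy (my own illustration, using common vocabulary choices rather than any prescribed solution), each additional triple about the toucan URI from the earlier example acts like a further sign that narrows how the referent should be interpreted:

# Requires: pip install rdflib. Every added statement is more context for an
# agent trying to decide what http://example.com/toucan denotes.
from rdflib import Graph, Literal, Namespace, URIRef
from rdflib.namespace import DCTERMS, RDF, RDFS

EX = Namespace("http://example.com/")

toucan = URIRef("http://example.com/toucan")

g = Graph()
g.add((toucan, RDF.type, EX.Bird))                         # what kind of thing it is
g.add((toucan, RDFS.label, Literal("keel-billed toucan", lang="en")))
g.add((toucan, RDFS.comment, Literal("An individual bird, not the species as a whole.")))
g.add((toucan, DCTERMS.source, URIRef("https://en.wikipedia.org/wiki/Keel-billed_toucan")))

print(g.serialize(format="turtle"))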


Global is Neither Indiscriminate Nor Unambiguous


Names, references, identity and meaning are not absolutes. They are not philosophically, and they are not in human language. To expect machine communications to hold to different standards and laws than human communications is naive. To effect machine communications our challenge is not to devise new rules, but to observe and apply the best rules and practices that human communications instruct.

There has been an unstated hope at the heart of the semantic Web enterprise that simply expressing statements in the right way (syntax) and in the right form (RDF) is sufficient to facilitate machine communications. But this hope, too, is naive and silly. Just as we do not accept all human utterances as truth, neither will we accept all machine transmissions as reliable. Some of the information will be posted in error; some will be wrong or ill-fitting to our world view; some will be malicious or intended to deceive. Spam and occasionally lousy search results on the Web tell us that Web documents are subject to these sources of unsuitability; why would the same not be true of data?

Thus, global data access via the semantic Web is not — and can never be — indiscriminate nor unambiguous. We need to understand and come to trust sources and provenance; we need interpretation and context to decide appropriateness and validity; and we need testing and validation to ensure messages as received are indeed correct. Humans need to do these things in their normal courses of interaction and communication; our machine systems will need to do the same.

These confirmations and decisions as to whether the information we receive is actionable or not will come about via still more information. Some of this information may come about via shared conventions. But most will come about because we choose to provide more context and interpretation for the core messages we hope to communicate.


A Go-Forward Approach


Nearly five years ago Hayes and Halpin put forth a proposal to add ex:refersTo and ex:describedBy to the standard RDF vocabulary as a way for authors to provide context and explanation for what constituted a specific RDF resource [11]. In various ways, many of the other individuals cited in this article have come to similar conclusions. The simple redirect suggestions of both Ian Davis [10] and Ed Summers [12] appear particularly helpful.

Over time, we will likely need further representations about resources regarding such things as source, provenance, context and other interpretations that would help remove ambiguities as to how the information provided by that resource should be consumed or used. These additional interpretations can mechanically be provided via referenced ontologies or embedded RDFa (or similar). These additional interpretations can also be aided by judicious, limited additions of new predicates to basic language specifications for RDF (such as the Hayes and Halpin suggestions).
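To show what such additions might look like in use, here is a brief sketch of the Hayes and Halpin predicates applied to the earlier toucan example; the ex: namespace is, as in their proposal, a placeholder rather than a ratified vocabulary, and the particular URIs are illustrative:

# Requires: pip install rdflib. The publisher states outright what the URI is
# intended to refer to and where a description can be found, instead of
# leaving that to be inferred from HTTP behavior alone.
from rdflib import Graph, Namespace, URIRef

EX = Namespace("http://example.org/vocab/")

toucan = URIRef("http://example.com/toucan")
article = URIRef("https://en.wikipedia.org/wiki/Keel-billed_toucan")

g = Graph()
g.bind("ex", EX)
g.add((toucan, EX.refersTo, EX.KeelBilledToucanSpecies))  # what the author intends the URI to denote
g.add((toucan, EX.describedBy, article))                  # a document that describes the referent

print(g.serialize(format="turtle"))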

In the end, of course, any frameworks that achieve consensus and become widely adopted will be simple to use, easy to understand, and straightforward to deploy. The beauty of best practices in predicates and annotations is that failures to provide them are easy to test for. Parties that wish to have their data consumed have an incentive to provide sufficient information to enable interpretation.

There is absolutely no reason that these additions can not co-exist with the current httpRange-14 approach. By adding a few other options and making clear the optional use of httpRange-14, we would be very Peirce-like in our go-forward approach: we would be pragmatic while adding more means to improve our interpretations of what a Web resource is and is meant to be.




[1] Throughout intellectual history, a number of prominent philosophers and logicians have attempted to describe naming, identity and reference of objects and entities. Here are a few that you may likely encounter in various discussions of these topics in reference to the semantic Web; many are noted philosophers of language:

  • Aristotle (384 BC – 322 BC) – founder of formal logic; formulator and proponent of categorization; believed in the innate “universals” of various things in the natural world

  • Rudolf Carnap (1891 – 1970) – proposed a logical syntax that provided a system of concepts, a language, to enable logical analysis via exact formulas; a basis for natural language processing; rejected the idea and use of metaphysics

  • René Descartes (1596 – 1650) – posited a boundary between mind and the world; the meaning of a sign is the intension of its producer, and is private and incorrigible

  • Friedrich Ludwig Gottlob Frege (1848 – 1925) – one of the formulators of first-order logic, though his syntax was not adopted; advocated shared senses, which can be objective and sharable

  • Kurt Gödel (1906 – 1978) – his two incompleteness theorems are some of the most important logic contributions of all time; they establish inherent limitations of all but the most trivial axiomatic systems capable of doing arithmetic, as well as for computer programs

  • David Hume (1711 – 1776) – embraced natural empiricism, but kept the Descartes concept of an “idea”

  • Immanuel Kant (1724 – 1804) – one of the major philosophers in history, argued that experience is purely subjective without first being processed by pure reason; a major influence on Peirce

  • Saul Kripke (1940 – ) – proposed the causal theory of reference and what proper names mean via a “baptism” by the namer

  • Gottfried Wilhelm Leibniz (1646 – 1716) – the classic definition of identity is Leibniz’s Law, which states that if two objects have all of their properties in common, they are identical and so only one object

  • Richard Montague (1930 – 1971) – wrote much on logic and set theory; student of Tarski; pioneered a logical approach to natural language semantics; associated with model theory, model-theoretic semantics

  • Charles Sanders Peirce (1839 – 1914) – see main text

  • Willard Van Orman Quine (1908 – 2000) – noted analytical philosopher, advocated the “radical indeterminacy of translation” (we can never really know)

  • Bertrand Russell (1872 – 1970) – proposed the direct theory of reference and what it means to “ground in references”; adopted many Peirce arguments without attribution

  • Ferdinand de Saussure (1857 – 1913) – also proposed an alternative view to Peirce of semiotics, one grounded in sociology and linguistics

  • John Rogers Searle (1932 – ) – argues that consciousness is a real physical process in the brain and is subjective; has argued against strong AI (artificial intelligence)

  • Alfred Tarski (1901 – 1983) – analytic philosopher focused on definitions of models and truth; great admirer of Peirce; associated with model theory, model-theoretic semantics

  • Ludwig Josef Johann Wittgenstein (1889 – 1951) – disavowed his earlier work, arguing that philosophy needed to be grounded in ordinary language, recognizing that the meaning of words is dependent on context, usage, and grammar



Also, Umberto Eco has been a noted proponent and popularizer of semiotics.



[2] As any practitioner ultimately notes, standards development is a messy, lengthy and trying process. Not all individuals can handle the messiness and polemics involved. Personally, I prefer to try to write cogent articles on specific issues of interest, and then leave it to others to slug it out in the back rooms of standards making. Where the process works well, standards get created that are accepted and adopted. Where the process does not work well, the standards are not embraced as exhibited by real-world use.


[3] Tim Berners-Lee, 2007. What Do HTTP URIs Identify?


This article does not discuss the other sub-category of URIs, URNs (for names). URNs may refer to any standard naming scheme (such as ISBNs for books) and have no direct bearing on any network access protocol, as URLs and URIs do when they are referenceable. Further, URNs are little used in practice.


[4] Kendall Clark was one of the first to question “resource” and other identity ambiguities, noting the tautology between URI and resource as “anything that has identity.” See Kendall Clark, 2002. “Identity Crisis,” in XML.com, Sept 11, 2002; see http://www.xml.com/pub/a/2002/09/11/deviant.html. From the topic map community, one notable contribution was from Steve Pepper and Sylvia Schwab, 2003. “Curing the Web’s Identity Crisis,” found at http://www.ontopia.net/topicmaps/materials/identitycrisis.html.


[5] Sandro Hawke, 2003. Disambiguating RDF Identifiers. W3C, January 2003. See http://www.w3.org/2002/12/rdf-identifiers/.


[6] The issue was framed as what is the proper “range” for HTTP referrals and was also the 14th major TAG issue recorded, hence the name. See further the httpRange-14 Webography.


[7] See W3C, “httpRange-14: What is the range of the HTTP dereference function?”; see http://www.w3.org/2001/tag/issues.html#httpRange-14.



[9] Leo Sauermann and Richard Cyganiak, eds., 2008. Cool URIs for the Semantic Web, W3C Interest Group Note, December 3, 2008. See http://www.w3.org/TR/cooluris/.


[10] Ian Davis, 2010. Is 303 Really Necessary? Blog post, November 2010, accessed 20 January 2012. (See http://blog.iandavis.com/2010/11/04/is-303-really-necessary/.) A considerable thread resulted from this post; see http://markmail.org/thread/mkoc5kxll6bbjbxk.


[11] See first Harry Halpin, 2006. “Identity, Reference and Meaning on the Web,” presented at WWW 2006, May 23, 2006. See http://www.ibiblio.org/hhalpin/irw2006/hhalpin.pdf. This was then followed up with greater elaboration by Patrick J. Hayes and Harry Halpin, 2007. “In Defense of Ambiguity,” http://www.ibiblio.org/hhalpin/homepage/publications/indefenseofambiguity.html.


[12] Ed Summers, 2010. Linking Things and Common Sense, blog post of July 7, 2010. See http://inkdroid.org/journal/2010/07/07/linking-things-and-common-sense/.


[13] Xiaoshu Wang, 2007. URI Identity and Web Architecture Revisited, Word document posted on posterous.com, November 2007. (Former Web documents have been removed.)


[14] David Booth, 2006. “URIs and the Myth of Resource Identity,” see http://dbooth.org/2006/identity/.


[15] See Larry Masinter, 2012. “The ‘tdb’ and ‘duri’ URI Schemes, Based on Dated URIs,” 10th version, IETF Network Working Group Internet-Draft, January 12, 2012. See http://tools.ietf.org/html/draft-masinter-dated-uri-10.


[16] Jonathan Rees has been the scribe and author for many of the background documents related to Issue 57. A recent mailing list entry provides pointers to four relevant documents in this entire discussion. See Jonathan A Rees, 2012.

Most people assume that language evolved as a way to exchange information, but linguists and other students of communication have long argued over why language evolved at all. Prominent linguists, among them MIT's Noam Chomsky, have contended that language is actually badly designed for communication, and that it is only a byproduct of a system that may have evolved for other reasons, perhaps for structuring our own private thoughts.

As evidence for their theory, these linguists highlight the fact that language is ambiguous. They claim that in a scheme optimized for passing information between a speaker and a listener, each word would have only one meaning, avoiding any risk of confusion or misunderstanding. In a study published in the journal Cognition, a team of MIT cognitive scientists has now upturned the linguists' hypothesis with a new theory, which argues that ambiguity in fact makes language more efficient, as it permits the reuse of short, efficient sounds that listeners can easily disambiguate depending on the context.



Ted Gibson, an MIT professor of cognitive science and senior author of the study, says:


"Various people have said that ambiguity is a problem for communication. But once we understand that context disambiguates, then ambiguity is not a problem - it's something you can take advantage of, because you can reuse easy [words] in different contexts over and over again."


The word "mean," for instance, is a rather ironic example of ambiguity. It can stand for indicating or signifying something, yet it can also refer to an intention or purpose, as in "I meant to go to the store." It can describe something or someone offensive or nasty, as well as the mathematical average, and adding an 's' at the end of the word makes it even more versatile: "a means to an end" refers to an instrument or method, while "to live within one's means" refers to financial management.

Given all these different definitions, no one who has mastered the English language gets confused when hearing the word "mean." The reason is that the different senses of the word occur in very different contexts, which enables listeners to interpret its meaning almost automatically.

The researchers believe that this disambiguating power of context is most probably what allows languages to reuse their simplest, easiest-to-process words for multiple meanings.

Based on previous studies and observations, they predicted that words with fewer syllables, higher frequency and the simplest pronunciations should have the most meanings.


To examine their theory, the researchers conducted corpus studies in Dutch, English and German. A corpus study is the study of language based on "real life" language examples that are stored in corpora (or corpuses), i.e. computerized databases created for linguistic research.

When they compared these properties of words to their numbers of meanings, their theory was confirmed: shorter words that occur more frequently and conform to the language's typical sound patterns tend to have more meanings, and the trends were statistically significant in all three languages.
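As a toy illustration of the kind of association the study reports (invented numbers, not the researchers' data or method), one can correlate word frequency and length against the number of senses:

# Toy data only; the point is the direction of the relationship, not the values.
# Requires Python 3.10+ for statistics.correlation.
from statistics import correlation

# word: (frequency per million, syllables, number of dictionary senses) -- made up
words = {
    "mean": (300, 1, 6),
    "set":  (450, 1, 8),
    "run":  (400, 1, 7),
    "bicycle": (40, 3, 2),
    "photosynthesis": (2, 5, 1),
}

freq = [v[0] for v in words.values()]
syllables = [v[1] for v in words.values()]
senses = [v[2] for v in words.values()]

print("frequency vs. senses:", round(correlation(freq, senses), 2))      # positive
print("syllables vs. senses:", round(correlation(syllables, senses), 2)) # negative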

In order to comprehend why ambiguity makes a language more, rather than less, efficient, one has to examine the competing desires of a speaker and a listener. Whereas a speaker wants to convey as much as possible to a listener with as few words as possible, the listener aims to gain a complete and specific understanding of what the speaker is trying to convey. However, as the researchers point out, it is "cognitively cheaper" for the listener to infer certain things from the context of the conversation than for the speaker to spend more time on longer and more elaborate descriptions.

The result is a system that leans toward ambiguity by reusing the "easiest" words. Lead author Steven Piantadosi states that once the context is taken into account, it becomes clear that "ambiguity is actually something you would want in the communication system."



Implications for computer science



According to the researchers, the statistical nature of their paper reflects a trend in the field of linguistics toward heavier reliance on information theory and quantitative methods.



Gibson states that, "The influence of computer science in linguistics right now is very high," and adds that natural language processing (NLP) is a major objective of those who operate at the intersection of the two fields. 



Piantadosi highlights that ambiguity in natural language presents enormous challenges for NLP developers, saying:


"Ambiguity is only good for us [as humans] because we have these really sophisticated cognitive mechanisms for disambiguating. It's really difficult to work out the details of what those are, or even some sort of approximation that you could get a computer to use."


However, as Gibson points out, computer scientists have long been aware of this problem, and even though the new study offers a better theoretical and evolutionary explanation for why ambiguity exists, the practical constraint remains. "Basically, if you have any human language in your input or output, you are stuck with needing context to disambiguate," he says.


Written by Petra Rattue
Copyright: Medical News Today
Not to be reproduced without permission of Medical News Today

This article was originally posted at http://ping.fm/sKnTC



Jan 25, 2012

Collaborative Intelligence & the EHR


A provider's essential critical knowledge is often so obscured that the EHR becomes more of an obstacle than a useful source of clinical information.

The meaningful use-compliant electronic health record (EHR) has quickly become very adept at capturing and sharing standardized, structured clinical content that can be communicated, stored, and to some extent consumed by other systems. Unfortunately, this strength is also the EHR's greatest limitation. Amid the structured templates and required fields of the EHR, the essential critical knowledge a provider needs to know is often so obscured that the EHR becomes more of an obstacle or annoyance than a truly useful source of clinical information.

No Place for Clinician's Thought-Process?
The critical clinical insights that providers most need from an EHR are simply not available to allow for informed decision-making. The required fields may all be populated, but the patient's story remains frustratingly incomplete.

The reason for this is simple: by its very nature, the EHR paradigm of capturing clinical information by way of mouse-and-keyboard input into structured forms limits the expressiveness of content. Because there is no place for non-standard information or for the clinician's thought process in reaching certain diagnoses in the templates, we not only miss out on the details of a patient's clinical history, but also on the critical information that reflects the way doctors think.

Documentation of the rationale for conclusions, relevant temporal and sequential facts, causal information, etc. is either lost or obscured beyond efficient retrieval. Some EHRs have incorporated options to allow providers to capture unstructured narrative information, but the resulting text usually has limited utility since it remains unstructured data buried inside various notes fields.

This dilemma is significant. It will take more than incremental feature improvements to realize the promise of the EHR: to support everything from disease management to clinical decision support to major operational efficiencies. To deliver on the expectations for eHealth, we need the EHR not only to capture and effectively use structured data, but also to capture the full patient story and support clinical collaboration based on that story.

What is needed is collaborative intelligence, a solution that enables and supplements the kind of complete and focused clinical picture physicians convey via face-to-face collaboration. Providing such intelligence requires an understanding of clinical workflows, and an ecosystem of people, process and technology to provide the clinical insights that permit clinicians to zoom in on the most critical information quickly and effectively.

All of the pieces required for such collaborative intelligence are in place today: recognition and understanding of spoken content; semantic coding and analysis to drive actions; and learning algorithms that continuously improve the performance of automated systems based on human feedback. Four key technologies provide the backbone:

Speech Understanding: Speech is the most natural way for humans to convey complex information, and it is the preferred mode of clinical documentation for most physicians today. Speech-based documentation is fast and interferes least with the provider-patient interaction. Converting speech into structured clinical notes using computers reduces the costs and time lag associated with human transcription.

The availability of next-generation speech understanding technology now provides significantly higher accuracies across medical disciplines and documentation types than what has previously been available through speech recognition systems. Integration with various clinical systems further optimizes the efficiency of the technology.

Natural Language Understanding (NLU): Sophisticated technology to "read" and understand unstructured clinical narrative is a critical ingredient for collaborative intelligence. We can now produce meaningful structured information from narrative content, merging the benefits of dictation and structured documentation.

Irrespective of whether clinical narrative is captured through dictation or directly in textual form, the synergistic combination of speech and natural language processing (NLP) technologies now yields highly accurate, context-aware clinical content that is codified to standardized medical ontologies such as SNOMED-CT. This in turn drives actionable information and together with structured EHR data enables clinical decision support and improves the quality of care.
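As a deliberately simplified sketch of that codification step (a toy dictionary lookup, not an actual NLU pipeline, and with codes that should be verified against the real terminology), the idea is to turn phrases in a dictated narrative into standardized concept identifiers:

# Illustrative only: a real NLU system uses statistical models, negation and
# context handling, not string matching, and codes must come from the actual
# SNOMED-CT release.
TERM_TO_CODE = {
    "shortness of breath": "SNOMED:267036007",   # illustrative code
    "hypertension": "SNOMED:38341003",           # illustrative code
    "chest pain": "SNOMED:29857009",             # illustrative code
}

def codify(narrative: str):
    """Return (phrase, code) pairs for known concepts found in the text."""
    text = narrative.lower()
    return [(term, code) for term, code in TERM_TO_CODE.items() if term in text]

note = "Patient reports chest pain and shortness of breath; history of hypertension."
print(codify(note))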



Semantic Clinical Reasoning: Once meaningfully structured narrative information is available, it must be made accessible in workflow-friendly, flexible modes. Newly available tools allow physicians to gain access and insights into clinical data that were impossible to get a few years ago. Also, these tools make physicians more productive because they are capable of abstracting and summarizing the relevant clinical information for each provider. They can reason across millions of documents or drill down on the relevant information about one patient in a given context.

Information mined from narrative content can be combined with structured data from EHRs to obtain holistic insights into the patient's story. From retrospective analyses to real-time feedback for physicians at the time of documentation that enables more timely clinical documentation improvement (CDI) to the ability to share clinical insights among caregivers in a collaborative system, the fruits of this reasoning are game-changing.

Machine Learning: To realize the full scope of its benefits, a collaborative intelligence system must be both highly scalable and responsive to the incessant changes in medical knowledge. The only way to achieve these objectives is through "machine learning" - intelligent systems that improve their predictions as they process more information.

Many NLP systems lack a robust capability to do this or rely on hand-crafted rules for knowledge updates, an inherently non-scalable approach. Learning from human feedback is crucial as it provides a constant opportunity to adapt to the changing environment as well as to improve the results and insights gained from collaborative intelligence.
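A minimal sketch of what learning from human feedback can look like uses scikit-learn's incremental partial_fit interface as one concrete mechanism (an illustration, not any vendor's actual system); human-corrected examples are folded into the model batch by batch, without retraining from scratch:

# Requires: pip install scikit-learn. Toy data and labels throughout.
from sklearn.feature_extraction.text import HashingVectorizer
from sklearn.linear_model import SGDClassifier

vectorizer = HashingVectorizer(n_features=2**16)   # stateless, so safe for streaming text
model = SGDClassifier()
CLASSES = ["cardiology", "pulmonology"]

def learn_from_feedback(snippets, human_labels):
    """Fold a batch of human-corrected examples into the model incrementally."""
    X = vectorizer.transform(snippets)
    model.partial_fit(X, human_labels, classes=CLASSES)

# First batch reviewed and labeled by clinicians (toy examples).
learn_from_feedback(
    ["chest pain radiating to left arm", "wheezing and shortness of breath"],
    ["cardiology", "pulmonology"],
)

# Later feedback keeps adjusting the model without a full retrain.
learn_from_feedback(["elevated troponin noted"], ["cardiology"])

print(model.predict(vectorizer.transform(["patient reports chest pain"])))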

Taken together and combined in the right manner, these technologies and workflows offer the best path to fulfill the goals of eHealth. The EHR remains an essential tool for advancing the quality and efficiency of care, but all stakeholders in healthcare have to remember that it is far from a panacea. To reach the goals of complete, accurate and seamlessly interoperable clinical information, we need to take into account that the most complete, accurate and interoperable way of communicating clinical information is via the spoken word. It also happens to be the most efficient way of capturing such information.

Juergen Fritsch is the chief scientist of MedQuist. He was previously chief scientist and co-founder of M*Modal, and before that, he was one of the founders of Interactive Services (ISI), where he served as principal research scientist.

This article was originally posted at http://health-information.advanceweb.com/Features/Articles/Collaborative-Intelligence-the-EHR.aspx



Collaborative Intelligence & the EHR

A provider's essential critical knowledge is often so obscured that the EHR becomes more of an obstacle than a useful source of clinical information.

The meaningful use-compliant electronic health record (EHR) has quickly become very adept at capturing and sharing standardized, structured clinical content that can be communicated, stored, and to some extent consumed by other systems. Unfortunately, this strength is also the EHR's greatest limitation. Amid the structured templates and required fields of the EHR, the essential critical knowledge a provider needs to know is often so obscured that the EHR becomes more of an obstacle or annoyance than a truly useful source of clinical information.

No Place for Clinician's Thought-Process?
The critical clinical insights that providers most need from an EHR are simply not available to allow for informed decision-making. The required fields may all be populated, but the patient's story remains frustratingly incomplete.

The reason for this is simple: by its very nature, the EHR paradigm of capturing clinical information by way of mouse-and-keyboard input into structured forms limits the expressiveness of content. Because there is no place for non-standard information or for the clinician's thought process in reaching certain diagnoses in the templates, we not only miss out on the details of a patient's clinical history, but also on the critical information that reflects the way doctors think.

Documentation of the rationale for conclusions, relevant temporal and sequential facts, causal information, etc. is either lost or obscured beyond efficient retrieval. Some EHRs have incorporated options to allow providers to capture unstructured narrative information, but the resulting text usually has limited utility since it remains unstructured data buried inside various notes fields.

This dilemma is significant. It will take more than incremental feature improvements to realize the promise of the EHR: to support everything from disease management to clinical decision support to major operational efficiencies. To deliver on the expectations for eHealth, we need the EHR not only to capture and effectively use structured data, but also to capture the full patient story and support clinical collaboration based on that story.

What is needed is collaborative intelligence, a solution that enables and supplements the kind of complete and focused clinical picture physicians convey via face-to-face collaboration. Providing such intelligence requires an understanding of clinical workflows, and an ecosystem of people, process and technology to provide the clinical insights that permit clinicians to zoom in on the most critical information quickly and effectively.

All of the pieces required for such collaborative intelligence are in place today: recognition and understanding of spoken content, semantic coding and analysis to drive actions, and learning algorithms that continuously improve the performance of automated systems based on human feedback. Four key technologies provide the backbone:

Speech Understanding: Speech is the most natural way for humans to convey complex information, and it is the preferred mode of clinical documentation for most physicians today. Speech-based documentation is fast and interferes least with the provider-patient interaction. Converting speech into structured clinical notes by computer reduces the cost and time lag associated with human transcription.

Next-generation speech understanding technology now delivers significantly higher accuracy across medical disciplines and documentation types than earlier speech recognition systems. Integration with various clinical systems further optimizes the efficiency of the technology.
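To make the "structured notes" idea concrete (only as a toy sketch under assumed inputs, not a description of any vendor's engine), the Python snippet below assumes the recognition step has already produced a flat transcript and shows how a short list of known section headings could be used to organize it. The dictation text, the headings and the to_structured_note helper are all hypothetical.

import re

# Hypothetical output of a speech recognition step; in practice this text
# would come from an acoustic and language model, not a hard-coded string.
dictated_text = (
    "history of present illness the patient is a 62 year old male with chest pain "
    "assessment and plan likely stable angina start aspirin follow up in two weeks"
)

# Illustrative section headings; real clinical document models are far richer.
SECTION_HEADINGS = ["history of present illness", "assessment and plan"]

def to_structured_note(text, headings):
    """Split a flat dictation transcript into named sections."""
    pattern = "|".join(re.escape(h) for h in headings)
    parts = re.split(f"({pattern})", text)
    note, current = {}, None
    for part in (p.strip() for p in parts if p.strip()):
        if part in headings:
            current = part
            note[current] = ""
        elif current is not None:
            note[current] = (note[current] + " " + part).strip()
    return note

print(to_structured_note(dictated_text, SECTION_HEADINGS))
# {'history of present illness': 'the patient is a 62 year old male with chest pain',
#  'assessment and plan': 'likely stable angina start aspirin follow up in two weeks'}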

Natural Language Understanding (NLU): Sophisticated technology to "read" and understand unstructured clinical narrative is a critical ingredient for collaborative intelligence. We can now produce meaningful structured information from narrative content, merging the benefits of dictation and structured documentation.

Irrespective of whether clinical narrative is captured through dictation or directly in textual form, the synergistic combination of speech and natural language processing (NLP) technologies now yields highly accurate, context-aware clinical content that is codified to standardized medical ontologies such as SNOMED-CT. This in turn drives actionable information and, together with structured EHR data, enables clinical decision support and improves the quality of care.
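As a rough sketch of what "codified to standardized medical ontologies" can look like, the snippet below matches phrases in a note against a tiny hand-made lexicon of SNOMED CT-style concept identifiers. The lexicon, the codes and the codify function are illustrative only; production systems work from licensed terminology releases and full NLP pipelines.

# Toy lexicon standing in for a real NLP engine plus a licensed SNOMED CT
# release; both the phrases and the concept codes are illustrative only.
SNOMED_LEXICON = {
    "hypertension": "38341003",
    "diabetes mellitus": "73211009",
    "myocardial infarction": "22298006",
}

def codify(narrative):
    """Return (phrase, code) pairs for lexicon terms found in free text."""
    text = narrative.lower()
    return [(phrase, code) for phrase, code in SNOMED_LEXICON.items()
            if phrase in text]

note = ("Patient with long-standing hypertension and diabetes mellitus, "
        "no prior myocardial infarction.")
print(codify(note))
# [('hypertension', '38341003'), ('diabetes mellitus', '73211009'),
#  ('myocardial infarction', '22298006')]

Note that this naive matcher happily codes the negated finding "no prior myocardial infarction"; handling negation, uncertainty and temporal context is exactly what the context-aware NLU described above has to add.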



Semantic Clinical Reasoning: Once meaningfully structured narrative information is available, it must be made accessible in workflow-friendly, flexible modes. Newly available tools give physicians access to, and insights from, clinical data in ways that were impossible a few years ago. These tools also make physicians more productive because they can abstract and summarize the relevant clinical information for each provider. They can reason across millions of documents or drill down on the relevant information about one patient in a given context.

Information mined from narrative content can be combined with structured data from EHRs to obtain holistic insights into the patient's story. The fruits of this reasoning are game-changing: retrospective analyses, real-time feedback for physicians at the time of documentation that enables more timely clinical documentation improvement (CDI), and the ability to share clinical insights among caregivers in a collaborative system.
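One way to picture this kind of reasoning is as queries over a shared graph of facts, some mined from narrative and some taken from structured EHR fields. The sketch below uses the open-source rdflib library; the ex: namespace, the hasFinding and hasLabResult properties and the sample triples are assumptions made up for illustration, not any product's schema.

from rdflib import Graph, Literal, Namespace
from rdflib.namespace import RDF

# Hypothetical namespaces and facts; a production system would hold millions
# of triples produced by the NLU pipeline and by the EHR itself.
EX = Namespace("http://example.org/clinical/")
SCT = Namespace("http://snomed.info/id/")

g = Graph()
patient = EX.patient42
g.add((patient, RDF.type, EX.Patient))
g.add((patient, EX.hasFinding, SCT["38341003"]))          # mined from dictated narrative
g.add((patient, EX.hasLabResult, Literal("HbA1c 8.1%")))  # taken from structured EHR data

# A single query spans both sources, which is the point of combining them.
results = g.query("""
    PREFIX ex: <http://example.org/clinical/>
    SELECT ?finding ?lab WHERE {
        ?p a ex:Patient ;
           ex:hasFinding ?finding ;
           ex:hasLabResult ?lab .
    }
""")
for finding, lab in results:
    print(finding, lab)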

Machine Learning: To realize the full scope of its benefits, a collaborative intelligence system must be both highly scalable and responsive to the incessant changes in medical knowledge. The only way to achieve these objectives is through "machine learning" - intelligent systems that improve their predictions as they process more information.

Many NLP systems lack a robust capability to do this or rely on hand-crafted rules for knowledge updates, an inherently non-scalable approach. Learning from human feedback is crucial as it provides a constant opportunity to adapt to the changing environment as well as to improve the results and insights gained from collaborative intelligence.
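A minimal sketch of that feedback loop, using scikit-learn's incremental partial_fit API: each batch of human-reviewed examples nudges the model without retraining from scratch. The classification task, sentences and labels are invented for illustration and do not describe any particular vendor's system.

from sklearn.feature_extraction.text import HashingVectorizer
from sklearn.linear_model import SGDClassifier

# Hypothetical task: label a transcribed sentence as a "finding" or a "plan"
# statement, and keep improving as human editors review the output.
vectorizer = HashingVectorizer(n_features=2**16)  # stateless, so it suits streaming data
classifier = SGDClassifier()
classes = ["finding", "plan"]

def learn_from_feedback(sentences, human_labels):
    """Fold a batch of human-reviewed examples into the model incrementally."""
    X = vectorizer.transform(sentences)
    classifier.partial_fit(X, human_labels, classes=classes)

# First batch of reviewed output...
learn_from_feedback(
    ["blood pressure remains elevated", "start lisinopril 10 mg daily"],
    ["finding", "plan"],
)
# ...and a later correction batch; no retraining from scratch is needed.
learn_from_feedback(
    ["no evidence of pneumonia on chest x-ray"],
    ["finding"],
)
print(classifier.predict(vectorizer.transform(["increase metformin dose"])))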

Taken together and combined in the right manner, these technologies and workflows offer the best path to fulfill the goals of eHealth. The EHR remains an essential tool for advancing the quality and efficiency of care, but all stakeholders in healthcare have to remember that it is far from a panacea. To reach the goals of complete, accurate and seamlessly interoperable clinical information, we need to take into account that the most complete, accurate and interoperable way of communicating clinical information is via the spoken word. It also happens to be the most efficient way of capturing such information.

Juergen Fritsch is the chief scientist of MedQuist. He was previously chief scientist and co-founder of M*Modal, and before that, he was one of the founders of Interactive Services (ISI), where he served as principal research scientist.

This article was originally posted at http://health-information.advanceweb.com/Features/Articles/Collaborative-Intelligence-the-EHR.aspx

 


Jan 24, 2012

10 health IT wishes for 2012

It’s easy to make predictions about health IT for the year to come, but what if someone asked what your IT wishes were for 2012? What would you like to see happen most in the health IT space?

We asked Wendy Whittington, MD, a practicing pediatrician and chief medical officer of Anthelio Healthcare Solutions, to list her top 10 IT wishes for 2012. From interoperability to telehealth, Whittington outlined what she, and most of her peers, would hope to see come true during the upcoming year.

1. A greater emphasis placed on the federal health IT strategic plan. According to Whittington, healthcare professionals and government officials alike should be paying closer attention to the federal health IT strategic plan, and she suggests a revision of sorts could be helpful. “I would like to see that become a working document that we’re constantly referring to,” she said. “One of our biggest problems is a document comes out and it’s good, but what’s happening in healthcare is changing – a document needs to constantly be tweaked.”

2. The emergence of more affordable solutions for healthcare systems and hospitals to attain meaningful use. Many hospitals and systems have been scrambling to find a fast route to an EHR, said Whittington, to gain access to those meaningful use dollars. “But what ends up happening is they think to get there, [they need to] buy the biggest and the best,” she said. “The total cost of ownership far exceeds the return they’ll get back. I’d like to see a lot of the lesser-known providers of EHRs getting more attention.” Whittington also added that alternatives to EHRs, like open source, could be just as successful for a 100-bed hospital, for example. “I’d put the money into optimizing the less-expensive option,” she said.

3. Real interoperability and not just “lip service” interoperability of our health IT systems. Whittington referenced vendors who promise true interoperability, yet, months after implementing the technology, hospitals are still left with communication issues. “[Hospitals] will ask, ‘Will this communicate with doctors in the outpatient clinic?’ and the answer is ‘yes,’” she said. “But years after hearing that answer, you still have the same problem. So interoperability is important, but there’s no progress and, in fact, no financial incentives for vendors to play nice.” And financial incentives, in theory, wouldn’t end with vendors and interoperability – Whittington suggests the same goes for communication among hospitals. “Both technology and health communication,” she said. “Less financial disincentive to communicate and more real interoperability.”

4. A better health IT “roadmap.” Ultimately, Whittington would like to see a healthcare system that’s “patient-centered, evidence-based, efficient, equitable and prevention-oriented,” she said. The health IT strategic plan, she said, has vision but isn’t a “cookbook.” “In medicine, we resist cookbooks,” she said. “It’s taken a long time for physicians to assess protocols and evidence-based medicine order sets, so it’s in our nature to not be told how to do things.” However, with everyone left to his or her own devices, it’s easy for chaos to ensue, so Whittington suggests a more standardized way of implementing required technology.

5. The optimization of EHRs. Installing them is just the beginning, said Whittington. “We end up doing what we need to do to get by … slap in that EHR and meet those standards, when really, there’s so much more work that needs to be done.” She said not to forget to optimize your EHR, and when it comes to doing so in hospitals, she suggests doing away with commonly held “silos” and working holistically. “[We need to] work more holistically to optimize clinical documentation and ICD-10, and optimize EHRs around those same principles,” she said. “Work as one big team rather than little, individual ones.”

6. Less whining about going to ICD-10 and smarter planning about how to get there. Whittington said her point with this wish is simple. “It’s like, ‘Come on guys, we’ve known for a long time that we’re the last country in the world [to transition to ICD-10] and we need to go there,’” she said. “For a while ... the argument from the AMA was, ‘We’re too busy and we have a lot of other things going on,’ and I agree; there is a lot of change. But we’ve known about this for years.” There’s going to be change in how care is delivered for many years to come, she continued, and waiting for things to calm down would take even longer. “Just suck it up,” she said. “That’s what I tell my kids.”

7. More innovation across all of healthcare but mainly health IT. EHRs in hospitals just aren’t innovative enough, said Whittington. “There’s a lot of money being dumped in and all these systems being put in, but doctors are still complaining that it slows them down and is cumbersome,” she said. According to her, there needs to be more innovation around ways to get information into the EHR from the beginning. “We’re starting to see a little glimmer of hope with transcription work and being able to put info into an EHR, but we haven’t begun to realize the benefit [of EHRs] because we still struggle to get information in and out,” she said.

8. A shift to patient-centered care and population health. “The way we have our health delivery system set up, with hospitals being the center of the universe and EHRs being the information repository, we aren’t necessarily making populations more healthy,” said Whittington. She referenced once again the strategic plan, which calls for more attention paid to shifting the center of care out of the hospitals. “As we build out HIT infrastructure, we need to think about where patients need to go to find the right care at the right place at the right time to keep populations healthy.”

9. Value out of big data in healthcare. Professionals are constantly “throwing data” into their EHRs, but, said Whittington, we haven’t even begun to realize the value we can get out of it. “You can even tie in ICD-10 and a lot of other principles into this as we get better at capturing granular data in patients,” she said. “ICD-10 helps with that: apples to apples coding, more specifically. We should get better at comparative effectiveness research and knowing what’s going on.”

10. The expansion of telehealth principles into the wellness space. “The way we deliver healthcare today is inefficient, and it’s not going to take us into the future if we ever intend to be cost effective and affect the health of more people,” said Whittington. She recognized the positive ways telehealth is being used in rural communities, but she said she would like to see it being used more to keep populations healthy. “So if a patient wakes up and checks [his/her] glucose levels, the results are beamed to a case management center,” she said. “And if you take that one step further, [think of] all of the people who walk into the ER for their strep throats. It’s about using the principles of telehealth to keep those folks where they belong.”

Follow Michelle McNickle on Twitter, @Michelle_writes

This article was originally posted at http://ping.fm/Nbsog

11 healthcare data trends in 2012

Mobile devices, data breaches and patient privacy rights were some of the most talked-about topics in health IT in 2011, and according to expert opinions compiled by ID Experts, 2012 won’t be any different.

In fact, experts continue to predict an upswing in mobile and social media usage, response plans, and even reputation fallout. Eleven industry experts outlined healthcare data trends to look for in 2012.

1. Mobile devices could mean trouble. Healthcare organizations won’t be immune to data breach risks caused by the increased use of mobile devices in the workplace, said Larry Ponemon, chairman and founder of the Ponemon Institute. A recent study confirms that 81 percent of healthcare providers use mobile devices to collect, store, and/or transmit some form of personal health information (PHI). But 49 percent of those admit they’re not taking steps to secure their devices.

2. Class-action litigation firestorms are looming. Class-action lawsuits will be on the rise in 2012, predicts Kirk Nahra, partner, Wiley Rein LLP. This will most likely be due to patients suing healthcare organizations for failing to protect their PHI. This past year saw several such suits against organizations, some of which involved business associates and breached patient data. And despite the outcomes, one effect is certain: significant risk and cost for companies affected by the suits.

3. Social media risks will grow. Chris Apgar, CEO and president at Apgar & Associates, predicts that, as more physicians and healthcare organizations move to social media, its misuse will increase the exposure of PHI. A recent example includes a healthcare worker posting sensitive information about a patient on his Facebook page. According to ID Experts, healthcare organizations often don’t develop a social media use plan, leaving a gray area of sorts for employees exposing PHI through personal social networking pages.

4. Cloud computing is not a panacea. Moreover, the technology is outpacing security and creating unprecedented liability risks, said James C Pyles, principal, Powers Pyles Sutter & Verville. According to Pyles, with fewer resources, cloud computing is an attractive option for healthcare providers, especially with the rise of HIEs. But, with privacy and legal issues coming to light, ID Experts said a “covered entity” will need to enter into a “carefully written business associate agreement with a cloud-computing vendor before disclosing protected health information.”

5. Reliance on business associates could result in new risks. Larry Walker, president of the Walker Company, believes economic realities will force healthcare providers to continue to outsource many of their functions. This includes billing to third parties or business associates, even though business associates are considered the “weak link in the chain” when it comes to privacy and security.

6. Organizations could see reputation fallout. Rick Kam, president and co-founder of ID Experts, said identity theft and medical identity theft resulting from data breach exposure are causing patients financial and emotional harm. This often results in patients switching to other providers. According to the Ponemon study, the average lifetime value of one patient is more than $113,000.

7. Mobile will be big in the industry. Christina Theilst, consultant and blogger, reiterated how the use of tablets, smartphones, and tablet applications in healthcare continues to grow. In fact, nearly one-third of providers use mobile devices to access EMRs or EHRs, according to a CompTIA study. And with the onslaught of this mobile technology, providers will need to balance usability, preferences, security, and more, all while adopting written terms of use with employees and contractors, said ID Experts.

8. Emphasis on “willful neglect” will lead to increased enforcement of HIPAA. Adam Greene, partner at Davis Wright Tremaine, said the focus over the next year will be on the 150 HITECH Act audits and publication of the final rules implementing modifications to the HIPAA regulations. But the biggest changes, he said, may be at the OCR investigation level. Expect OCR to pursue enforcement against noncompliance due to “willful neglect,” resulting in a sharp increase in financial settlements and fines.

9. Privacy and security training to become an annual requirement. Peter Cizik, co-founder and CEO at BridgeFront, said healthcare organizations have gotten better at putting procedures in place, but staff still isn’t following them. And since the majority of breaches happen due to human error, targeted training and awareness programs will become common in the upcoming year.

10. An increase in fraudsters means an increase in fraud risk education. Jonnie Massey, supervisor at the Special Investigations Unit, Oregon Dental Service Companies, said pressure, opportunity, and rationalization are all dangerous elements that can lead to committing a healthcare-related crime. And during hard economic times, these crimes are more prevalent. Educating those at risk may deter some from stepping over the line, and help potential victims protect themselves.

11. Healthcare organizations will turn to cyber liability insurance. As organizations continue to implement their EHRs, said Christine Marciano, president of Cyber Data Risk Managers, they will consider options to protect themselves and their patients. A breach can be both costly and damaging to the organization’s reputation. With vulnerabilities increasing, organizations will increasingly turn to cyber security/data breach insurance policies as part of their data breach response plans.

Follow Michelle McNickle on Twitter, @Michelle_writes

This article was originally posted at http://ping.fm/GgjDO

Jan 17, 2012

Save money by outsourcing legal transcription jobs

Outsourcing legal documentation tasks has proved to be a cost-effective option for law firms. Rather than transcribing the various legal documents in-house, they stand to gain a lot when they entrust the work to reliable service providers. With in-house legal transcription, a law firm has to spend considerable time and resources on transcribing legal dictation. Legal transcription outsourcing helps legal practitioners and law firms get the work done quickly, and saves the money that would be required to carry out the documentation work within their practices.

By hiring the legal transcription services offered by professional transcription companies, you can eliminate your transcription headaches. You don't have to worry about transcribing briefs, depositions, notes, letters, motions, statements, client tapes, pleadings, court tapes, summons, interviews, court transcripts and jury instructions. Legal transcription outsourcing services considerably reduce the workload of legal professionals and allow them to save valuable time which could be spent on other important areas of the business.

Equipped with a team of qualified and experienced transcriptionists, proofreaders, legal experts and editors, transcription outsourcing service providers use advanced technology, dictation equipment and software to provide quality legal transcripts in minimum turnaround time. Cost-effective transcription solutions are available for client letters, memorandums, wiretap recordings, general correspondence, reports, interrogations, trials, judgments, briefs, minutes of seminars and conferences and so on.

How it can help you save money

Outsourcing legal documentation jobs to an experienced legal transcription company ensures an assortment of benefits in terms of money, time and effort. The major advantage is that you can reduce the number of people who work for you. Thus, you can save money spent on

  • Salaries

  • Employee benefits such as payroll taxes, health insurance, medical office space, paid vacation time, workers' compensation insurance, and more

  • Ongoing training

  • Human resources for recruiting and handling turnover

  • Management and administration

  • Setting up office space and utilities

  • Technical support: When you outsource, you do not need to maintain additional personnel for technical support. The transcription companies provide you with continuous technical assistance.


Major Benefits of Legal Transcription Outsourcing

As legal transcription firms hire professionals to do the work, you have experts working for you without having to recruit or employ them yourself. They also help you free up your resources and personnel to work on core business activities. These transcription firms offer services at low rates and provide accurate customized transcripts. They usually have a turnaround time of 24 hours, preventing any kind of backlog. Well-planned legal transcription services ensure the following additional benefits:

  • Error-free, up-to-date legal documents

  • Less bulk paperwork in your practice

  • Fewer backlogs of legal files

  • Easy retrieval of files at any time

  • Legal documents maintained in user-friendly file formats


Outsourcing solutions are available for busy lawyers, paralegals, attorneys and other legal professionals. Legal transcription outsourcing services save their valuable time and money, and allow them to focus on their core business and improve their bottom line.

http://ping.fm/BTPx9

SemTechBiz Conference – Call for Presentations Open until January 16





SemTechBiz is returning to San Francisco on June 3-7, and once again we plan to make it the biggest and most comprehensive educational conference on the business of semantic technologies. And that’s exactly where we’re asking you to contribute: please share the practical experience you have gained in your own semantic projects.


We’re looking for case studies big and small – whether you’re building the semantic infrastructure of the future, like the DoD Enterprise Web, or you’ve done semantic annotation on a local business web site, like Plush Beauty Bar. They’re all relevant, because the curiosity of the audience is so rich and diverse.


The Call for Presentations ends on January 16, 2012, so get your abstract together ASAP. All the information you need, and the links to submit your presentation proposal, are HERE.


Conference registration is also open. Register by February 17 and save with substantial early bird discounts.


If you have any questions, feel free to email me at Tony@SemanticWeb.com


Thanks,


Tony Shaw