Marl Ontology Query Experiments
Grupo de Sistemas Inteligentes Marl Ontology

07 February 2011

This version: http://purl.org/marl/query_experiments/0.1/
Latest version: http://purl.org/marl/query_experiments/
Editors: Adam Westerski
Authors: Adam Westerski
Contributors: See acknowledgements

Creative Commons License


Abstract

Marl is a standardised data schema (also referred to as an "ontology" or "vocabulary") designed to annotate and describe subjective opinions expressed on the Web or in particular information systems. This document contains the results of semantic query experiments in which we evaluate how well Marl metadata can answer various queries related to linking distributed opinions. For a description of the ontology and instructions on how to connect it with descriptions of other resources, see the Ontology Specification.


Table of Contents

  1. Introduction
    1. Opinions on the Web and the opinion mining process
    2. The Semantic Web
    3. What is Marl for?
  2. Competency questions
    1. Movie opinions
    2. Product opinions
    3. Opinions in Idea Management Systems
  3. Datasets tests
    1. Aggregating movie opinions
    2. Visualising Idea Management data

Appendixes

  1. Changelog
  2. Acknowledgements

1 Introduction

This document gathers the results of various experiments done with the Marl Ontology. Its goal is to test the coverage of Marl properties for different datasets constructed independently of the Marl project.

The analysis is split into two parts. Each of the sections presents a list of sources and Marl mappings for them, along with some coverage statistics.

Section two describes the use of the ontology to produce mappings for various datasets published by researchers in the course of their studies of opinion mining algorithms. The second part relates to the same effort, but conducted in the context of online services for end users that publish opinion-mined data.

The choice of dataset sources is based on the authors' knowledge of the state of the art (for research datasets, the list was partially based on resources listed by Pang et al. [ref]).

An important note is that the Marl ontology presented here is not a complete model for describing and linking opinions online and inside information systems. It merely defines concepts that are not yet described by other ontologies and provides the data attributes needed to connect opinions with contextual information already defined in metadata created with other ontologies. For detailed instructions and recommendations on how to fully model opinions and the results of the opinion mining process, refer to the analysis done by the Gi2MO project.

1.1 Opinions on the Web and the opinion mining process

With the birth of Web 2.0, users started to create content on a mass scale, including subjective opinions on various topics (e.g. opinions about movies). While this kind of content can be very valuable for many different uses (e.g. market analysis or predictions), its accurate analysis and interpretation has not yet been fully mastered. Information left by users is often very disorganised, and many portals that accept user input leave it unmoderated.

Opinion mining (often referred to as sentiment analysis) is one of the attempts to bring order to those vast amounts of user-generated content. The field focuses on analysing textual content with specialised language processing tools and outputs a quantified judgement of the sentiments contained in the text (e.g. whether the text expresses a positive or negative opinion).

Due to the complexity of the problem and the need for efficient and fast tools, the area can be divided into three main research directions:

  • document-wide sentiment analysis
  • sentence sentiment analysis
  • feature-based sentiment analysis

In relation to the World Wide Web, there are a number of common uses of opinion formalisation and analysis. Firstly, it can be applied on top of search engines: the desired content is found first and then run through opinion analysis software to obtain the desired statistics (e.g. Swotti). Secondly, such algorithms can be used within dedicated systems that use the Web to connect to particular communities and gather their opinions on very specific topics (e.g. Internet shops or review websites).

In dedicated systems (e.g. enterprise systems), the community collaboration models that have proven successful on the open Web are often transferred to large enterprises to enhance knowledge exchange and bring employees together. The same opinion mining techniques can be applied in such cases to extract particular information and use it for internal statistics and to improve knowledge search across the enterprise (e.g. see the use of opinion mining in Idea Management [link]).

1.2 The Semantic Web

The Semantic Web is a W3C initiative that aims to add rich metadata to the current Web and provide machine-readable and machine-processable data as a supplement to the human-readable Web.

The Semantic Web is a mature field that has been under research for many years. With growing commercial interest and emerging products, it is starting to gain appreciation and popularity as one of the rising trends for the future Internet.

One of the cornerstones of the Semantic Web is research on interlinkable and interoperable data schemas for information published online. Those schemas are often referred to as ontologies or vocabularies. To support ontologies that lead to a truly interoperable Web of Data, W3C has proposed a series of technologies such as RDF and OWL. Marl uses those technologies and the research that comes with them to propose an ontology for the particular goal of describing opinions and linking them with contextual information (such as the opinion topic, the features described in the opinion, etc.).

1.3 What is Marl for?

As a data schema, the Marl ontology has the following goals:

  • enable the publication of raw data about opinions and the sentiments expressed in them
  • deliver a schema that allows opinions coming from different systems to be compared (polarity, topics, features)
  • interconnect opinions by linking them to contextual information expressed with concepts from other popular ontologies or specialised domain ontologies

For more information, please refer to the Marl usage study done as part of the research in the Gi2MO project.
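To make the first goal more concrete, the following Python sketch serialises a single opinion as N-Triples. It uses only property names that appear in the queries of this document (sioc:content, marl:hasOpinion, marl:hasPolarity, marl:describesObject). The helper names and the example.org URIs are hypothetical, and the layout is an illustration under these assumptions rather than the normative Marl modelling.

```python
SIOC = "http://rdfs.org/sioc/ns#"
MARL = "http://purl.org/marl/ns#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def iri(value):
    """Wrap an absolute IRI in N-Triples angle brackets."""
    return "<" + value + ">"


def literal(text):
    """Quote a plain string literal, escaping backslashes, quotes and line breaks."""
    escaped = (
        text.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )
    return '"' + escaped + '"'


def describe_opinion(post_uri, opinion_uri, text, polarity, object_uri):
    """Return N-Triples lines linking a post, its opinion and the opinion topic."""
    triples = [
        (iri(post_uri), iri(RDF_TYPE), iri(SIOC + "Post")),
        (iri(post_uri), iri(SIOC + "content"), literal(text)),
        (iri(post_uri), iri(MARL + "hasOpinion"), iri(opinion_uri)),
        (iri(opinion_uri), iri(MARL + "hasPolarity"), iri(MARL + polarity)),
        (iri(opinion_uri), iri(MARL + "describesObject"), iri(object_uri)),
    ]
    return [f"{s} {p} {o} ." for s, p, o in triples]


# Hypothetical example data; none of these URIs come from a real dataset.
lines = describe_opinion(
    post_uri="http://example.org/posts/1",
    opinion_uri="http://example.org/opinions/1",
    text='Loved "Avatar"!',
    polarity="Positive",
    object_uri="http://example.org/movies/avatar",
)
print("\n".join(lines))
```

Data shaped like this is what the competency queries in the next section match against: the post carries the text, while the opinion resource carries the polarity and the link to the described object.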

2 Competency questions

The goal of this experiment was to check whether annotations made with the Marl ontology can be used to answer questions about opinions expressed on different topics on the Web. The areas for which we built the competency questions correspond to the use cases and mapping experiments for the Marl ontology.

2.1 Movie opinions

Template: Show all opinions about {certain movie}
Example: Show all opinions about Avatar
PREFIX sioc: <http://rdfs.org/sioc/ns#>
PREFIX marl: <http://purl.org/marl/ns#>
SELECT ?opinion_full_text ?opinion_uri ?opinion_polarity WHERE {
   ?comment_uri a sioc:Post .
   ?comment_uri sioc:content ?opinion_full_text .
   ?comment_uri marl:hasOpinion ?opinion_uri .
   ?opinion_uri marl:hasPolarity ?opinion_polarity .
   ?opinion_uri marl:describesObject ?opinion_about .
   # str() lets the regex match when ?opinion_about is a URI
   FILTER regex(str(?opinion_about), "Avatar")
}
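The step from a template to a concrete query can be sketched in Python as follows. The function names are hypothetical, and wrapping the matched variable in str() is an assumption made here so that the regex also works when the described object is a URI. Only the characters that would break the SPARQL string literal are escaped; regex metacharacters in the title are passed through unchanged.

```python
from string import Template

# Query text for the template "Show all opinions about {certain movie}".
QUERY_TEMPLATE = Template("""\
PREFIX sioc: <http://rdfs.org/sioc/ns#>
PREFIX marl: <http://purl.org/marl/ns#>
SELECT ?opinion_full_text ?opinion_uri ?opinion_polarity WHERE {
   ?comment_uri a sioc:Post .
   ?comment_uri sioc:content ?opinion_full_text .
   ?comment_uri marl:hasOpinion ?opinion_uri .
   ?opinion_uri marl:hasPolarity ?opinion_polarity .
   ?opinion_uri marl:describesObject ?opinion_about .
   FILTER regex(str(?opinion_about), "$movie")
}
""")


def escape_string_literal(text):
    """Escape backslashes and double quotes for a SPARQL string literal."""
    return text.replace("\\", "\\\\").replace('"', '\\"')


def movie_opinions_query(movie):
    """Fill the {certain movie} slot of the template (hypothetical helper)."""
    return QUERY_TEMPLATE.substitute(movie=escape_string_literal(movie))


query = movie_opinions_query("Avatar")
print(query)
```

The remaining templates in this section could be parameterised in the same way, with one placeholder per slot in curly braces.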
Template: Show all {polarity type} opinions about {certain movie}
Example: Show all positive opinions about Avatar
PREFIX sioc: <http://rdfs.org/sioc/ns#>
PREFIX marl: <http://purl.org/marl/ns#>
SELECT ?opinion_full_text ?opinion_uri WHERE {
   ?comment_uri a sioc:Post .
   ?comment_uri sioc:content ?opinion_full_text .
   ?comment_uri marl:hasOpinion ?opinion_uri .
   ?opinion_uri marl:hasPolarity marl:Positive .
   ?opinion_uri marl:describesObject ?opinion_about .
   FILTER regex(str(?opinion_about), "Avatar") .
}
Template: Show all {polarity type} opinions about {certain movie} made with regard to {movie feature}
Example: Show all positive opinions about Avatar made with regard to acting
PREFIX sioc: <http://rdfs.org/sioc/ns#>
PREFIX marl: <http://purl.org/marl/ns#>
SELECT ?opinion_full_text ?opinion_uri WHERE {
   ?comment_uri a sioc:Post .
   ?comment_uri sioc:content ?opinion_full_text .
   ?comment_uri marl:hasOpinion ?opinion_uri .
   ?opinion_uri marl:hasPolarity marl:Positive .
   ?opinion_uri marl:describesObject ?opinion_about .
   ?opinion_uri marl:describesFeature ?opinion_about_feature .
   FILTER regex(str(?opinion_about), "Avatar") .
   FILTER regex(str(?opinion_about_feature), "acting") .
}
Template: Show all {polarity type} opinions about {certain movie} made by a {certain person}
Example: Show all positive opinions about Avatar made by IMDB reviewers
PREFIX sioc: <http://rdfs.org/sioc/ns#>
PREFIX marl: <http://purl.org/marl/ns#>
PREFIX dcterms: <http://purl.org/dc/terms/>
SELECT ?opinion_full_text ?opinion_uri
# imdb_dataset is a placeholder for the IRI of the IMDB dataset graph
FROM <imdb_dataset>
WHERE {
   ?comment_uri a sioc:Post .
   ?comment_uri sioc:content ?opinion_full_text .
   ?comment_uri sioc:has_creator ?author_uri .
   ?author_uri sioc:has_function ?author_role .
   ?author_role dcterms:title ?role_name .
   ?comment_uri marl:hasOpinion ?opinion_uri .
   ?opinion_uri marl:hasPolarity marl:Positive .
   ?opinion_uri marl:describesObject ?opinion_about .
   ?opinion_uri marl:describesFeature ?opinion_about_feature .
   FILTER regex(str(?opinion_about), "Avatar") .
   FILTER regex(?role_name, "reviewer") .
}

2.2 Product opinions

Template: Show all opinions about {certain product}
Example: Show all opinions about iPads
PREFIX sioc: <http://rdfs.org/sioc/ns#>
PREFIX marl: <http://purl.org/marl/ns#>
SELECT ?opinion_full_text ?opinion_uri ?opinion_polarity
WHERE {
   ?comment_uri a sioc:Post .
   ?comment_uri sioc:content ?opinion_full_text .
   ?comment_uri marl:hasOpinion ?opinion_uri .
   ?opinion_uri marl:hasPolarity ?opinion_polarity .
   ?opinion_uri marl:describesObject 
}
Template: Show all {polarity type} opinions about {certain product}
Example: Show all positive opinions about iPads
PREFIX sioc: <http://rdfs.org/sioc/ns#>
PREFIX marl: <http://purl.org/marl/ns#>
SELECT ?opinion_full_text ?opinion_uri
WHERE {
   ?comment_uri a sioc:Post .
   ?comment_uri sioc:content ?opinion_full_text .
   ?comment_uri marl:hasOpinion ?opinion_uri .
   ?opinion_uri marl:hasPolarity marl:Positive .
   ?opinion_uri marl:describesObject 
}
Template: Show all opinions about {certain product} made with regard to {product part}
Example: Show all opinions about iPads with regard to screen
PREFIX sioc: <http://rdfs.org/sioc/ns#>
PREFIX marl: <http://purl.org/marl/ns#>
SELECT ?opinion_full_text ?opinion_uri ?opinion_polarity
WHERE {
   ?comment_uri a sioc:Post .
   ?comment_uri sioc:content ?opinion_full_text .
   ?comment_uri marl:hasOpinion ?opinion_uri .
   ?opinion_uri marl:hasPolarity ?opinion_polarity .
   ?opinion_uri marl:describesObject  .
   ?opinion_uri marl:describesObjectPart 
}
Template: Show all {polarity type} opinions about {certain product} made with regard to {product feature}
Example: Show all positive opinions about iPad usability
PREFIX sioc: <http://rdfs.org/sioc/ns#>
PREFIX marl: <http://purl.org/marl/ns#>
SELECT ?opinion_full_text ?opinion_uri
WHERE {
   ?comment_uri a sioc:Post .
   ?comment_uri sioc:content ?opinion_full_text .
   ?comment_uri marl:hasOpinion ?opinion_uri .
   ?opinion_uri marl:hasPolarity marl:Positive .
   ?opinion_uri marl:describesObject  .
   ?opinion_uri marl:describesFeature 
}

2.3 Opinions in Idea Management Systems

Template: Show all opinions about {certain idea}
Example: Show all opinions about "Bigger screen in iPads" (http://..../bigger_screen_in_ipads/)
PREFIX gi2mo: <http://purl.org/gi2mo/ns#>
PREFIX sioc: <http://rdfs.org/sioc/ns#>
PREFIX marl: <http://purl.org/marl/ns#>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT ?opinion_full_text ?opinion_container_type ?opinion_uri ?opinion_polarity
WHERE {
   ?idea_uri a gi2mo:Idea .
   ?idea_uri foaf:page  .
   ?opinion_uri marl:describesObject ?idea_uri .
   ?opinion_uri marl:extractedFrom ?opinion_container_uri .
   ?opinion_container_uri a ?opinion_container_type .
   ?opinion_uri marl:hasPolarity ?opinion_polarity .
   OPTIONAL { ?opinion_container_uri  gi2mo:content ?opinion_full_text } .
   OPTIONAL { ?opinion_container_uri  sioc:content ?opinion_full_text }
}
Template: Show all {polarity type} opinions on {certain product} expressed in comments to ideas
Example: Show all positive opinions on keyboards expressed in comments to ideas
PREFIX gi2mo: <http://purl.org/gi2mo/ns#>
PREFIX marl: <http://purl.org/marl/ns#>
SELECT ?opinion_full_text ?comment_uri ?opinion_uri ?idea_uri
WHERE {
   ?idea_uri a gi2mo:Idea .
   ?idea_uri gi2mo:hasComment ?comment_uri .
   ?opinion_uri marl:describesObject  .
   ?opinion_uri marl:extractedFrom ?comment_uri .
   ?opinion_uri marl:hasPolarity marl:Positive .
   ?comment_uri gi2mo:content ?opinion_full_text
}

Example: Show the number of positive and negative opinions for all ideas submitted
PREFIX gi2mo: <http://purl.org/gi2mo/ns#>
PREFIX sioc: <http://rdfs.org/sioc/ns#>
PREFIX marl: <http://purl.org/marl/ns#>
SELECT ?idea_uri COUNT(?positive_opinion_uri) AS ?positive_opinions COUNT(?negative_opinion_uri) AS ?negative_opinions
WHERE {
   {
      ?idea_uri a gi2mo:Idea .
      ?idea_uri gi2mo:hasComment ?comment_uri .
      ?comment_uri gi2mo:content ?opinion_full_text .
      ?positive_opinion_uri marl:extractedFrom ?comment_uri .
      ?positive_opinion_uri marl:hasPolarity marl:Positive .
   } UNION
   {
      ?idea_uri a gi2mo:Idea .
      ?idea_uri gi2mo:hasComment ?comment_uri .
      ?comment_uri gi2mo:content ?opinion_full_text .
      ?negative_opinion_uri marl:extractedFrom ?comment_uri .
      ?negative_opinion_uri marl:hasPolarity marl:Negative .
   }
}
GROUP BY ?idea_uri
ORDER BY DESC(?positive_opinions)
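What the counting query is meant to compute can be illustrated with a small in-memory Python sketch. It works on plain (subject, predicate, object) tuples instead of an RDF store, the example.org data is hypothetical, and two simplifications are assumed: each comment belongs to exactly one idea, and the gi2mo:content join is left out.

```python
from collections import defaultdict

GI2MO = "http://purl.org/gi2mo/ns#"
MARL = "http://purl.org/marl/ns#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
EX = "http://example.org/"  # hypothetical namespace for the sample data


def count_polarities(triples):
    """Count positive and negative opinions per idea, sorted by positive count."""
    ideas = {s for s, p, o in triples if p == RDF_TYPE and o == GI2MO + "Idea"}
    # Simplification: each comment is assumed to belong to a single idea.
    comment_to_idea = {
        o: s for s, p, o in triples if p == GI2MO + "hasComment" and s in ideas
    }
    polarity = {s: o for s, p, o in triples if p == MARL + "hasPolarity"}
    counts = defaultdict(lambda: {"positive": 0, "negative": 0})
    for s, p, o in triples:
        if p != MARL + "extractedFrom" or o not in comment_to_idea:
            continue
        idea = comment_to_idea[o]
        if polarity.get(s) == MARL + "Positive":
            counts[idea]["positive"] += 1
        elif polarity.get(s) == MARL + "Negative":
            counts[idea]["negative"] += 1
    # Mirrors GROUP BY ?idea_uri with a descending sort on the positive count.
    return sorted(counts.items(), key=lambda item: item[1]["positive"], reverse=True)


triples = [
    (EX + "idea/1", RDF_TYPE, GI2MO + "Idea"),
    (EX + "idea/2", RDF_TYPE, GI2MO + "Idea"),
    (EX + "idea/1", GI2MO + "hasComment", EX + "comment/1"),
    (EX + "idea/1", GI2MO + "hasComment", EX + "comment/2"),
    (EX + "idea/2", GI2MO + "hasComment", EX + "comment/3"),
    (EX + "opinion/1", MARL + "extractedFrom", EX + "comment/1"),
    (EX + "opinion/1", MARL + "hasPolarity", MARL + "Positive"),
    (EX + "opinion/2", MARL + "extractedFrom", EX + "comment/2"),
    (EX + "opinion/2", MARL + "hasPolarity", MARL + "Negative"),
    (EX + "opinion/3", MARL + "extractedFrom", EX + "comment/2"),
    (EX + "opinion/3", MARL + "hasPolarity", MARL + "Positive"),
    (EX + "opinion/4", MARL + "extractedFrom", EX + "comment/3"),
    (EX + "opinion/4", MARL + "hasPolarity", MARL + "Negative"),
]
result = count_polarities(triples)
for idea, tally in result:
    print(idea, tally["positive"], tally["negative"])
```

As in the query, ideas whose comments carry neither a positive nor a negative opinion do not appear in the result.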
Template: Show all opinions related to ideas about {certain product} made with regard to {product part}
Example: Show all opinions related to ideas about iPad with regard to camera
PREFIX gi2mo: <http://purl.org/gi2mo/ns#>
PREFIX marl: <http://purl.org/marl/ns#>
SELECT ?opinion_full_text ?comment_uri ?opinion_uri ?opinion_polarity ?idea_uri
WHERE {
   ?idea_uri a gi2mo:Idea .
   ?idea_uri gi2mo:hasComment ?comment_uri .
   ?opinion_uri marl:describesObject  .
   ?opinion_uri marl:describesObjectPart  .
   ?opinion_uri marl:extractedFrom ?comment_uri .
   ?opinion_uri marl:hasPolarity ?opinion_polarity .
   ?comment_uri  gi2mo:content ?opinion_full_text
}

3 Datasets tests

The following use cases aim to show how the Marl Ontology can be used in different environments (i.e. systems) and applied to opinions of varying complexity and structure.

3.1 Aggregating movie opinions

To aggregate movie opinions, we tested whether Marl can act as a data integration layer for different online opinion mining services and datasets. In the experiment we used Tweetsentiments, IMDB (via the Cornell Movie dataset) and Swotti. For each data source we used the Marl mappings already defined in the previous coverage experiment.

From each of the data sources we extracted small data sets and performed two kinds of tests:

  • Online Semantic Search Engines/Index (we used Sindice search engine)
  • SPARQL Queries using a SPARQL endpoint
In the first case the goal was to see the capabilities

3.2 Visualising Idea Management data

Context: Idea Management Systems are used to collect input from a large audience regarding innovation proposals for products or services. For our experiment we used two Idea Management System instances.

Technologies: The Semantic Web infrastructure was provided by tools from the Gi2MO project. The data was also taken from test instances of the Gi2MO project. Finally, the tools used for visualisation were Idea Browser and Idea Analyst. For processing SPARQL queries we used the ARC2 RDF store.

Data: The data came from ETSIT Ideas and ETSIT Ideas International, systems created to collect ideas about the university from Spanish students and international visitors respectively. The first instance was run entirely in Spanish, the second in English. The two systems ran different software; however, the data exported from each was described with the same ontologies: the Gi2MO ontology for ideas and Marl for describing opinions about ideas.

Outcome: Data encoded in Marl provided new metrics and made it possible to compare two instances in different languages on a new level.

Datasets: etsit_ideas_es.rdf, etsit_ideas_en.rdf

Queries and Results:

a) Number of negative opinions/comments in each instance (with colors)

b) Number of positive opinions/comments from each instance (with colors)

c) Number of positive opinions/comments per category (categories using the same URIs in both instances)
* select categories and the number of positive ideas they have, sorted by number of ideas (tags and categories are included)

PREFIX gi2mo: <http://purl.org/gi2mo/ns#>
PREFIX dcterms: <http://purl.org/dc/terms/>
PREFIX owl: <http://www.w3.org/2002/07/owl#>
PREFIX marl: <http://purl.org/marl/ns#>
SELECT ?generic_category_uri  COUNT(?idea_name) AS ?positive_ideas WHERE {
   ?idea_uri a gi2mo:Idea .
   ?idea_uri gi2mo:hasCategory ?category_uri .
   ?category_uri owl:sameAs ?generic_category_uri .
   ?idea_uri dcterms:title ?idea_name .
   ?idea_uri marl:polarityValue ?polarityValue .
   FILTER (?polarityValue > 0)
}
GROUP BY ?generic_category_uri
ORDER BY DESC(?positive_ideas)
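The role of owl:sameAs in this query, mapping instance-specific categories to shared ones so that both instances can be counted together, can be sketched in Python. The dictionaries below stand in for the RDF data, and all example.org URIs, titles and polarity values are hypothetical.

```python
from collections import Counter


def positive_ideas_per_category(ideas, same_as):
    """Count ideas with a positive polarity value per generic category."""
    counts = Counter()
    for idea in ideas:
        if idea["polarity_value"] <= 0:
            continue  # corresponds to FILTER (?polarityValue > 0)
        for category in idea["categories"]:
            generic = same_as.get(category)
            if generic is not None:  # categories without owl:sameAs do not match
                counts[generic] += 1
    # Sorted by count, highest first, like ORDER BY DESC(?positive_ideas).
    return counts.most_common()


# Hypothetical owl:sameAs links from local categories to generic ones.
same_as = {
    "http://es.example.org/cat/campus": "http://example.org/category/campus",
    "http://en.example.org/cat/campus": "http://example.org/category/campus",
    "http://en.example.org/cat/library": "http://example.org/category/library",
}

# Hypothetical ideas exported from a Spanish and an English instance.
ideas_es = [
    {"title": "Idea A", "polarity_value": 0.5,
     "categories": ["http://es.example.org/cat/campus"]},
    {"title": "Idea B", "polarity_value": -0.2,
     "categories": ["http://es.example.org/cat/campus"]},
]
ideas_en = [
    {"title": "Idea C", "polarity_value": 0.8,
     "categories": ["http://en.example.org/cat/campus",
                    "http://en.example.org/cat/library"]},
    {"title": "Idea D", "polarity_value": 0.0,
     "categories": ["http://en.example.org/cat/library"]},
]

ranking = positive_ideas_per_category(ideas_es + ideas_en, same_as)
for category, count in ranking:
    print(category, count)
```

Because both instances point at the same generic category URIs, their ideas end up in one combined ranking even though the local category URIs differ.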

* select categories and the number of ideas they have, sorted by number of ideas (tags and categories are included)
PREFIX gi2mo: <http://purl.org/gi2mo/ns#>
PREFIX dcterms: <http://purl.org/dc/terms/>
PREFIX owl: <http://www.w3.org/2002/07/owl#>
SELECT ?generic_category_uri  COUNT(?idea_name) AS ?ideas WHERE {
   ?idea_uri a gi2mo:Idea .
   ?idea_uri gi2mo:hasCategory ?category_uri .
   ?category_uri owl:sameAs ?generic_category_uri .
   ?idea_uri dcterms:title ?idea_name
}
GROUP BY ?generic_category_uri
ORDER BY DESC(?ideas)

A Changelog

  • First version of the document

B Acknowledgements

The style and formatting of this document were inspired by the FOAF specification.

Special thanks for support with the creation of and research on the Marl ontology go to Prof. Carlos A. Iglesias and the members of the GSI Group of the DIT department of Universidad Politécnica de Madrid.