FOod in Open Data

The FOOD (FOod in Open Data) project defines and makes available standardisation models and reference ontologies for representing food quality certification schemes, in accordance with product specifications defined by the Italian Ministry of Agricultural, Food and Forestry Policies. FOOD focuses on the semantic representation of the information and production rules set out in the product specifications for agri-food products and their quality designations, including Protected Designation of Origin (PDO) and Protected Geographical Indication (PGI) schemes.

The project

Objectives

The FOOD project was born from a collaboration between the Semantic Technology Laboratory (STLab) of the CNR Institute of Cognitive Sciences and Technologies and the Italian Digital Agency (AgID). It aimed at:

  1. designing reference ontologies for modelling the data included in the policy documents (or product specifications) of quality schemes for agricultural and food products, namely PDO (Protected Designation of Origin), PGI (Protected Geographical Indication) and TSG (Traditional Speciality Guaranteed). These documents are made available by the Italian Ministry of Agriculture (MIPAAF);

  2. producing open datasets released according to the Linked Data paradigm and using well-known Semantic Web standards (e.g., RDF). The open datasets are created from the data contained in the policy documents of the agricultural products, and they are fully aligned with the designed reference ontologies;

  3. publishing the data under the Creative Commons Attribution 4.0 International License, so as to enable the reuse of the data by anyone and for any purpose.

Contributors

The project's working group consisted of the following partners and people. For CNR, the STLab team was coordinated by Aldo Gangemi and comprised Silvio Peroni, Luigi Asprino, Giorgia Lodi, Andrea Nuzzolese and Daria Spampinato. The STLab group also collaborated with the Trees and Timber Institute of CNR, located in Catania. For AgID, the "architecture, standards and infrastructures" area, led by Francesco Tortorelli, contributed to the project.

Results

The main results, obtained between October 2014 and May 2015, can be summarised as follows.

  1. A set of OWL (Web Ontology Language) ontologies on quality schemes for agricultural and food products was created. As part of this activity, other ontologies were considered. Specifically, we reused AGROVOC, the multilingual agricultural thesaurus created by the Food and Agriculture Organization (FAO), for modelling the raw materials of the agricultural and food products.

    In addition, further OWL files were created to align the developed ontologies with existing ones such as DBpedia, DOLCE and the WordNet ontologies, and with the ontology design patterns these use.

    Finally, we carefully evaluated the possibility of reusing ontology design patterns, as defined in ontologydesignpatterns.org, in our modelling. In particular, the following design patterns were reused: a) the pattern Description, for representing raw materials and other product characteristics; b) the pattern Place, for modelling the production place; c) the pattern Classification, for expressing both the different characteristics of the products and the quality schemes defined at EU level; and d) the pattern Information Realization, for relating policy documents to their various versions released over time.

  2. Automatic and manual extraction of a set of data on quality schemes for PDO, PGI and TSG agricultural and food products.

  3. Production of RDF datasets that are fully aligned with the produced ontologies.

  4. Links were established between the produced data and other datasets available in the Web of Data. For instance, the data on production places are linked to the corresponding concepts defined in DBpedia, in AgID's SPCData project, and in the national territorial classification published as LOD by the Italian National Institute of Statistics (ISTAT).

  5. A set of metadata was produced that describes the data and the activities performed to create the ontologies and the Linked Open Data of the quality schemes for agricultural and food products. For this purpose, we reused Web standards such as PROV-O and DCAT.

Ontologies

Data

RDF dataset

The overall FOOD dataset, including all the data, the ontologies and the metadata, can be downloaded in bulk.

SPARQL endpoint

To support the LOD paradigm, a SPARQL endpoint must be made available through which the data can be queried.

In FOOD, the SPARQL endpoint offers a user-friendly interface for querying the data. For instance, the following SPARQL query retrieves up to 10 wines, together with their designations, whose place of production is the city of Lucca:

PREFIX wine: <http://w3id.org/food/ontology/disciplinare-vino/>
PREFIX upper: <http://w3id.org/food/ontology/disciplinare-upper/>
PREFIX dbpedia: <http://dbpedia.org/resource/>

SELECT DISTINCT ?vino ?den
WHERE {
 ?vino a wine:Vino ;
   upper:haDenominazione ?den .
 ?den upper:haLuogoDiProduzione dbpedia:Lucca
} LIMIT 10
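
The place of production is the only part of this query that changes from one search to another, so it can be convenient to generate the query text from a template. The Python sketch below does this. The function name `wines_by_place` and the validation rule are illustrative assumptions, not part of the FOOD project, and the template only covers places whose DBpedia resource name can be written as a simple prefixed name.

```python
import re

# Template mirroring the query above; only the DBpedia resource name
# and the result limit vary.
QUERY_TEMPLATE = """\
PREFIX wine: <http://w3id.org/food/ontology/disciplinare-vino/>
PREFIX upper: <http://w3id.org/food/ontology/disciplinare-upper/>
PREFIX dbpedia: <http://dbpedia.org/resource/>

SELECT DISTINCT ?vino ?den
WHERE {{
  ?vino a wine:Vino ;
    upper:haDenominazione ?den .
  ?den upper:haLuogoDiProduzione dbpedia:{place}
}} LIMIT {limit}
"""

# Hypothetical safety rule: accept only letters, digits and underscores, so
# the value stays a plain local name and cannot alter the query structure.
_SAFE_NAME = re.compile(r"[A-Za-z0-9_]+")


def wines_by_place(place: str, limit: int = 10) -> str:
    """Return the SPARQL query text for wines produced in the given place."""
    if not _SAFE_NAME.fullmatch(place):
        raise ValueError(f"unsupported DBpedia resource name: {place!r}")
    if limit < 1:
        raise ValueError("limit must be positive")
    return QUERY_TEMPLATE.format(place=place, limit=int(limit))
```

Calling `wines_by_place("Lucca")` yields a query equivalent to the one above; names containing other characters are rejected rather than escaped, which keeps the sketch short at the cost of generality.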

Alternatively, the SPARQL endpoint can be queried through a REST service, which is useful when developing applications and services. The following cURL call retrieves the same data as the query above:

curl -L 'http://w3id.org/food/sparql?query=PREFIX+wine%3A+%3Chttp%3A%2F%2Fw3id.org%2Ffood%2Fontology%2Fdisciplinare-vino%2F%3E+PREFIX+upper%3A+%3Chttp%3A%2F%2Fw3id.org%2Ffood%2Fontology%2Fdisciplinare-upper%2F%3E+PREFIX+dbpedia%3A+%3Chttp%3A%2F%2Fdbpedia.org%2Fresource%2F%3E+SELECT+DISTINCT+%3Fvino+%3Fden+WHERE+%7B%3Fvino+a+wine%3AVino+%3B+upper%3AhaDenominazione+%3Fden+.+%3Fden+upper%3AhaLuogoDiProduzione+dbpedia%3ALucca%7D+LIMIT+10'
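
The long URL in the cURL call is simply the SPARQL query percent-encoded into a `query` parameter. As a minimal sketch, the same request can be assembled in Python with the standard library alone. The helper names `build_url` and `run_query` are hypothetical, and the code assumes that the endpoint honours the standard `application/sparql-results+json` media type, which the source does not state.

```python
import json
import urllib.parse
import urllib.request

ENDPOINT = "http://w3id.org/food/sparql"  # endpoint URL taken from the cURL call

QUERY = """\
PREFIX wine: <http://w3id.org/food/ontology/disciplinare-vino/>
PREFIX upper: <http://w3id.org/food/ontology/disciplinare-upper/>
PREFIX dbpedia: <http://dbpedia.org/resource/>

SELECT DISTINCT ?vino ?den
WHERE {
  ?vino a wine:Vino ;
    upper:haDenominazione ?den .
  ?den upper:haLuogoDiProduzione dbpedia:Lucca
} LIMIT 10
"""


def build_url(query: str, endpoint: str = ENDPOINT) -> str:
    """Return the GET URL with the query percent-encoded as `query=...`."""
    # urlencode turns spaces into '+' and reserved characters into %XX,
    # the same encoding used in the cURL example.
    return endpoint + "?" + urllib.parse.urlencode({"query": query})


def run_query(query: str, endpoint: str = ENDPOINT) -> list:
    """Send the query and return the list of result bindings.

    Assumption: the endpoint answers with SPARQL JSON results when asked
    for them via the Accept header.
    """
    request = urllib.request.Request(
        build_url(query, endpoint),
        headers={"Accept": "application/sparql-results+json"},
    )
    # urlopen follows HTTP redirects, like the -L flag of cURL.
    with urllib.request.urlopen(request) as response:
        payload = json.load(response)
    return payload["results"]["bindings"]
```

Under these assumptions, `run_query(QUERY)` would return one dictionary per result row, keyed by the variable names `vino` and `den`.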

Browse the data

The data and metadata stored in the triple store can also be browsed through the LodView application.