Our project started in June 2012 and is due to finish on December 31st, 2012. We have just completed the user requirements gathering stage and are writing up the corresponding deliverable. As soon as it is ready, we will share it here for feedback. We also had our third meeting today, discussing the work carried out in the past two weeks on user engagement and LOD-based semantic enrichment.
In the meantime, here are some more details on the project workplan:
WORKPACKAGES (scheduled across project Months 1 to 7)
1: Project Management
2: User Engagement & Case Studies
3: Linked Environment Data Enrichment
4: User-Friendly Semantic Search over Linked Data
5: Evaluation
6: Dissemination & Engagement
WP 1: Project Management (Responsible partner: Sheffield)
The cross-institutional nature of the project necessitates close liaison between Sheffield, the British Library (BL) and
HR Wallingford. In addition to the communication that arises from collaborative
working, monthly teleconferences and regular face-to-face meetings will be used to
advance the project and monitor progress.
Deliverables: Project plan; legacy plan, including sustainability
and support; final report.
WP 2: User Engagement and Case Studies (BL, HR Wallingford)
This WP covers engagement with environmental
science researchers and other key stakeholders. This takes place throughout the project, but in particular: (i) early in the project,
to produce detailed requirements and use cases, based on interviews; (ii) later
in the project, when we will test the utility of Linked Data and assess how
the vocabularies support the needs of researchers and practitioners, and
whether the Linked Open Data (LOD) approach will produce an added benefit in comparison
with keyword searching.
Deliverables: Stakeholder analysis, requirements
and use cases; User feedback.
WP 3: Linked Environment Data Enrichment (Sheffield)
This WP will deliver semantic enrichment tools,
based on relevant LOD vocabularies. Where required, relevant ontologies not
already connected to existing Linked Environment Data will be integrated.
Sheffield’s open-source tools for lookup and term disambiguation with respect
to Linked Data vocabularies will be tested and adapted to the environmental
science domains. As part of this work, we are evaluating the coverage and
accuracy of relevant general-purpose LOD datasets (namely GeoNames and DBpedia),
when applied to data and content from our domain. Tools for LOD-based geo-location
disambiguation, date and measurement recognition and normalisation will also be
delivered.
Our solution is based on Ontotext's high-performance OWLIM semantic repository, the open-source GATE semantic annotation tools, and their
integration with Linked Data endpoints. We import Linked Data into the semantic
repository, which provides a SPARQL endpoint and also full text, metadata, and
semantic annotation indices, which underpin the semantic search UI.
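To illustrate what querying such a SPARQL endpoint might look like, here is a minimal Python sketch that builds a query for places with a given name and country code, and encodes it as a GET request in the style of the SPARQL 1.1 Protocol. It assumes the repository holds GeoNames-style data using the `gn:name` and `gn:countryCode` properties; the endpoint URL and the helper names `build_place_query` and `build_request_url` are hypothetical and not part of the project's code.

```python
from urllib.parse import urlencode

# Namespace of the GeoNames ontology (assumed to be used by the imported data).
GEONAMES_PREFIX = "http://www.geonames.org/ontology#"


def _escape(value: str) -> str:
    """Escape backslashes and double quotes for use in a SPARQL string literal."""
    return value.replace("\\", "\\\\").replace('"', '\\"')


def build_place_query(name: str, country_code: str, limit: int = 10) -> str:
    """Return a SPARQL query for places with the given name and country code."""
    return (
        f"PREFIX gn: <{GEONAMES_PREFIX}>\n"
        "SELECT ?place WHERE {\n"
        f'  ?place gn:name "{_escape(name)}" ;\n'
        f'         gn:countryCode "{_escape(country_code)}" .\n'
        "}\n"
        f"LIMIT {int(limit)}"
    )


def build_request_url(endpoint: str, query: str) -> str:
    """Encode the query as the `query` parameter of a GET request."""
    return endpoint + "?" + urlencode({"query": query})


# Hypothetical local endpoint; nothing is sent over the network here.
query = build_place_query("London", "GB")
url = build_request_url("http://localhost:8080/sparql", query)
print(query)
```

Restricting on the country code is one simple way to separate London (UK) from London (Ontario) at the data level. The semantic search UI is meant to spare end users from writing queries like this by hand.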
Deliverables: Open source tools for semantic
enrichment with Linked Environment Data.
WP 4: User-Friendly Semantic Search over Linked Data (Sheffield)
GATE Mimir (Multi-paradigm Information Management Indexing and Retrieval) is an open-source software framework for multi-paradigm
indexing and searching of semantically annotated documents. Enriching documents
with explicit semantics allows users to search more effectively for ambiguous
names such as London (Ontario) and London (UK). The multi-paradigm aspect of
Mimir refers to the accessing and linking together of multiple information
sources, such as the textual content of the documents, the semantic metadata and
knowledge encoded in the Linked Data vocabularies. Accessing knowledge from
Linked Data allows Mimir to understand generalisations, making it capable of answering
more complex information needs, such as identification of documents that refer
to water levels at the Thames barrier as relevant to a keyword search for
flooding in south-east Britain. At the same time, the explicit LOD semantics
associated with the indexed semantic metadata and content makes sure that
references to places called London (other than the one in the UK) are not seen
as relevant results to such a query.
This WP will develop a customised semantic
search interface, which enables users to carry out such powerful searches and
fully benefit from the knowledge contained in Linked Data, without needing to
write SPARQL queries.
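The idea of combining text search with Linked Data knowledge can be sketched with a toy example. The code below is not the Mimir API; it is a minimal in-memory illustration, assuming each document already carries place URIs assigned by semantic enrichment. The `ex:` URIs, the hand-written `PART_OF` hierarchy and the function names are all hypothetical stand-ins for knowledge that a real system would read from Linked Data vocabularies.

```python
from dataclasses import dataclass

# Hypothetical containment hierarchy (child place -> enclosing place).
PART_OF = {
    "ex:ThamesBarrier": "ex:London_UK",
    "ex:London_UK": "ex:SouthEastBritain",
    "ex:SouthEastBritain": "ex:UnitedKingdom",
    "ex:London_Ontario": "ex:Canada",
}


@dataclass
class Document:
    doc_id: str
    text: str
    place_uris: frozenset  # URIs assigned by semantic enrichment


def is_within(place, region):
    """Follow part-of links upwards to test whether place lies inside region."""
    while place is not None:
        if place == region:
            return True
        place = PART_OF.get(place)
    return False


def semantic_search(docs, keyword, region):
    """Return ids of documents mentioning keyword and a place inside region."""
    return [
        d.doc_id
        for d in docs
        if keyword.lower() in d.text.lower()
        and any(is_within(uri, region) for uri in d.place_uris)
    ]


docs = [
    Document("d1", "Water levels at the Thames Barrier rose overnight.",
             frozenset({"ex:ThamesBarrier"})),
    Document("d2", "Water levels in London rose after heavy rain.",
             frozenset({"ex:London_Ontario"})),
    Document("d3", "Water levels in London rose after heavy rain.",
             frozenset({"ex:London_UK"})),
]

keyword_only = [d.doc_id for d in docs if "water levels" in d.text.lower()]
semantic = semantic_search(docs, "water levels", "ex:SouthEastBritain")
print(keyword_only, semantic)
```

In this sketch a plain keyword match returns all three documents, while the semantic variant keeps the Thames Barrier and London (UK) documents and drops the one about London (Ontario), even though its text is identical to the UK one.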
Deliverable: A web-based interface for semantic
search with Linked Environment Data.
WP 5: Evaluation (Sheffield and BL)
Firstly, a quantitative evaluation of the accuracy of semantic enrichment and of Linked Data vocabulary coverage will be carried out, based on a human-annotated gold standard and established metrics such as F-measure. In addition, a comparative evaluation of the new semantic search web interface will be completed against the current keyword-search Envia tool, using a set of search queries supplied by the BL. Evaluation will be carried out in the context of the user requirements developed in WP2.
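As a reminder of how F-measure is derived from a gold standard, here is a minimal Python sketch that computes precision, recall and the balanced F-measure (F1) over sets of annotations. It assumes strict matching, where an annotation counts as correct only if document, character offsets and URI all agree with the gold standard; the tuple layout and the `ex:` URIs are illustrative assumptions, not the project's actual evaluation setup.

```python
def precision_recall_f1(gold: set, predicted: set):
    """Return (precision, recall, F1) for predicted annotations against gold."""
    true_positives = len(gold & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Each annotation: (document id, start offset, end offset, URI); all hypothetical.
gold = {
    ("doc1", 0, 6, "ex:London_UK"),
    ("doc1", 20, 34, "ex:ThamesBarrier"),
    ("doc2", 5, 11, "ex:London_Ontario"),
    ("doc2", 40, 46, "ex:RiverThames"),
}
predicted = {
    ("doc1", 0, 6, "ex:London_UK"),
    ("doc1", 20, 34, "ex:ThamesBarrier"),
    ("doc2", 5, 11, "ex:London_UK"),  # wrong URI: counts as an error
}

precision, recall, f1 = precision_recall_f1(gold, predicted)
print(f"P={precision:.3f} R={recall:.3f} F1={f1:.3f}")
```

Strict matching is only one option. Evaluations of this kind often also report a lenient variant that accepts overlapping spans, which would need a different matching step than the set intersection used here.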
Deliverables: Quantitative evaluation results;
A report detailing the lessons learned.
WP 6: Dissemination and Engagement (Sheffield, BL, HR Wallingford)
The project will
devote significant effort to dissemination, including practical activities such
as demonstrations and tutorials, to show how project outputs might be exploited
in other institutions. Details of planned dissemination activities are provided below.
Deliverables: Presentations; research paper;
online demonstration; training materials; blog; website; user workshop; engagement
with the JISC programme manager and related projects.
| Timing | Dissemination Activity | Audience | Purpose | Key Message |
|---|---|---|---|---|
| M1-M7 | Participation in JISC programme activities, such as JISC Involve (http://jiscinvolve.org/) | JISC | Raise awareness, promote results | Benefits and challenges of using LOD |
| M1-M7 | Collaboration with other “Research Tools” projects | JISC development programmes | Inform, engage, and promote | EnviLOD objectives and results |
| M2-M7 | Project website | External stakeholders and research community | Raise awareness, inform, promote results | EnviLOD objectives and results |
| M4-M7 | Peer-reviewed publications in journals, conferences and workshops, including relevant environmental science venues (e.g. EnviroInfo, Ecological Informatics) as well as semantic technology ones (ISWC, ESWC, Journal of Web Semantics) | Research community, including environmental science and web science | Inform and promote research results | EnviLOD research methods, open-source tools, and evaluation results |
| M7 | Dissemination workshop hosted at The British Library | Stakeholders | Engage stakeholders with the EnviLOD outputs | Benefits of LOD for environmental scientists |
| M3-M7 | Practical, “hands-on” outreach, through open-source software, user documentation, online demonstrations and tutorials | Research community, end users, JISC, and other stakeholders | Promote project results | Availability of open-source tools for LOD-based semantic enrichment and search |
| M1-M7 | Engagement with interested researchers from other institutions and other disciplines | Stakeholders | Inform and promote results | Lessons learned and results delivered |