From: Victor Chernov <vchernov@xxxxxxxxxxxxxx>
Date: Mon, 24 Mar 2014 14:26:40 +0400
Project summary:
Project roster page: http://ontolog.cim3.net/cgi-bin/wiki.pl?OntologySummit2014_Hackathon_OptimizedSPARQLviaNativeAPI
Team lead: VictorChernov (MSK, UTC+4) vchernov at nitrosbase.com
The event starts on 29 March 2014 at 14:00 MSK / 10:00 UTC / 03:00 PDT, worldwide.
The goal of the project is to study which kinds of queries reveal the advantages of one RDF database over another. This implies:
- Selecting a SPARQL query subset from SP2Bench
- Forming a dataset and loading it into all triplestores
- Implementing and testing the measurement aids
- Accurate time measurement: obtaining min, max, average and median times
- Reflecting on the results: the advantages and disadvantages of each triplestore on each selected query
The following triplestores will be compared: Virtuoso, Stardog and NitrosBase.
The triplestores have the following important advantages:
- Very high performance demonstrated on the SP2Bench benchmark
- Linux and Windows versions
- Native API for fast query processing
It is important to use a native API for fast query execution. All three tools provide one:
- Virtuoso: Jena and Sesame providers, and the Virtuoso ODBC RDF extensions for SPASQL
- Stardog: the core SNARL (Stardog Native API for the RDF Language) classes and interfaces
- NitrosBase: native C++ and .NET APIs
We plan to write the additional code needed for accurate testing:
- Accurate time measurement;
- Functions for computing min, max, average and median times;
- Functions for measuring the time to scan through the whole query result;
- Functions for measuring the time to retrieve the first few records (for example, the first page of a web grid).
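As a rough sketch of the measurement aids above (the function names and structure are our own illustration, not project code), the min/max/average/median helpers could look like:

```python
import statistics
import time

def time_runs(run_query, repetitions=10):
    """Execute run_query() repeatedly and collect wall-clock times in seconds."""
    times = []
    for _ in range(repetitions):
        start = time.perf_counter()
        run_query()
        times.append(time.perf_counter() - start)
    return times

def summarize(times):
    """Return the min, max, average and median of a list of timings."""
    return {
        "min": min(times),
        "max": max(times),
        "avg": sum(times) / len(times),
        "median": statistics.median(times),
    }
```

Here `run_query` would be a closure that executes a single query through the store's native API; using `time.perf_counter` avoids wall-clock adjustments distorting the measurements.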
The following steps are needed to load the test dataset:
- Selecting a data subset from the SP2Bench benchmark
- Measuring the data loading time
Note: the data are considered loaded as soon as the system is able to answer the simplest search query. This rule excludes background processes (e.g. indexing) from distorting the loading time.
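The "loaded when a simple query succeeds" rule can be sketched as a polling loop; `start_loading` and `probe` are placeholders for store-specific calls (starting the bulk load and issuing a trivial query via the native API), not real project functions:

```python
import time

def measure_load_time(start_loading, probe, poll_interval=0.5, timeout=600.0):
    """Start the load, then poll with a trivial query until it succeeds.

    probe() must return True once the store answers the simplest search
    query; the elapsed time until then is taken as the loading time.
    """
    start = time.perf_counter()
    start_loading()
    deadline = start + timeout
    while time.perf_counter() < deadline:
        if probe():
            return time.perf_counter() - start
        time.sleep(poll_interval)
    raise TimeoutError("store did not become queryable within timeout")
```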
We are going to explore the query execution performance of the databases under consideration (Virtuoso, Stardog, NitrosBase).
The queries should be fairly simple and cover different techniques, for example:
- Searching a small range of values
- Searching a large range of values
- Several different join queries
- Retrieving part of the result
- Retrieving the whole result
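Timing partial versus full retrieval can be sketched over any iterator of result rows; here a plain Python iterator stands in for a native-API cursor, since the real cursor types differ per store:

```python
import itertools
import time

def time_first_n(rows, n):
    """Time fetching only the first n records (e.g. one web-grid page)."""
    start = time.perf_counter()
    page = list(itertools.islice(rows, n))
    return page, time.perf_counter() - start

def time_full_scan(rows):
    """Time scanning through the whole result set."""
    start = time.perf_counter()
    count = sum(1 for _ in rows)
    return count, time.perf_counter() - start
```

The distinction matters because some stores stream results lazily: the first page can arrive long before the full scan completes, so the two timings characterize different user scenarios.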
Note: during testing, each database may allocate substantial resources, which can affect the performance of the others. That is why each test should be started after a system reboot.