With the development of IT and scientific technology, very large amounts of knowledge data are continuously being created, and the big data era can be said to have arrived. RDF stores that insert into and query knowledge bases therefore have to be scaled up to handle such large data sources. To this end, we propose a scalable distributed RDF store, built on a distributed database, that uses bulk loading of billions of triples to store data and respond to user queries quickly. To achieve this, we introduce a bulk-loading algorithm using the MapReduce framework and a SPARQL query processing engine that connects to a large distributed database. Experimental results show that the proposed bulk-loading algorithm achieves 67.893K triples per second when loading approximately 33 billion triples. The experiments thus demonstrate that the proposed RDF store can manage data at the scale of billions of triples.
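The abstract's core idea, bulk loading RDF triples through a MapReduce job whose output is pre-partitioned and pre-sorted for a distributed table, can be illustrated with a minimal in-memory sketch. Everything here (the partition count, the subject-hash sharding, the `subject|predicate` row-key layout) is an illustrative assumption, not the paper's actual design:

```python
# Hedged sketch: MapReduce-style bulk loading of N-Triples into a
# distributed key-value store, simulated in memory. Partitioning and
# key layout are assumptions for illustration only.
from collections import defaultdict

NUM_PARTITIONS = 4  # stands in for the distributed store's regions/shards

def map_phase(ntriples_line):
    """Map: parse one N-Triples line, emit (partition, row_key, triple)."""
    s, p, o = ntriples_line.rstrip(" .\n").split(" ", 2)
    row_key = f"{s}|{p}"                  # SPO-ordered key enables range scans
    partition = hash(s) % NUM_PARTITIONS  # shard by subject
    return partition, row_key, (s, p, o)

def reduce_phase(mapped):
    """Reduce: group by partition and sort by row key, so each shard can
    be written as one pre-sorted batch -- the essence of bulk loading,
    which skips the store's per-record write path."""
    shards = defaultdict(list)
    for partition, row_key, triple in mapped:
        shards[partition].append((row_key, triple))
    return {part: sorted(rows) for part, rows in shards.items()}

lines = [
    "<s1> <p1> <o1> .",
    "<s1> <p2> <o2> .",
    "<s2> <p1> <o3> .",
]
shards = reduce_phase(map_phase(line) for line in lines)
total = sum(len(rows) for rows in shards.values())
```

In a real deployment the reduce output would be written as the distributed database's native file format and handed to the cluster in one step, which is what makes bulk loading far faster than issuing billions of individual inserts.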
Journal of Supercomputing
Distributed RDF store for efficient searching billions of triples based on Hadoop