Owing to technological advances in the Semantic Web and sensor networks, together with open data policies, a large amount of data has been produced. However, data stream management systems have focused on processing large volumes of data and have given little priority to data identification, integration, and external linkage. Furthermore, entity resolution research has focused mainly on technologies for static databases. This study designs a real-time stream data processing architecture that integrates heterogeneous streaming input data, performs entity resolution on it, and interlinks it with external data. To achieve this goal, a lightweight adapter that integrates heterogeneous data into a standard schema and a blocking technique that reduces the number of comparison candidates are applied. The implemented data adapters show four times higher throughput than open source data parsers, and entity resolution on streaming data shows performance similar to that on static data sets. The proposed streaming data entity resolution architecture is expected to form a basis for data integration research that efficiently integrates various information sources and enriches internal data.
dc.language
eng
dc.relation.ispartofseries
Wireless Personal Communications
dc.title
Entity Resolution Approach of Data Stream Management Systems