Automatic Semantic Annotation of Sport News Using Knowledge Base and Extraction Patterns

Quang Minh Nguyen¹, Hong Son Ngo¹, Tuan Dung Cao¹
¹ Hanoi University of Science and Technology, No. 1, Dai Co Viet, Hai Ba Trung, Hanoi, Viet Nam

Abstract

The World Wide Web is currently one of the most popular platforms for publishing, disseminating and consuming news. However, the huge number of news items published daily makes it difficult for both readers and publishers of web news systems to find and organize information. By representing data as machine-readable semantic annotations, Semantic Web technology promises to address these obstacles. Finding a way to create annotations with valuable semantics is therefore a key point in the development of our news aggregation system. In this paper, we present a method for automatically generating semantic annotations of sport news items. It combines the results of our ongoing study of capturing different kinds of semantics, ranging from simple to more complex representation structures. Our approach relies on detecting named entities as ontology instances using a knowledge base on sport. The instances are then matched against pre-defined patterns to extract semantics. Experiments on a corpus of sport news validate the advantages of the proposed method and show that semantic annotations are generated with high precision and coverage.
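
The two steps named in the abstract (entity detection against a knowledge base, then pattern matching over the detected instances) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the `KNOWLEDGE_BASE` entries, the `ex:` identifiers, the property names and the two regular-expression patterns are hypothetical and are not taken from the paper, whose actual knowledge base and pattern language are not described here.

```python
import re

# Hypothetical miniature knowledge base (assumption, not the paper's):
# surface form -> (instance identifier, ontology class).
KNOWLEDGE_BASE = {
    "Lionel Messi": ("ex:Lionel_Messi", "Player"),
    "Messi": ("ex:Lionel_Messi", "Player"),
    "Barcelona": ("ex:FC_Barcelona", "Club"),
    "Paris Saint-Germain": ("ex:Paris_Saint_Germain", "Club"),
}

# Hypothetical extraction patterns (assumption). Each pattern is written over
# class placeholders of the form <Class:index> and maps to a property name.
PATTERNS = [
    (re.compile(r"<Player:(\d+)> (?:joins|signs for|moves to) <Club:(\d+)>"),
     "ex:transferTo"),
    (re.compile(r"<Player:(\d+)> (?:plays for|stays at) <Club:(\d+)>"),
     "ex:playsFor"),
]


def detect_entities(text):
    """Replace known surface forms with <Class:index> placeholders.

    Returns the tagged text and the list of instance identifiers, where
    the placeholder index is the position in that list.
    """
    instances = []
    # Longest names first so "Lionel Messi" is preferred over "Messi".
    names = sorted(KNOWLEDGE_BASE, key=len, reverse=True)
    matcher = re.compile("|".join(r"\b" + re.escape(n) + r"\b" for n in names))

    def to_placeholder(match):
        identifier, cls = KNOWLEDGE_BASE[match.group(0)]
        instances.append(identifier)
        return f"<{cls}:{len(instances) - 1}>"

    return matcher.sub(to_placeholder, text), instances


def annotate(text):
    """Return (subject, property, object) triples found by the patterns."""
    tagged, instances = detect_entities(text)
    triples = []
    for pattern, prop in PATTERNS:
        for match in pattern.finditer(tagged):
            subject = instances[int(match.group(1))]
            obj = instances[int(match.group(2))]
            triples.append((subject, prop, obj))
    return triples


if __name__ == "__main__":
    sentence = "Lionel Messi joins Paris Saint-Germain after leaving Barcelona."
    for triple in annotate(sentence):
        print(triple)
```

Replacing entity mentions with class placeholders before matching keeps the patterns independent of individual names, so one pattern covers every player and club in the knowledge base. A real system would need proper tokenization, disambiguation of ambiguous names and a richer pattern language; this sketch leaves all of that out.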
