Vietnamese information retrieval test collection development

Ba Ngoc Nguyen
Hanoi University of Science and Technology, Hanoi, Vietnam

Abstract

Processing linguistic features can markedly affect information retrieval effectiveness. Evaluation is an important phase in the development of any information retrieval solution. At present, the lack of test collections makes it difficult to carry out research and share results, so Vietnamese test collections need to be developed. This report gives general information about information retrieval evaluation problems and about the test collection development and evaluation processes. The results show that developing an information retrieval test collection can take considerable time and effort, but the work can be divided into phases and carried out incrementally.
