Vietnamese information retrieval test collection development
Abstract
Linguistic feature processing can significantly affect information retrieval effectiveness, and evaluation is an important phase in the development of any information retrieval solution. At present, the lack of test collections makes it difficult to carry out research and share results, so Vietnamese test collections need to be developed. This report gives an overview of information retrieval evaluation problems and of the test collection development and evaluation processes. The results show that developing an information retrieval test collection can take considerable time and effort, but the work can be divided into phases and carried out incrementally.
Keywords
Information retrieval evaluation, web data collection, test collection development
References
[1] Sanderson, M., et al., Best Practices for Test Collection Creation, Evaluation Methodologies and Language Processing Technologies, TrebleCLEF, University of Sheffield (2009).
[2] Buckley, C. and Voorhees, E.M., Retrieval evaluation with incomplete information, Proc. ACM SIGIR Conf., July 2004, pp. 25-32.
https://doi.org/10.1145/1008992.1009000
[3] Järvelin, K., et al., IR evaluation methods for retrieving highly relevant documents, Proc. ACM SIGIR, ACM, New York, USA (2000) pp. 41-48.
https://doi.org/10.1145/345508.345545
[4] Burges, C., et al., Learning to rank using gradient descent, Proc. Conference on Machine Learning, Bonn, Germany (2005) pp. 89-96.
https://doi.org/10.1145/1102351.1102363
[5] Järvelin, K., et al., Cumulated gain-based evaluation of IR techniques, ACM TOIS (2002) pp. 422-446.
https://doi.org/10.1145/582415.582418
[6] Al-Maskari, A., et al., The relationship between IR effectiveness measures and user satisfaction, Proc. ACM SIGIR, ACM Press, New York, USA (2007) pp. 773-774.
https://doi.org/10.1145/1277741.1277902