The explosive growth of resources stored in various forms and transmitted over the internet has necessitated research into information retrieval technologies. The most commonly employed information retrieval mechanisms include the vector space model, the Boolean model, the fuzzy set model, and the probabilistic retrieval model. These models are used to find similarities between a query and the documents in a collection, so that the documents that best reflect the query can be retrieved. All of these approaches are keyword-based: they use lists of keywords to describe information content. This paper surveys these models in order to understand their working mechanisms and shortcomings. This understanding is vital because it informs the choice of an information retrieval technique based on the underlying requirements. The survey reveals that each of the current information retrieval models falls short of expectations in one way or another. As such, they are not ideal for high-precision information retrieval applications.
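As a rough illustration of keyword-based similarity matching, the sketch below ranks documents against a query using cosine similarity over raw term counts, in the spirit of the vector space model. It is not taken from the paper: the function names (`tokenize`, `cosine_similarity`, `rank`), the whitespace tokenizer, and the use of unweighted counts are all assumptions made for brevity.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately naive keyword extractor)."""
    return text.lower().split()


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse term-count vectors (Counters)."""
    # Counter returns 0 for missing terms, so only shared terms contribute.
    dot = sum(count * b[term] for term, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank(query, documents):
    """Return (score, document) pairs sorted from most to least similar."""
    q = Counter(tokenize(query))
    scored = [(cosine_similarity(q, Counter(tokenize(d))), d) for d in documents]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


if __name__ == "__main__":
    docs = ["boolean retrieval model", "fuzzy set model", "weather report"]
    for score, doc in rank("boolean model", docs):
        print(f"{score:.3f}  {doc}")
```

Because the score depends only on shared keywords, a document that expresses the same idea with different words scores zero, which is one example of the kind of shortcoming the survey attributes to keyword-based models.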
Published in: Advances in Networks (Volume 5, Issue 2)
DOI: 10.11648/j.net.20170502.12
Page(s): 40-46
Creative Commons: This is an Open Access article, distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution and reproduction in any medium or format, provided the original work is properly cited.
Copyright: Copyright © The Author(s), 2017. Published by Science Publishing Group
Keywords: Information Retrieval, Model, Fuzzy, Boolean, Probabilistic, Query
APA Style
Mang’are Fridah Nyamisa, Waweru Mwangi, Wilson Cheruiyot. (2017). A Survey of Information Retrieval Techniques. Advances in Networks, 5(2), 40-46. https://doi.org/10.11648/j.net.20170502.12
@article{10.11648/j.net.20170502.12,
  author = {Mang’are Fridah Nyamisa and Waweru Mwangi and Wilson Cheruiyot},
  title = {A Survey of Information Retrieval Techniques},
  journal = {Advances in Networks},
  volume = {5},
  number = {2},
  pages = {40-46},
  doi = {10.11648/j.net.20170502.12},
  url = {https://doi.org/10.11648/j.net.20170502.12},
  eprint = {https://article.sciencepublishinggroup.com/pdf/10.11648.j.net.20170502.12},
  year = {2017}
}
TY  - JOUR
T1  - A Survey of Information Retrieval Techniques
AU  - Mang’are Fridah Nyamisa
AU  - Waweru Mwangi
AU  - Wilson Cheruiyot
Y1  - 2017/11/28
PY  - 2017
N1  - https://doi.org/10.11648/j.net.20170502.12
DO  - 10.11648/j.net.20170502.12
T2  - Advances in Networks
JF  - Advances in Networks
JO  - Advances in Networks
SP  - 40
EP  - 46
PB  - Science Publishing Group
SN  - 2326-9782
UR  - https://doi.org/10.11648/j.net.20170502.12
VL  - 5
IS  - 2
ER  -