Analysis and Evaluation of Unstructured Data: Text Mining versus Natural Language Processing
Keywords:
Unstructured, NLP, Mining, XML, HTML
Abstract
Nowadays, most of the information stored in companies is in unstructured form. Retrieving and extracting this information is an essential and important task in the semantic web area. Many of these requirements depend on efficient storage and analysis of unstructured data. Merrill Lynch recently estimated that more than 80% of all potentially useful business information is unstructured data. The large volume and complexity of unstructured data open up many new possibilities for the analyst. We analyze both structured and unstructured data individually and collectively. Text mining and natural language processing are two techniques, each with its own methods, for knowledge discovery from textual content in documents. In this study, text mining and natural language processing techniques are illustrated. The aim of this work is to compare and evaluate the similarities and differences between text mining and natural language processing for extracting useful information with the methods suited to each.
How to cite this article:
Manocha N, Ankit. Analysis and Evaluation of Unstructured Data: Text Mining versus Natural Language Processing. J Engr Desg Anal 2021; 4(1): 15-19.
Copyright (c) 2021 Journal of Engineering Design and Analysis
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.