Analysis and Evaluation of Unstructured Data: Text Mining versus Natural Language Processing

Authors

  • Neetu Manocha, Assistant Professor, Department of Computer Science and Applications, IB (PG) College, Panipat, Haryana, India.
  • Ankit, Student, Department of Computer Science and Applications, IB (PG) College, Panipat, Haryana, India.

Keywords:

Unstructured, NLP, Mining, XML, HTML

Abstract

Most of the information that companies store today is unstructured. Retrieving and extracting that information is an essential task, and it is particularly important in Semantic Web applications. Many of these requirements depend on efficient storage and on the analysis of unstructured data. Merrill Lynch recently estimated that more than 80% of all potentially useful business information is unstructured data. The sheer volume and complexity of unstructured data open up many new possibilities for the analyst. We analyze structured and unstructured data both individually and collectively. Text mining and natural language processing are two techniques, each with its own methods, for discovering knowledge from the textual content of documents. This study illustrates text mining and natural language processing techniques. The aim of this work is to compare and evaluate the similarities and differences between text mining and natural language processing in extracting useful information with the methods suited to each.

How to cite this article:
Manocha N, Ankit. Analysis and Evaluation of Unstructured Data: Text Mining versus Natural Language Processing. J Engr Desg Anal 2021; 4(1): 15-19.

Published

2021-09-15