A String-Matching Operation using Finite Automata and Online Interface for Bioinformatics Algorithms
Keywords:
Bioinformatics, A-BOM, Interface, Approximate Pattern Matching
Abstract
In this study, I present a new web interface for major bioinformatics algorithms and introduce a novel approximate string-matching algorithm. The web interface runs major algorithms in the field for computational biologists, students, and other interested researchers. The algorithms are grouped into three sections: sequence alignment, pattern matching, and motif finding. Each section offers several algorithms so that the user can find the best-fitting one for a specific dataset and problem. For each run, the interface reports execution time, memory usage, and context-specific results such as the alignment score. The interface is built on open-source languages and tools. To keep it lightweight and user-friendly, all parts of the interface are written in Python, with Django used for the web layer. The second contribution of the study is the novel A-BOM algorithm, designed for the approximate pattern-matching problem. The algorithm is an approximate-matching variation of Backward Oracle Matching. I compare my algorithm with popular approximate string-matching algorithms. The results indicate that A-BOM reduces runtime by 30% to 80% on long patterns compared with current approximate pattern-matching algorithms.
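As background for the A-BOM description, the following is a minimal Python sketch of exact Backward Oracle Matching, the algorithm that A-BOM is described as a variation of. It is not the A-BOM algorithm itself: the approximate-matching extension is not specified in this abstract and is not reproduced here. The function names `build_factor_oracle` and `bom_search` are hypothetical and chosen only for illustration.

```python
def build_factor_oracle(word):
    """Build the factor oracle of `word`.

    Returns a list of dicts: transitions[state][char] -> next state.
    States are numbered 0..len(word); state 0 is the initial state.
    """
    m = len(word)
    transitions = [dict() for _ in range(m + 1)]
    supply = [-1] * (m + 1)  # supply links; -1 marks "no link"
    for i in range(1, m + 1):
        c = word[i - 1]
        transitions[i - 1][c] = i  # internal transition along the word
        k = supply[i - 1]
        # Add external transitions while following supply links.
        while k > -1 and c not in transitions[k]:
            transitions[k][c] = i
            k = supply[k]
        supply[i] = 0 if k == -1 else transitions[k][c]
    return transitions


def bom_search(text, pattern):
    """Return the start indices of all exact occurrences of `pattern` in `text`.

    Exact Backward Oracle Matching: each window is scanned right to left
    through the factor oracle of the reversed pattern.
    """
    n, m = len(text), len(pattern)
    if m == 0 or m > n:
        return []
    transitions = build_factor_oracle(pattern[::-1])
    matches = []
    pos = 0
    while pos <= n - m:
        state = 0
        j = m
        # Read the window backwards until the oracle rejects or the window ends.
        while j > 0 and state is not None:
            state = transitions[state].get(text[pos + j - 1])
            j -= 1
        if state is not None:
            # The only word of length m the oracle accepts is the reversed pattern.
            matches.append(pos)
        # Shift the window just past the character that caused the failure
        # (or by one position after a full match, where j == 0).
        pos += j + 1
    return matches
```

For example, `bom_search("abracadabra", "abra")` returns `[0, 7]`. The window is allowed to skip ahead whenever the oracle rejects a suffix, which is the usual reason BOM-style algorithms do well on long patterns.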
How to cite this article:
Pattnaik S. A String-Matching Operation using Finite Automata and Online Interface for Bioinformatics Algorithms. J Engr Desg Anal 2020; 3(2): 1-7.
Copyright (c) 2021 Journal of Engineering Design and Analysis
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.