Data Analytics in Bioinformatics. Group of authors

Title: Data Analytics in Bioinformatics
Author: Group of authors
Genre: Programs
Publisher: Programs
ISBN: 9781119785606




very low; as a result, the network becomes too simple and may not be able to model complex data. Similarly, if the model is trained with too many neurons, training may take excessively long and there is a high chance of overfitting. In that case the network may begin to model random noise in the data; as a result, the model fits the training data extremely well but fails to generalize to new or future data. Integrating the ANN algorithm with optimization algorithms can reduce the error rate of the classification model, which in turn improves model performance.
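The effect of too few neurons can be illustrated with a small sketch. It is not taken from the chapter: the function names `train_mlp` and `accuracy`, the XOR toy dataset, and all hyperparameters (learning rate, epochs, seed) are assumptions chosen only for illustration. The sketch trains a one-hidden-layer network in plain Python and compares a network with a single hidden neuron against one with four. It shows only the underfitting side of the argument; demonstrating overfitting would need a noisy dataset and a held-out test set.

```python
import math
import random


def train_mlp(data, n_hidden, epochs=2000, lr=0.5, seed=0):
    """Train a one-hidden-layer network (tanh hidden units, sigmoid output)
    with plain stochastic gradient descent on the cross-entropy loss."""
    rng = random.Random(seed)
    n_in = len(data[0][0])
    # w_h[j]: input weights of hidden neuron j, followed by its bias.
    w_h = [[rng.uniform(-1, 1) for _ in range(n_in + 1)] for _ in range(n_hidden)]
    # w_o: one weight per hidden neuron, followed by the output bias.
    w_o = [rng.uniform(-1, 1) for _ in range(n_hidden + 1)]

    def forward(x):
        h = [math.tanh(sum(w * xi for w, xi in zip(wj, x)) + wj[-1]) for wj in w_h]
        z = sum(w * hj for w, hj in zip(w_o, h)) + w_o[-1]
        z = max(-60.0, min(60.0, z))  # clamp to keep math.exp from overflowing
        return h, 1.0 / (1.0 + math.exp(-z))

    for _ in range(epochs):
        for x, y in data:
            h, p = forward(x)
            d_out = p - y  # loss gradient w.r.t. the output pre-activation
            for j in range(n_hidden):
                # Backpropagate through tanh before w_o[j] is updated.
                d_hid = d_out * w_o[j] * (1.0 - h[j] ** 2)
                w_o[j] -= lr * d_out * h[j]
                for i in range(n_in):
                    w_h[j][i] -= lr * d_hid * x[i]
                w_h[j][-1] -= lr * d_hid
            w_o[-1] -= lr * d_out

    return lambda x: forward(x)[1]


def accuracy(predict, data):
    """Fraction of samples whose thresholded prediction matches the label."""
    hits = sum((predict(x) >= 0.5) == (y == 1) for x, y in data)
    return hits / len(data)


# XOR: a tiny dataset that is not linearly separable.
XOR = [((0.0, 0.0), 0), ((0.0, 1.0), 1), ((1.0, 0.0), 1), ((1.0, 1.0), 0)]

for n_hidden in (1, 4):
    model = train_mlp(XOR, n_hidden)
    print(n_hidden, "hidden neuron(s): accuracy", accuracy(model, XOR))
```

With a single hidden neuron the output is a monotone function of one linear projection of the input, so the decision boundary is a straight line and at most three of the four XOR points can be classified correctly, however long the network is trained. The larger network has enough capacity to represent XOR, although whether a particular training run reaches it depends on the initial weights and the learning rate.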

      In this paper we have discussed the applications of ANN in different fields of bioinformatics. We have also compared various machine learning algorithms with the ANN algorithm to gain insight into how ANN works, what affects the performance of an ANN classification model, and how that performance can be improved to obtain more accurate results. The problems associated with the traditional approach to classification can be overcome with deep learning, which allows faster learning by reducing the computational cost of the classification model even for large datasets, with built-in feature engineering that reduces the need for domain expertise. The study shows that ANN and its variations can be used to solve complex problems of disease diagnosis or prognosis in bioinformatics, resulting in improved lifestyle and environment.
