Bioinformatics is a recently evolved field in biology. The definition is still evolving but in essence, bioinformatics is the “computational branch of molecular biology” (Claverie 2007, p. 9). This is because the field integrates computer technology and information processing systems to combine biological data and analyse biological problems. The primary aim of this new field is the ascertainment of biological insights and establishment of a global view to support a unified understanding of concepts in biology.
Bioinformatics found application in genetics by meeting the need for databases that hold large volumes of biological information, especially gene sequences. Eventually, bioinformatics allowed geneticists to access available genetic data and combine it with new data for analysis, supporting continuity in building genetic knowledge (National Center for Biotechnology Information 2004).

Bioinformatics Challenges for the Geneticist

Bioinformatics creates a number of challenges for geneticists. The aim of bioinformatics is three-fold.
First is the storage of large volumes of information in accessible databases. Second is the development and application of computer systems to support analysis. Third is the development of computer systems that allow analysed data to be interpreted in a biologically meaningful manner (Luscombe, Greenbaum & Gerstein 2001). The second and third aims pose challenges for geneticists, who need to be adept not only at accessing data but also at developing and using information system tools to analyse and interpret data in the context of genetic research.
Existing information in a database is useless without the competence to access specific data, integrate it with new data, and analyse and interpret the result in the context of a particular genetic study.

Bioinformatics Needs of Geneticists

Bioinformatics supports the needs of geneticists. One justification for computational systems in genetic research is the development of a global view of experimental design that integrates research efforts and results from different countries.
This is necessary to link and bring together existing knowledge on various areas of genetic research, especially nucleic acid and amino acid sequences. Another justification for bioinformatics in genetics research is database mining, which facilitates the generation and testing of hypotheses on the functions and structures of genes and proteins by using available data as a framework (Barnes & Gray 2003).

Managing and Manipulating Genetic Data

Generally, there are two ways of managing and manipulating data. One is the use of software. A number of software packages have emerged to support the development of genetic linkage maps and other purposes.
Although software varies in function, the common aim is to aid in understanding genetic linkage information and to automate the research process in support of effective map building (Weaver et al. 1992). The other is the network system, composed of a database that is accessible through network connections. Networks allow geneticists to access available information, such as data on genetic markers, so they can combine it with new data for analysis and interpretation. This supports the continuity of global genetics research (Cheung et al. 1996).

Value of Bioinformatics

The value of bioinformatics to genetic research lies in its innovativeness and its considerable potential for developing novel approaches to genetics research. Bioinformatics can analyse and interpret data not only to fulfil the research purpose but also for practical use in diagnosis or therapy (Jones & Phillip 2000). The potential of bioinformatics to translate genetic data into practical solutions for actual biological problems accounts for the expected increase in the value of the bioinformatics market to billions of dollars in the next five years (World Bioinformatics Market 2008).

Bioinformatics for Genetic Study Designs and Analysis

Bioinformatics has allowed the development of better study designs and analyses for genetic research that address previous methodological problems. Genetics and bioinformatics both rely on collaborative investigation. Genetic research requires phenotypes and researcher expertise in mapping or sequencing studies, while bioinformatics depends on high-quality databases as well as access and integration tools and expertise (Schmidt 2003). Collaborative designs are therefore a common challenge in genetics and bioinformatics.
With bioinformatics, problem identification in research focuses on testing hypotheses, such as gene identification in cancer research, or on linking new and previous data, such as data on mutations. Data gathering proceeds through data mining, by searching various databases over the Internet and other networks. Data are then integrated and organised according to the biological problem studied, for example by combining data on a protein's structure with its properties and functions held in various databases. Data analysis considers breadth and depth.
Breadth refers to analytical processes that compare genes using algorithms, while depth pertains to the determination of the protein encoded by a particular gene. Tools such as modelling and simulation support data interpretation and presentation (Luscombe et al. 2001). Gene expression research requires the efficient analysis of microarrays and populations. Bioinformatics provided two-colour microarrays as a more effective analytical design by covering twice the number of distant pair design profiles and population (Fu & Jansen 2006). In cancer epidemiology research, bioinformatics supported the integration of genetic susceptibility factors to create innovative study designs (Malats & Castano-Vinyals 2007).

Non-Coding RNA Bioinformatics

Bioinformatics plays an important role in non-coding RNA research. Hiro et al. (2006) explained that non-coding RNA genes have weak statistical signals and that bioinformatics addresses this limitation by providing systems for searching and predicting non-coding RNA. Huang et al. (2008) noted that bioinformatics can support non-coding RNA search through CYK-type and covariance-model programs, but that these programs need further improvement to accommodate arbitrary RNA structures.

Bioinformatics and Cancer Genetics

Bioinformatics supports genetic research on cancer. Katoh and Katoh (2006) reported that bioinformatics supports cancer research through omics data functions that build knowledge on genetic biomarkers linked to cancer, including ‘predisposition, diagnostic, prognostic, and therapeutic markers’, using data and text mining programs.
Barnes and Gray (2007) described the contributions of bioinformatics to cancer genetics as including cancer genomes, cancer genetics design, cancer gene mutations, and other breakthroughs.

Bioinformatics and Gene Identification

Bioinformatics contributes significantly to the identification of disease genes by allowing the management of large volumes of data, including DNA sequences and microarray data, in identifying genetic functions that cause disease.
Chen and Chen (2008) explained the role of bioinformatics in linking genes and disease outcomes through gene identification algorithms incorporated into analytical software. Algorithms direct calculation and data processing by providing a sequence of instructions for handling data. Tu et al. (2006) discussed the use of a network-based stochastic algorithm for inferring disease-causing genes and identifying regulatory pathways.

Bioinformatics in Single Gene Disorders and Mutations

Bioinformatics is also useful in studying single gene disorders and mutations to ascertain the genetic causes of monogenic diseases. Barnes and Gray (2007) explained that the data storage and processing solutions of bioinformatics led to outcomes such as the genome-wide map of monogenic diseases, an understanding of the nature of mutations in single gene disorders, and the implications of epigenetics for Mendelian traits. Cooper, Stenson and Chuzhanova (2006) explained that bioinformatics supports studies of single gene disorders and mutations via the Human Gene Mutation Database (HGMD), which contains a wide range of information on nuclear genes, particularly germ-line mutations associated with inherited diseases.
In 2005, the database contained 53,000 lesions together with data on DNA sequences, splice junctions, and polymorphisms. The database provides core data with which new data are combined to build knowledge on monogenic diseases.

Bioinformatics and Genetic Data Mining

Bioinformatics supports data mining and analysis in genetics research. Wang et al. (2005) explained that bioinformatics supports genetic data mining and analysis through various databases covering areas of genetic study, such as the gene mutation databases, as well as software or programs operating through context-based algorithms.
These resources enable analytical processes that integrate genetic data, including clustering methods for microarrays, comparison of DNA structures, identification of sequence characteristics, discovery of disease markers, and indexing of pathways and sequences, among other analytical outcomes. Geneticists can select which algorithmic tool in bioinformatics to use in managing and analysing the particular data involved in a study or in integrating data from various databases. Since bioinformatics continues to evolve, new analytical tools will emerge to address current problems in data mining and analysis for genetic research.
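To make the idea of a clustering method for microarray data more concrete, the following is a minimal Python sketch rather than a method taken from any of the cited sources. It assumes that each gene is described by a short list of expression values, that similarity is measured by Pearson correlation, and that genes are grouped by single linkage at a fixed threshold. The gene names, expression values, the 0.9 threshold and the function names `pearson` and `cluster_profiles` are all hypothetical and chosen only for illustration.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length expression profiles.

    Assumes neither profile is constant (a constant profile has zero
    variance and would cause a division by zero).
    """
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def cluster_profiles(profiles, threshold=0.9):
    """Single-linkage grouping at a fixed threshold: two genes share a
    cluster when a chain of pairwise correlations >= threshold links them."""
    cluster_of = {gene: {gene} for gene in profiles}
    for g1, g2 in combinations(profiles, 2):
        if pearson(profiles[g1], profiles[g2]) >= threshold:
            merged = cluster_of[g1] | cluster_of[g2]
            for gene in merged:
                cluster_of[gene] = merged
    unique = {frozenset(members) for members in cluster_of.values()}
    return sorted(sorted(members) for members in unique)


# Hypothetical expression values across four conditions (illustrative only).
profiles = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.1, 4.0, 6.2, 8.1],
    "geneC": [4.0, 3.0, 2.0, 1.0],
    "geneD": [8.0, 6.1, 3.9, 2.0],
}
clusters = cluster_profiles(profiles)
print(clusters)
```

In this toy data set the two genes whose expression rises across conditions fall into one group and the two whose expression falls form another. Real microarray analyses involve thousands of genes, normalisation steps and more robust clustering methods, so the sketch shows only the underlying principle.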
The National Center for Biotechnology Information (2004) identified two breakthrough contributions of bioinformatics to genetic data mining and analysis. One is protein modelling. DNA sequences encode proteins with particular functions, but the structures of many proteins have not been determined experimentally by X-ray crystallography or nuclear magnetic resonance spectroscopy, which makes it difficult to study the proteins they encode. Bioinformatics allows protein modelling to visualise proteins in three dimensions and, by using known structures as templates, enables the prediction of the structures of similar proteins.
The other is genomic mapping for the management of sequence information, which is painstaking when done manually. Bioinformatics provides maps that guide geneticists in pinpointing the precise location of gene sequences.

Bioinformatics in Improving Genetic Research

Schmidt (2003) explained that bioinformatics improves genetic research by providing evolving and improving means of managing the data explosion. Research advancements have resulted in large volumes of data on the nucleotide sequences that make up DNA and the amino acid sequences that make up proteins, stored in databases around the world.
However, manually extracting data for use in subsequent studies and analysing vast amounts of data takes time, and the manual study of outputs has become tedious. Bioinformatics improves genetic research by allowing geneticists to search databases for gene composition, proteins and mutations, as well as to model the chemical and structural composition of living cells, in support of health-related work such as cancer and drug studies. A researcher cannot be familiar with every known genetic interaction, so bioinformatics becomes necessary for managing and interpreting systems-level information.

Conclusion

Bioinformatics has transformed the field of biology, particularly genetics, by providing approaches, processes and tools that ease the difficulties of managing and analysing the large volumes of data that have accumulated over decades of genetic research. Bioinformatics has changed genetics from a primarily laboratory-based science into an information and practice science. Furthermore, the continuously evolving field of bioinformatics holds the potential to facilitate further advancements in genetics through the development of new software, algorithms, and processes useful in filling research gaps in genetics.
However, geneticists play an important role in directing and enhancing the role of bioinformatics in genetics. This requires geneticists to understand bioinformatics to develop data processing and analytical tools addressing the different needs in various areas of genetic research.

References

Barnes, M. R. & Gray, I. C. eds., 2007. Bioinformatics for geneticists. 2nd ed. Hoboken, NJ: Wiley Publishing, Inc.
Chen, Y. P. & Chen, F., 2008. Using bioinformatics techniques for gene identification in drug discovery and development. Current Drug Metabolism, 9(6), pp. 567-573.
Cheung, K. H., Nadkarni, P., Silverstein, S., Kidd, J. R., Pakstis, A. J., Miller, P. & Kidd, K. K., 1996. PhenoDB: an integrated client/server database for linkage and population genetics. Computers and Biomedical Research, 29(4), pp. 327-337.
Claverie, J., 2007. Bioinformatics for dummies. 2nd ed. Hoboken, NJ: Wiley Publishing, Inc.
Cooper, D. N., Stenson, P. D. & Chuzhanova, N. A., 2006. The Human Gene Mutation Database (HGMD) and its exploitation in the study of mutational mechanisms. Current Protocols in Bioinformatics. Unit 1.13. Available at: http://www.ncbi.nlm.nih.gov/pubmed/18428754?dopt=Abstract [Accessed 14 October 2008].
Fu, J. & Jansen, R. C., 2006. Optimal design and analysis of genetic studies on gene expression. Genetics, 173(3), pp. 1993-1999.
Hiro, K., Akio, K. & Masaru, T., 2006. Bioinformatics analyses of non-coding RNA. Protein, Nucleic Acid and Enzyme, 51(16), pp. 2420-2424.
Huang, Z., Wu, Y., Robertson, J., Feng, L., Malmberg, R. & Cai, L., 2008. Fast and accurate search for non-coding RNA pseudoknot structures in genomes. Bioinformatics, 24(20), pp. 2281-2287.
Jones, P. B. & Phillip, B. C., 2000. The commercialization of bioinformatics. Electronic Journal of Biotechnology, 3(2). Available at: http://www.scielo.cl/scielo.php?pid=S0717-34582000000200002&script=sci_arttext [Accessed 14 October 2008].
Katoh, M. & Katoh, M., 2006. Bioinformatics for cancer management in the post-genome era. Technology in Cancer Research & Treatment, 5(2), pp. 169-175.
Luscombe, N. M., Greenbaum, D. & Gerstein, M., 2001. What is bioinformatics? A proposed definition and overview of the field. Methods of Information in Medicine, 40, pp. 346–58.
Malats, N. & Castano-Vinyals, G., 2007. Cancer epidemiology: study designs and data analysis. Clinical and Translational Oncology, 9(5), pp. 290-297.
National Center for Biotechnology Information, 2004. Bioinformatics. Available at: http://www.ncbi.nlm.nih.gov/About/primer/bioinformatics.html [Accessed 14 October 2008].
Schmidt, C. W., 2003. Data explosion: bringing order to chaos with bioinformatics. Environmental Health Perspectives, 111(6), pp. 340-345.
Tu, Z., Wang, L., Arbeitman, M., Chen, T. & Sun, F., 2006. An integrative approach for causal gene identification and gene regulatory pathway inference. Bioinformatics, 22(14), pp. 489-496.
Wang, J. T. L., Zaki, M. J., Toivonen, H. T. T. & Shasha, D. E. eds., 2005. Data mining in bioinformatics. London: Springer-Verlag.
Weaver, R., Helms, C., Mishra, S. K. & Donis-Keller, H., 1992. Software for analysis and manipulation of genetic linkage data. American Journal of Human Genetics, 50(6), pp. 1267–1274.
World Bioinformatics Market, 2008. ReportLinker.com. Available at: http://www.reportlinker.com/p092468/World-Bioinformatics–Market.html [Accessed 14 October 2008].