Biological data

Last updated October 14, 2023

Biological data refers to a compound or information derived from living organisms and their products. A medicinal compound made from living organisms, such as a serum or a vaccine, could be characterized as biological data. Biological data is highly complex compared with other forms of data. It takes many forms, including text, sequence data, protein structures, genomic data, amino acids, and links, among others.

Biological Data and Bioinformatics
Types of Biological Data
Biomedical Databases
Bio-hacking and Privacy Threats
Bio-hacking
Genetic Samples as Personal Data
Applications of Deep Learning to Biological Data
Challenges to Data Mining in Biomedical Informatics
Complexity
Database Errors and Abuses
Biomedical Data Sharing
Attitudes Towards Data Sharing
Challenges to Data Sharing
References

Biological Data and Bioinformatics

Biological data is closely tied to bioinformatics, a recent discipline that addresses the need to analyze and interpret vast amounts of genomic data.

In the past few decades, leaps in genomic research have produced massive amounts of biological data. As a result, bioinformatics emerged at the convergence of genomics, biotechnology, and information technology, with a focus on biological data.

Biological data has also been difficult to define, as bioinformatics is a wide-ranging field. Further, the question of what constitutes a living organism has been contentious, as "alive" is a nebulous term that spans molecular evolution, biological modeling, biophysics, and systems biology. Over the past decade, bioinformatics and the analysis of biological data have thrived as a result of leaps in the technology required to manage and interpret data. The field continues to grow as society has become more focused on the acquisition, transfer, and exploitation of bioinformatics and biological data.

Types of Biological Data

Biological data can be extracted for use in the domains of omics, bio-imaging, and medical imaging. Life scientists value biological data because it provides molecular details about living organisms. DNA sequencing, gene expression (GE) analysis, bio-imaging, neuro-imaging, and brain-machine interfaces all use biological data to model biological systems with high dimensionality.^[1]

Moreover, raw biological sequence data usually refers to DNA, RNA, and amino acids.^[1]

Biological data can also be described as data on biological entities.^[2] For instance, sequences, graphs, geometric information, scalar and vector fields, patterns, constraints, images, and spatial information may all be characterized as biological data, as they describe features of biological beings. In many instances, biological data are associated with several of these categories. For instance, as described in the National Research Council's report Catalyzing Inquiry at the Interface of Computing and Biology, a protein structure may be associated with a one-dimensional sequence, a two-dimensional image, a three-dimensional structure, and so on.^[2]
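The idea that one biological entity can belong to several data categories at once can be sketched as a small data structure. This is only an illustration: the class name `ProteinRecord`, its fields, and the example values are hypothetical and do not come from the cited report.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class ProteinRecord:
    """Hypothetical container linking several views of one protein."""
    name: str
    sequence: str                     # one-dimensional: amino-acid letters
    image_path: Optional[str] = None  # two-dimensional: path to an image file
    # three-dimensional: (x, y, z) atom coordinates
    coordinates: List[Tuple[float, float, float]] = field(default_factory=list)

    def categories(self) -> List[str]:
        """Return the data categories for which this record holds content."""
        present = ["sequence"] if self.sequence else []
        if self.image_path:
            present.append("image")
        if self.coordinates:
            present.append("structure")
        return present


# Made-up example values, for illustration only.
record = ProteinRecord(
    name="example_peptide",
    sequence="MKTAYIAK",
    coordinates=[(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)],
)
print(record.categories())
```

Keeping the views together in one record makes it explicit which categories are available for a given entity before any analysis is attempted.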

CATH - Protein Structure Classification Database

Biomedical Databases

The term biomedical databases often refers to databases of electronic health records (EHRs), genomic data in decentralized federal database systems, and biological data, including genomic data, collected from large-scale clinical studies.^[3]^[4]

Bio-hacking and Privacy Threats

Bio-hacking

Bio-computing attacks have become more common as recent studies have shown that common tools may allow an attacker to synthesize biological information that can be used to hijack information from DNA analyses.^[5] The threat of biohacking has become more apparent as DNA analysis becomes more common in fields such as forensic science, clinical research, and genomics.

Biohacking can be carried out by synthesizing malicious DNA and inserting it into biological samples. Researchers have described scenarios that demonstrate the threat, such as a hacker reaching a biological sample by hiding malicious DNA on common surfaces, such as lab coats, benches, or rubber gloves, which would then contaminate the genetic data.^[5]

However, the threat of biohacking may be mitigated with techniques similar to those used to prevent conventional injection attacks. Clinicians and researchers may mitigate a bio-hack by extracting genetic information from biological samples and comparing the samples to identify unknown material. Studies have shown that comparing genetic information with biological samples to identify bio-hacking code has been up to 95% effective in detecting malicious DNA inserts.^[5]
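The general idea of comparing a sample against expected genetic material can be illustrated with a simple k-mer overlap score. This is a minimal sketch under stated assumptions, not the method of the cited study: the Jaccard measure, the k-mer length, the threshold, and the toy sequences are all chosen here purely for illustration.

```python
def kmers(seq: str, k: int = 4) -> set:
    """Return the set of all length-k substrings of a sequence."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def jaccard(a: set, b: set) -> float:
    """Jaccard similarity: size of the intersection over size of the union."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def flag_unexpected(sample: str, reference: str,
                    k: int = 4, threshold: float = 0.8) -> bool:
    """Flag a sample whose k-mer content differs too much from the reference.

    The threshold is an arbitrary illustrative value, not a validated cutoff.
    """
    return jaccard(kmers(sample, k), kmers(reference, k)) < threshold


# Toy sequences, invented for this example.
reference = "ACGTACGTACGTACGTACGT"
clean = reference
tampered = reference + "TTTTTTGGGGGGCCCCCC"  # stands in for an inserted fragment

print(flag_unexpected(clean, reference))
print(flag_unexpected(tampered, reference))
```

A real screening pipeline would need to account for sequencing error and natural variation, which this sketch ignores.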

Genetic Samples as Personal Data

Privacy concerns in genomic research have arisen around the question of whether genomic samples contain personal data or should be regarded as physical matter.^[6] Moreover, concerns arise because some countries recognize genomic data as personal data (and apply data protection rules), while other countries regard the samples as physical matter and do not apply the same data protection laws to genomic samples. The General Data Protection Regulation (GDPR), applicable since May 2018, has been cited as a potential legal instrument that may better enforce privacy regulations in bio-banking and genomic research.^[6]

However, ambiguity surrounding the definition of "personal data" in the text of the GDPR, especially regarding biological data, has led to doubts about whether the regulation will be enforced for genetic samples. Article 4(1) defines personal data as "any information relating to an identified or identifiable natural person ('data subject')".^[7]

Applications of Deep Learning to Biological Data

As a result of rapid advances in data science and computational power, life scientists have been able to apply data-intensive machine learning methods to biological data, such as deep learning (DL), reinforcement learning (RL), and their combination (deep RL). These methods, alongside increases in data storage and computing, have allowed life scientists to mine biological data and analyze data sets that were previously too large or complex. DL and RL have been used in the field of omics research^[1] (which includes genomics, proteomics, and metabolomics). Typically, raw biological sequence data (such as DNA, RNA, and amino acids) is extracted and used to analyze features, functions, structures, and molecular dynamics. From that point onwards, different analyses may be performed, such as GE profiling, splicing junction prediction, and protein-protein interaction evaluation.^[1]
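Before raw sequence data can be fed to a learning model, it is usually converted to numbers. One widely used option is one-hot encoding, sketched below for DNA. The article does not prescribe a specific encoding, so this choice, the function name `one_hot`, and the handling of unknown symbols are assumptions made for illustration.

```python
from typing import List

ALPHABET = "ACGT"  # DNA bases, in the column order used for the encoding


def one_hot(seq: str) -> List[List[int]]:
    """Encode a DNA string as one row per base, with a 1 in that base's column.

    Symbols outside ALPHABET (for example the ambiguity code N) are encoded
    as an all-zero row; this is one possible convention among several.
    """
    index = {base: i for i, base in enumerate(ALPHABET)}
    encoded = []
    for base in seq.upper():
        row = [0] * len(ALPHABET)
        if base in index:
            row[index[base]] = 1
        encoded.append(row)
    return encoded


matrix = one_hot("ACGN")
for row in matrix:
    print(row)
```

The resulting matrix has one row per position and one column per base, which is a shape that convolutional and recurrent architectures can consume directly.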

Reinforcement learning, a term stemming from behavioral psychology, is a method of problem solving through trial and error. In the field of omics, reinforcement learning has been applied to biological data to predict bacterial genomes.^[8]

Other studies have shown that reinforcement learning can be used to accurately predict biological sequence annotation.^[9]

Deep learning (DL) architectures are also useful for training models on biological data. For instance, DL architectures that operate at the pixel level of biological images have been used to identify mitosis in histological images of the breast. DL architectures have also been used to identify nuclei in images of breast cancer cells.^[10]

Challenges to Data Mining in Biomedical Informatics

Complexity

The primary problem facing biomedical data models has typically been complexity, as life scientists in clinical settings and biomedical research face the possibility of information overload. However, information overload has often been a debated phenomenon in medical fields.^[11] Computational advances have allowed separate communities to form under different philosophies. For instance, data mining and machine learning researchers search for relevant patterns in biological data with architectures that do not rely on human intervention. However, there are risks of modeling artifacts when human involvement, such as end-user comprehension and control, is lessened.^[12]

Researchers have pointed out that with increasing health care costs and tremendous amounts of underutilized data, health information technologies may be the key to improving the efficiency and quality of healthcare.^[11]

Database Errors and Abuses

Electronic health records (EHR) can contain genomic data from millions of patients, and the creation of these databases has resulted in both praise and concern.^[4]

Legal scholars have pointed towards three primary concerns for increasing litigation pertaining to biomedical databases. First, data contained in biomedical databases may be incorrect or incomplete. Second, systemic biases, which may arise from researcher biases or the nature of the biological data, may threaten the validity of research results. Third, the presence of data mining in biological databases can make it easier for individuals with political, social, or economic agendas to manipulate research findings to sway public opinion.^[13]^[4]

An example of database misuse occurred in 2009 when the Journal of Psychiatric Research published a study that associated abortion with psychiatric disorders.^[14] The purpose of the study was to analyze associations between abortion history and psychiatric disorders, such as anxiety disorders (including panic disorder, PTSD, and agoraphobia) alongside substance abuse disorders and mood disorders.

However, the study was discredited in 2012 when scientists scrutinized its methodology and found it severely faulty.^[15] The researchers had used "national data sets with reproductive history and mental health variables"^[14] to produce their findings. However, they had failed to compare women who had unplanned pregnancies and had abortions with women who did not have abortions, while focusing on psychiatric problems that occurred after the terminated pregnancies. As a result, the findings, which appeared to carry scientific credibility, led several states to enact legislation^[16] requiring women to seek counseling before abortions because of the potential for long-term mental health consequences.

Another article, published in the New York Times, demonstrated how Electronic Health Records (EHR) systems could be manipulated by doctors to exaggerate the amount of care they provided for purposes of Medicare reimbursement.^[17]^[4]

Biomedical Data Sharing

Sharing biomedical data has been touted as an effective way to enhance research reproducibility and scientific discovery.^[13]^[18]

While researchers struggle with technological issues in sharing data, social issues are also a barrier to sharing biological data. For instance, clinicians and researchers face unique challenges in sharing biological or health data within their medical communities, such as privacy concerns and patient privacy laws such as HIPAA.^[19]

Attitudes Towards Data Sharing

According to a 2015 study^[19] of the attitudes and practices of clinicians and scientific research staff, a majority of respondents reported that data sharing was important to their work but indicated that their expertise in the subject was low. The survey population consisted of clinical and basic research scientists in the Intramural Research Program at the National Institutes of Health; of the 190 respondents, 135 identified themselves as clinical or basic research scientists. The study also found that sharing data directly with other clinicians was a common practice among respondents, but that they had little experience uploading data to a repository.

Within the field of biomedical research, data sharing has been promoted^[20] as an important way for researchers to share and reuse data in order to fully capture the benefits of personalized and precision medicine.^[19]

Challenges to Data Sharing

Data sharing in healthcare has remained a challenge for several reasons. Despite research advances, many healthcare organizations remain reluctant or unwilling to release medical data on account of privacy laws such as the Health Insurance Portability and Accountability Act (HIPAA). Moreover, sharing biological data between institutions requires protecting the confidentiality of data that may span several organizations. Handling syntactic and semantic heterogeneity in the data while meeting diverse privacy requirements poses further barriers to data sharing.^[21]
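The combination of differing field names, differing units, and identifier protection can be made concrete with a small sketch. Everything below is hypothetical: the site names, field mappings, and records are invented, and salted hashing is shown only as a simple pseudonymization step, not as a technique that by itself satisfies HIPAA or any other privacy rule.

```python
import hashlib
from typing import Any, Dict

# Hypothetical mappings from each site's field names onto a shared schema.
FIELD_MAPS = {
    "site_a": {"pid": "patient_id", "wt_kg": "weight_kg", "dx": "diagnosis"},
    "site_b": {"PatientID": "patient_id", "weight_lb": "weight_lb",
               "Diagnosis": "diagnosis"},
}
LB_TO_KG = 0.45359237  # exact definition of the international pound


def harmonize(record: Dict[str, Any], site: str, salt: str) -> Dict[str, Any]:
    """Map one site's record onto the shared schema and pseudonymize the ID."""
    mapping = FIELD_MAPS[site]
    # Syntactic step: rename known fields, drop anything not in the mapping.
    out = {mapping[key]: value for key, value in record.items() if key in mapping}
    # Semantic step: convert units and normalize the diagnosis label.
    if "weight_lb" in out:
        out["weight_kg"] = round(out.pop("weight_lb") * LB_TO_KG, 1)
    out["diagnosis"] = str(out["diagnosis"]).strip().lower()
    # Privacy step: replace the identifier with a truncated salted hash.
    digest = hashlib.sha256((salt + str(out["patient_id"])).encode("utf-8"))
    out["patient_id"] = digest.hexdigest()[:16]
    return out


# Invented records describing the same patient at two institutions.
a = harmonize({"pid": 17, "wt_kg": 70.0, "dx": " Asthma "}, "site_a", "demo-salt")
b = harmonize({"PatientID": "17", "weight_lb": 154.3, "Diagnosis": "ASTHMA"},
              "site_b", "demo-salt")
print(a == b)
```

After harmonization the two records can be compared or merged, although real systems typically rely on controlled vocabularies and formal de-identification procedures rather than ad hoc mappings like these.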

References

[1] Mahmud, Mufti; Kaiser, Mohammed Shamim; Hussain, Amir; Vassanelli, Stefano (June 2018). "Applications of Deep Learning and Reinforcement Learning to Biological Data". IEEE Transactions on Neural Networks and Learning Systems. 29 (6): 2063–2079. doi:10.1109/tnnls.2018.2790388. hdl:1893/26814. ISSN 2162-237X. PMID 29771663. S2CID 9823884.
[2] Wooley, John C.; Lin, Herbert S.; Biology, National Research Council (US) Committee on Frontiers at the Interface of Computing and (2005). On the Nature of Biological Data. National Academies Press (US).
[3] Nadkarni, P. M.; Brandt, C.; Frawley, S.; Sayward, F. G.; Einbinder, R.; Zelterman, D.; Schacter, L.; Miller, P. L. (1998-03-01). "Managing Attribute-Value Clinical Trials Data Using the ACT/DB Client-Server Database System". Journal of the American Medical Informatics Association. 5 (2): 139–151. doi:10.1136/jamia.1998.0050139. ISSN 1067-5027. PMC 61285. PMID 9524347.
[4] Hoffman, Sharona; Podgurski, Andy (2013). "The use and misuse of biomedical data: is bigger really better?". American Journal of Law & Medicine. 39 (4): 497–538. doi:10.1177/009885881303900401. ISSN 0098-8588. PMID 24494442. S2CID 35371353.
[5] Islam, Mohd Siblee; Ivanov, S.; Robson, E.; Dooley-Cullinane, T.; Coffey, L.; Doolin, K.; Balasubramaniam, S. (2019). "Genetic similarity of biological samples to counter bio-hacking of DNA-sequencing functionality". Scientific Reports. 9 (1): 8684. Bibcode:2019NatSR...9.8684I. doi:10.1038/s41598-019-44995-6. PMC 6581904. PMID 31213619. S2CID 190652460.
[6] Hallinan, Dara; De Hert, Paul (2016), Mittelstadt, Brent Daniel; Floridi, Luciano (eds.), "Many Have It Wrong – Samples do Contain Personal Data: The Data Protection Regulation as a Superior Framework to Protect Donor Interests in Biobanking and Genomic Research", The Ethics of Biomedical Big Data, Law, Governance and Technology Series, Cham: Springer International Publishing, vol. 29, pp. 119–137, doi:10.1007/978-3-319-33525-4_6, ISBN 978-3-319-33525-4, retrieved 2020-12-09
[7] "Statewatch.org" (PDF). StateWatch.org. Retrieved 3 July 2015.
[8] Chuang, Li-Yeh; Tsai, Jui-Hung; Yang, Cheng-Hong (July 2010). "Binary particle swarm optimization for operon prediction". Nucleic Acids Research. 38 (12): e128. doi:10.1093/nar/gkq204. ISSN 0305-1048. PMC 2896535. PMID 20385582.
[9] Ralha, C. G.; Schneider, H. W.; Walter, M. E. M. T.; Bazzan, A. L. (October 2010). "Reinforcement Learning Method for BioAgents". 2010 Eleventh Brazilian Symposium on Neural Networks. pp. 109–114. doi:10.1109/SBRN.2010.27. ISBN 978-1-4244-8391-4. S2CID 14685651.
[10] Xu, Jun; Xiang, Lei; Liu, Qingshan; Gilmore, Hannah; Wu, Jianzhong; Tang, Jinghai; Madabhushi, Anant (January 2016). "Stacked Sparse Autoencoder (SSAE) for Nuclei Detection on Breast Cancer Histopathology Images". IEEE Transactions on Medical Imaging. 35 (1): 119–130. doi:10.1109/TMI.2015.2458702. ISSN 0278-0062. PMC 4729702. PMID 26208307.
[11] Holzinger, Andreas; Jurisica, Igor (2014), Holzinger, Andreas; Jurisica, Igor (eds.), "Knowledge Discovery and Data Mining in Biomedical Informatics: The Future is in Integrative, Interactive Machine Learning Solutions", Interactive Knowledge Discovery and Data Mining in Biomedical Informatics: State-of-the-Art and Future Challenges, Lecture Notes in Computer Science, Berlin, Heidelberg: Springer, vol. 8401, pp. 1–18, doi:10.1007/978-3-662-43968-5_1, ISBN 978-3-662-43968-5, retrieved 2020-12-09
[12] Shneiderman, Ben (March 2002). "Inventing Discovery Tools: Combining Information Visualization with Data Mining". Information Visualization. 1 (1): 5–12. doi:10.1057/palgrave.ivs.9500006. hdl:1903/6484. ISSN 1473-8716. S2CID 208272047.
[13] Mittelstadt, Brent Daniel; Floridi, Luciano (April 2016). "The Ethics of Big Data: Current and Foreseeable Issues in Biomedical Contexts". Science and Engineering Ethics. 22 (2): 303–341. doi:10.1007/s11948-015-9652-2. ISSN 1471-5546. PMID 26002496. S2CID 23142795.
[14] Coleman, Priscilla K.; Coyle, Catherine T.; Shuping, Martha; Rue, Vincent M. (May 2009). "Induced abortion and anxiety, mood, and substance abuse disorders: isolating the effects of abortion in the national comorbidity survey". Journal of Psychiatric Research. 43 (8): 770–776. doi:10.1016/j.jpsychires.2008.10.009. ISSN 1879-1379. PMID 19046750.
[15] Kessler, Ronald C.; Schatzberg, Alan F. (March 2012). "Commentary on Abortion Studies of Steinberg and Finer (Social Science & Medicine 2011; 72:72–82) and Coleman (Journal of Psychiatric Research 2009;43:770–6 & Journal of Psychiatric Research 2011;45:1133–4)". Journal of Psychiatric Research. 46 (3): 410–411. doi:10.1016/j.jpsychires.2012.01.021.
[16] "Counseling and Waiting Periods for Abortion". Guttmacher Institute. 2016-03-14. Retrieved 2020-12-09.
[17] Abelson, Reed; Creswell, Julie; Palmer, Griff (2012-09-22). "Medicare Bills Rise as Records Turn Electronic (Published 2012)". The New York Times. ISSN 0362-4331. Retrieved 2020-12-09.
[18] Kalkman, Shona; Mostert, Menno; Gerlinger, Christoph; van Delden, Johannes J. M.; van Thiel, Ghislaine J. M. W. (March 28, 2019). "Responsible data sharing in international health research: a systematic review of principles and norms". BMC Medical Ethics. 20 (1): 21. doi:10.1186/s12910-019-0359-9. ISSN 1472-6939. PMC 6437875. PMID 30922290.
[19] Federer, Lisa M.; Lu, Ya-Ling; Joubert, Douglas J.; Welsh, Judith; Brandys, Barbara (2015-06-24). Kanungo, Jyotshna (ed.). "Biomedical Data Sharing and Reuse: Attitudes and Practices of Clinical and Scientific Research Staff". PLOS ONE. 10 (6): e0129506. Bibcode:2015PLoSO..1029506F. doi:10.1371/journal.pone.0129506. ISSN 1932-6203. PMC 4481309. PMID 26107811.
[20] Shneiderman, Ben (2016-07-21). "Inventing Discovery Tools: Combining Information Visualization with Data Mining". Information Visualization. 1: 5–12. doi:10.1057/palgrave.ivs.9500006. hdl:1903/6484. S2CID 208272047.
[21] Wimmer, Hayden; Yoon, Victoria Y.; Sugumaran, Vijayan (2016-08-01). "A multi-agent system to support evidence based medicine and clinical decision making via data sharing and data privacy". Decision Support Systems. 88: 51–66. doi:10.1016/j.dss.2016.05.008. ISSN 0167-9236.

This page is based on this Wikipedia article
Text is available under the CC BY-SA 4.0 license; additional terms may apply.
Images, videos and audio are available under their respective licenses.
