COVID19 and other coronaviruses: how bioinformatics can help us understand emerging pathogens

Elia Brodsky - www.ebrodsky.site
Published in Age of Awareness
4 min read · Mar 22, 2020

Viruses of the coronavirus family have been a cause of mild “common cold” symptoms for many years. An outbreak like the recent one had been hypothesized before, but it was never labeled as the imminent threat that it poses to millions of people today as it wreaks havoc across the globe.

Novel coronavirus strain (SARS-CoV-2)

Recent advances in bioinformatics are allowing us to collect and analyze data from an epidemic outbreak to understand the biology behind this coronavirus: how it mutated to adapt to humans, how it enters the cell and how viral genomes change as the virus spreads across the globe.

These recent advances include computational approaches to big data, efficient methods of data storage, analysis and visualization, user-friendly tools that can be used by non-engineers and a growing appreciation for the role of data in biological research.

Genomics of SARS-CoV-2: virus origin, cell entry and replication

Our team has been working to make sense of the recently published research data, understand the important areas and topics to consider and share this information with others by preparing teaching materials, curated datasets and efficient tools that anyone can use. The underlying research data are stored in various databases, including NCBI and several others, but to find, download and use them for analysis, many people first have to learn to navigate these websites and be prepared to deal with this kind of data.
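As a rough illustration of what "find, download and use" can look like in practice, here is a minimal Python sketch that builds a request to NCBI's public E-utilities `efetch` endpoint and parses the FASTA text it returns. The helper names (`efetch_url`, `fetch_fasta`, `parse_fasta`) and the toy FASTA string are assumptions made for this example; they are not part of any tool mentioned in this article.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# Base URL of the NCBI E-utilities efetch service.
EFETCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def efetch_url(accession: str) -> str:
    """Build a URL that returns one nucleotide record as FASTA text."""
    params = {"db": "nuccore", "id": accession, "rettype": "fasta", "retmode": "text"}
    return f"{EFETCH}?{urlencode(params)}"


def fetch_fasta(accession: str, timeout: float = 30.0) -> str:
    """Download the FASTA text for an accession (needs network access)."""
    with urlopen(efetch_url(accession), timeout=timeout) as response:
        return response.read().decode("utf-8")


def parse_fasta(text: str) -> dict[str, str]:
    """Parse FASTA text into {record id: sequence}, joining wrapped lines."""
    records: dict[str, list[str]] = {}
    current = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # The record id is the first word after '>'.
            current = line[1:].split()[0]
            records[current] = []
        elif current is not None:
            records[current].append(line.upper())
    return {name: "".join(parts) for name, parts in records.items()}


# Offline demonstration on a made-up FASTA string (not real viral data).
demo = ">toy_1 example\nACGT\nacgt\n>toy_2\nTTGA\n"
print(parse_fasta(demo))
```

With network access, `parse_fasta(fetch_fasta("NC_045512.2"))` should return the SARS-CoV-2 reference genome, NC_045512.2 being its RefSeq accession. For anything beyond a handful of requests, NCBI's usage guidelines on request rates and API keys apply.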

For example, we can use this data to learn about the origins of the novel coronavirus and understand why it is so much more infectious and dangerous than previously detected coronavirus strains. The relevant changes in the viral genome become more evident if we not only study the full genome, but also focus our attention on specific parts of the genome where changes lead to variation in human receptor binding and facilitate efficient cell entry.
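To make the idea of "full genome versus specific region" concrete, the sketch below compares two already-aligned sequences and reports percent identity for the whole alignment and for a chosen slice. The two short sequences are invented for illustration and are not real viral data; the function names and the region coordinates are likewise assumptions.

```python
def percent_identity(a: str, b: str) -> float:
    """Percent of aligned positions where both sequences carry the same residue.

    Columns where either sequence has a gap ('-') are skipped.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for x, y in zip(a, b):
        if x == "-" or y == "-":
            continue
        compared += 1
        if x == y:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


def region_identity(a: str, b: str, start: int, end: int) -> float:
    """Percent identity restricted to alignment columns [start, end)."""
    return percent_identity(a[start:end], b[start:end])


# Toy aligned sequences (made up); the differences cluster in columns 9-13.
human_like = "ATGGCTAAGCTTGGACCA"
bat_like = "ATGGCTAAGGTCGAACCA"

print(f"whole alignment: {percent_identity(human_like, bat_like):.1f}%")
print(f"columns 9-13:    {region_identity(human_like, bat_like, 9, 14):.1f}%")
```

In this toy case the overall identity looks high, while the selected region is far more divergent. That is the pattern one would look for when asking whether changes concentrate in a receptor-binding region.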

This kind of information is important for vaccine design, because a vaccine has to train the immune system to recognize the virus. SARS-CoV-2, the virus that causes COVID-19, uses its Spike glycoprotein to attach to the human ACE2 receptor. The glycoprotein is covered in sugar molecules, which help the virus stay undetected by the immune system. However, human cell receptors are different from bat or pangolin receptors. In addition, different people might carry variants of the receptor surface, and viral genomes might mutate to become more or less efficient at cell entry and replication. Therefore, we might want to compare this kind of variation at the protein structure level and find the most conserved sequence to use in vaccine production.
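One simple way to look for a "most conserved sequence" is to score each column of a multiple sequence alignment by how many sequences share its most common residue, then slide a fixed-width window across those scores. The sketch below does this at the sequence level only (it ignores 3D structure), and the four short protein fragments are invented for illustration. Real analyses would normally use dedicated alignment and conservation tools.

```python
from collections import Counter


def column_conservation(alignment: list[str]) -> list[float]:
    """Fraction of sequences sharing the most common residue in each column."""
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all sequences must be aligned to the same length")
    scores = []
    for column in zip(*alignment):
        residue, count = Counter(column).most_common(1)[0]
        # A column dominated by gaps is treated as not conserved.
        scores.append(0.0 if residue == "-" else count / len(column))
    return scores


def most_conserved_window(alignment: list[str], width: int) -> tuple[int, float]:
    """Return (start column, mean conservation) of the best window of given width."""
    scores = column_conservation(alignment)
    if width < 1 or width > len(scores):
        raise ValueError("window width must be between 1 and the alignment length")
    starts = range(len(scores) - width + 1)
    # max() keeps the first window in case of ties.
    best = max(starts, key=lambda i: sum(scores[i:i + width]))
    return best, sum(scores[best:best + width]) / width


# Toy aligned protein fragments (made up, not real Spike sequences).
toy_alignment = [
    "MKTLLVAGCNYE",
    "MRTLLVAGCNFE",
    "MKSLLVAGCDYE",
    "MKTLIVAGCNYD",
]

start, score = most_conserved_window(toy_alignment, width=4)
print(start, score, toy_alignment[0][start:start + 4])
```

The window width is a free parameter here. For vaccine-related questions it would be chosen to match the length of the region of interest, which this sketch does not attempt to determine.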

Comparing bat RaTG13 and human SARS-CoV-2 spike glycoproteins

These are the questions everyone is asking. But the data includes much more: other proteins are known to be involved in viral replication, post-translational modification and evasion of the host immune response. These can be important for antiviral drug design. Also, emerging mutations can cause various changes to lung or other organ function, an important factor for accurate detection and treatment strategies. There are in fact too many questions for a single group or individual to ask. Therefore, it is important to collect and organize this kind of data and make it available and easy to analyze for a diverse range of people. That’s why we will be offering guided tutorials, online training sessions and organized dashboards for the bioinformatics of viral genomic sequences.

Workshop on genomics of COVID-19: https://edu.tbioinfo.com/en-us/covid-19-bioinformatics

However, coronaviruses in bats, pangolins and other wild animals are not the only possible source of pandemics that humanity will face in the near future. Flu, Ebola, enterovirus D68 and other diseases can quickly spread as the climate changes and natural habitats of wild animals are endangered by human activity. Bioinformatics won’t necessarily stop this process, but it can help us prepare and approach these challenges with deep insights and understanding. We will be offering a comprehensive training program for anyone (researchers, faculty, students or citizen scientists). To register, visit: https://edu.tbioinfo.com/bioinformatics-for-infectious-diseases-2020

Elia Brodsky - www.ebrodsky.site
Age of Awareness

Healthcare, Life Sciences, Data... In the past, startup co-founder @PineBiotech — big data, bioinformatics, healthcare