MY PROJECTS

Here are some of my projects in the field of Biotechnology and Biomedical research with a brief description of their scope. A good understanding of science in combination with application of advanced bioinformatics tools helped me to guide these projects to success.

In silico engineering using neural networks

Enzymes used in industry often need to be optimized for a specific pH and temperature. This is routinely done by generating and screening millions of mutants in the lab, a process called “random mutagenesis”. Random mutagenesis and the subsequent screening take months of intensive lab work and require a substantial budget.

I created a neural network that could perform in silico random mutagenesis and screening in just a few seconds! In some cases only ONE amino acid change was sufficient to create a desired enzyme de novo, demonstrating that machine learning helps research to be smarter, faster and more affordable.
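To give a feel for what in silico mutagenesis and screening looks like, here is a minimal sketch. It is not the neural network from this project: `toy_score` is a hypothetical stand-in for a trained model, the example sequence is made up, and the code enumerates every single amino acid substitution exhaustively instead of sampling mutants at random.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def single_point_mutants(sequence):
    """Yield (label, mutant) for every single amino acid substitution."""
    for pos, wild_type in enumerate(sequence):
        for aa in AMINO_ACIDS:
            if aa != wild_type:
                label = f"{wild_type}{pos + 1}{aa}"  # e.g. A4D
                yield label, sequence[:pos] + aa + sequence[pos + 1:]

def toy_score(sequence):
    """Hypothetical placeholder for a model prediction.

    Returns the fraction of charged residues; this is NOT a real
    predictor of pH or temperature behaviour.
    """
    return sum(sequence.count(aa) for aa in "DEKR") / len(sequence)

def screen(sequence, score_fn, top_n=5):
    """Score all single mutants and return the top_n as (score, label)."""
    scored = [(score_fn(mutant), label)
              for label, mutant in single_point_mutants(sequence)]
    # Highest score first; ties are broken alphabetically by label.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return scored[:top_n]

wild_type = "MKTAYIAKQR"  # made-up sequence for illustration
top_hits = screen(wild_type, toy_score)
for score, label in top_hits:
    print(f"{label}\t{score:.2f}")
```

In a real setting the scoring function would be replaced by the trained model, and the candidate list could also include multi-site mutants.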

Genome mining

Genome mining can be used to identify novel products of interest (e.g. antibiotics), elucidate biochemical pathways and explore the diversity of protein domain organization in microbes.

See my recent publications to learn more about these topics and existing bioinformatics tools and approaches developed for genome mining:

https://pubmed.ncbi.nlm.nih.gov/28222763/

https://pubmed.ncbi.nlm.nih.gov/32850729/

https://pubmed.ncbi.nlm.nih.gov/33414312/

Ancestral Sequence Reconstruction

All enzyme sequences known today have evolved to optimise affinity and activity for a specific substrate. However, what has evolved in nature is often far from optimised for what is needed in industrial applications.

By taking a step back from a directed-evolution approach and using the knowledge inherent in phylogenetic trees, I reconstructed several ancestral enzymes. These enzymes can degrade novel substrates and have higher thermostability and greater activity under industrial conditions.
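The idea of inferring an ancestor from a phylogenetic tree can be illustrated with a small sketch. This one uses Fitch parsimony on a rooted binary tree, which is a deliberately simple technique chosen for illustration; it is not necessarily the method used in this project, and practical reconstructions usually rely on maximum-likelihood tools. The tree, the sequence names and the aligned sequences are all made up.

```python
def fitch_states(node, column):
    """Bottom-up Fitch pass for one alignment column.

    A leaf is a sequence name (str); an internal node is a pair of
    child nodes. Returns the set of candidate residues at this node.
    """
    if isinstance(node, str):
        return {column[node]}
    left, right = (fitch_states(child, column) for child in node)
    common = left & right
    # Keep the shared residues if any, otherwise keep all candidates.
    return common if common else left | right

def reconstruct_root(tree, alignment):
    """Return one most-parsimonious sequence for the root of the tree."""
    length = len(next(iter(alignment.values())))
    root = []
    for i in range(length):
        column = {name: seq[i] for name, seq in alignment.items()}
        # Ambiguous positions are resolved alphabetically for determinism.
        root.append(min(fitch_states(tree, column)))
    return "".join(root)

# Hypothetical tree and gap-free alignment, for illustration only.
tree = (("enzA", "enzB"), ("enzC", "enzD"))
alignment = {
    "enzA": "MKVL",
    "enzB": "MKIL",
    "enzC": "MRVL",
    "enzD": "MRVF",
}
ancestor = reconstruct_root(tree, alignment)
print(ancestor)
```

Position 2 is ambiguous at the root (K or R are equally parsimonious); the sketch simply picks one, whereas likelihood-based methods would report a probability for each residue.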

Microbiome analysis

In this project I studied what happens to the plant microbiome when we treat our food with chemicals, i.e. fertilizers.

Using Ion Torrent next-generation sequencing for 16S rRNA-based bacterial community profiling, several important bacterial groups associated specifically with fertilized soils were identified. These bacterial groups serve as indicators to assess the sustainability of agricultural soil management and to monitor trends in soil conditions over time.
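As a rough illustration of how groups associated with a treatment can be flagged, the sketch below converts read counts to relative abundances and reports taxa whose mean abundance is at least twice as high in treated samples. The taxon names, counts and the fold-change threshold are placeholders, not data or settings from this project, and a real analysis would use a proper statistical test on replicated samples.

```python
def relative_abundance(counts):
    """Convert raw read counts for one sample into fractions."""
    total = sum(counts.values())
    return {taxon: n / total for taxon, n in counts.items()}

def mean_abundance(samples):
    """Average the relative abundance of each taxon across samples."""
    taxa = {taxon for sample in samples for taxon in sample}
    profiles = [relative_abundance(sample) for sample in samples]
    return {t: sum(p.get(t, 0.0) for p in profiles) / len(profiles)
            for t in taxa}

def enriched_taxa(treated, control, min_fold=2.0, pseudo=1e-6):
    """Return {taxon: fold change} for taxa enriched in treated samples."""
    treated_mean = mean_abundance(treated)
    control_mean = mean_abundance(control)
    hits = {}
    for taxon, value in treated_mean.items():
        # The pseudocount avoids division by zero for absent taxa.
        fold = (value + pseudo) / (control_mean.get(taxon, 0.0) + pseudo)
        if fold >= min_fold:
            hits[taxon] = fold
    return hits

# Placeholder counts, for illustration only.
fertilized = [
    {"taxonA": 60, "taxonB": 30, "taxonC": 10},
    {"taxonA": 50, "taxonB": 40, "taxonC": 10},
]
unfertilized = [
    {"taxonA": 20, "taxonB": 40, "taxonC": 40},
    {"taxonA": 20, "taxonB": 30, "taxonC": 50},
]
indicators = enriched_taxa(fertilized, unfertilized)
print(sorted(indicators))
```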

DRAGEN for diagnostics

A decade ago sequencing was still expensive and data throughput was low. With the latest generation of sequencers, prices have dropped dramatically and data output reaches an astounding 6 Tb.

Analyzing all this data in a timely manner is only possible by using new innovative algorithms and dedicated hardware such as a field-programmable gate array (FPGA).

DRAGEN is a recent example of an FPGA-based platform that performs SNV calling on the human genome in just 30 min, as opposed to 6 hours on an industry-standard HPC cluster. Together with 3 major University Medical Centers in the Netherlands, I am currently setting up DRAGEN for diagnostics (oncology, pathology). Check my blog posts if you would like to learn more about it.

Codon optimization with unsupervised machine learning

In this project, I applied unsupervised machine learning to optimize the production and secretion of amylases for a biotech company specialized in food enzymes.

Reported gains in protein production from codon usage optimization can reach 100%. The standard approach is to codon optimize the entire protein using a single algorithm. Since signal peptides play a vital role in protein production, I decided to codon optimize the signal peptide separately from the mature protein. For this I developed a special algorithm that clusters signal peptides into groups with low, medium and high productivity scores. With this novel approach, a 300% increase in amylase secretion was reached.
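The clustering step can be pictured with a small sketch. The algorithm developed for this project is not described here, so the code below swaps in plain one-dimensional k-means with three clusters as a generic unsupervised method; the signal peptide names and productivity scores are invented for illustration.

```python
def kmeans_1d(values, k=3, iterations=50):
    """Plain 1-D k-means; returns the k centroids in ascending order."""
    ordered = sorted(values)
    # Start the centroids at evenly spaced positions in the sorted data.
    centroids = [ordered[(2 * i + 1) * len(ordered) // (2 * k)]
                 for i in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda c: abs(v - centroids[c]))
            clusters[nearest].append(v)
        # An empty cluster keeps its previous centroid.
        updated = [sum(c) / len(c) if c else centroids[i]
                   for i, c in enumerate(clusters)]
        if updated == centroids:
            break
        centroids = updated
    return sorted(centroids)

def label_peptides(scores):
    """Assign each signal peptide to a low, medium or high cluster."""
    centroids = kmeans_1d(list(scores.values()), k=3)
    names = ["low", "medium", "high"]
    labels = {}
    for peptide, score in scores.items():
        nearest = min(range(3), key=lambda c: abs(score - centroids[c]))
        labels[peptide] = names[nearest]
    return labels

# Invented productivity scores, for illustration only.
scores = {"sp1": 0.10, "sp2": 0.15, "sp3": 0.50,
          "sp4": 0.55, "sp5": 0.90, "sp6": 0.95}
labels = label_peptides(scores)
print(labels)
```

Once the peptides are grouped, codon usage in the high-productivity cluster could serve as the reference for optimizing a new signal peptide, while the mature protein is optimized separately.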

You can find more about codon optimization and its significance for protein production in my blog.
