Review of the Drug Discovery Chemistry Conference 2023 in San Diego

by Dr. Serghei Glinca

The industry's central challenge remains access to chemical matter. The DDC conference was full of highlights, focusing mainly on enabling technologies and platforms that unlock chemical matter.

Fragment-based drug discovery (FBDD)

FBDD applications ranged from non-covalent to covalent screening techniques. One of the best talks, by Marcel Verdonk, highlighted the application of crystallographic fragment screening to KEAP1. I want to emphasize that in this work the FBDD campaign was used in a versatile way. Crystallographic hits, although exhibiting low affinity, were used as starting points for fragment linking. The resulting compounds with lead-like properties were used to build a focused HTS library, which resulted in a nanomolar compound. Stephen Fesik highlighted how to address the transcription factor Myc by targeting WDR5, which resulted in potent compounds that have been validated in animal studies. It seems that beta-propeller proteins are an emerging target family that is well amenable to FBDD.

Covalent Drug Discovery

The covalent space has grown significantly over the past years, which is reflected by 24 talks at the conference. Dan Nomura highlighted the application of chemoproteomic platforms using covalent fragments. It is striking that covalent labeling of IDPs induces even higher disorder, leading to protein degradation. This has been demonstrated for Myc by targeting the intrinsically disordered Cys171. Prof. Pellecchia highlighted that going beyond cysteines, e.g. by targeting lysines, is a viable strategy for gaining selectivity. Tuning experimental setups to enable covalent labeling of lysines seems to be critical for high-quality results. For example, sulfonyl chlorides can label lysines but can also react with the His-tag, depending on the concentrations of ligand vs. protein. It seems that library screening without the His-tag is a better idea. Joachim Broeker from Boehringer Ingelheim presented how BI-0474 was developed. FBDD and SBDD were enablers. The interesting part was that they started from a covalent fragment screen against the S39C mutant. The fragments bound to the switch II pocket; the team then treated the hits as non-covalent binders and grew them towards Cys12, leading to the reversible covalent inhibitor BI-0474.

DNA-encoded Libraries (DEL)

Although I did not attend the DNA-encoded libraries (DELs) track, it seems that DEL screening is gradually replacing HTS as the “gold standard” for screening Ro5 compounds. In discussions with colleagues, I learned that some companies are taking advantage of DEL screening, which enables access to a larger chemical space compared to HTS. The field is still evolving, and Prof. Joerg Scheuermann from ETH showed the potential of the DEL technology for macrocycles. We’ll see more developments from the DEL space.

AI for Drug Discovery

Due to the overlap with other talks, I could only join a few talks of the AI for drug discovery track. Most of the AI/ML applications still address relatively basic scenarios, but machine learning complements drug discovery technologies quite well. The question that I often ask is whether we are better off with ML-based tools or with “just” executing the experiments and synthesizing molecules. Of particular note was the presentation by Bryce Allen from Differentiated Therapeutics. It was quite impressive how their engine can predict the ligase preference of specific compounds. Also, a target identification case study by Insilico Medicine showed how fast hypotheses can be generated and tested using AI platforms, which was demonstrated for CDK20. I believe this is the strength of AI/ML platforms: generating a higher number of, and potentially more precise, hypotheses for experimentation.

Overall, FBDD, AI and DELs are really exciting technologies that will deliver even more exciting drug discovery stories in the future.


The Protein Data Bank has over 200,000 structures.

Structural biology is an important field in drug discovery as it helps researchers understand the three-dimensional structure of proteins and how they interact with other molecules. This knowledge is crucial in developing new drugs as it allows scientists to target specific proteins and understand how they contribute to diseases. 

One of the most important resources in structural biology is the Protein Data Bank (PDB). The PDB is a database containing more than 200,000 3D structures of proteins and nucleic acids, which researchers use to study the structure and function of these molecules. It is a valuable tool for drug discovery because it allows scientists to analyze the structure of disease-causing proteins and identify potential drug targets.

1. The deposited structural data helps to develop new drugs and supports biomedical research

The PDB allows for the comparison of different protein structures, which can help researchers identify common structural features that are important for disease progression. By understanding these features, scientists can develop new drugs that target these specific regions, which can help to improve the effectiveness of treatments. 

The publication by Goodsell et al. (2020) reported that 90% of the 210 new therapeutics approved by the US Food and Drug Administration (FDA) between 2010 and 2016 were discovered partly due to the PDB archive holdings.

2. Database for AI/ML training

Public access to PDB data laid the groundwork for the development of artificial intelligence and machine learning methods for predicting protein structures. Currently, the collection requires more than 1 TB of storage and contains more than 3,000,000 files associated with these PDB entries.

In addition to providing structural information, the PDB also contains information on protein-ligand interactions, which can be used to understand how drugs bind to their targets. This information can be used to design new drugs that specifically target disease-causing proteins, improving their efficacy and reducing side effects. 
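
As a rough illustration of how such protein-ligand information can be read from a structure file, the following sketch parses the fixed-column ATOM/HETATM records of the legacy PDB file format and lists the protein residues that have an atom within a distance cutoff of a ligand atom. The records, the residue name LIG, the function names, and the 4 Å cutoff are illustrative assumptions, not data from a real PDB entry.

```python
import math

# Invented records in legacy PDB fixed-column layout (not from a real entry).
DEMO_PDB = "\n".join([
    "ATOM      1  N   SER A  10       0.000   0.000   0.000",
    "ATOM      2  CA  SER A  10      -1.200   0.800   0.000",
    "ATOM      3  O   HIS A  55      20.000  20.000  20.000",
    "HETATM    4  C1  LIG A 201       3.200   0.000   0.000",
    "HETATM    5  O   HOH A 301       0.000   2.000   0.000",
])


def parse_atoms(pdb_text):
    """Parse ATOM/HETATM records using the PDB format's fixed columns."""
    atoms = []
    for line in pdb_text.splitlines():
        record = line[0:6].strip()
        if record not in ("ATOM", "HETATM"):
            continue
        atoms.append({
            "record": record,
            "atom": line[12:16].strip(),       # columns 13-16
            "res_name": line[17:20].strip(),   # columns 18-20
            "chain": line[21],                 # column 22
            "res_seq": int(line[22:26]),       # columns 23-26
            "xyz": (float(line[30:38]),        # columns 31-38
                    float(line[38:46]),        # columns 39-46
                    float(line[46:54])),       # columns 47-54
        })
    return atoms


def ligand_contacts(atoms, cutoff=4.0):
    """Return sorted protein residues with an atom within `cutoff` (in Å)
    of any non-water HETATM atom. The cutoff value is an assumption."""
    protein = [a for a in atoms if a["record"] == "ATOM"]
    ligand = [a for a in atoms
              if a["record"] == "HETATM" and a["res_name"] != "HOH"]
    contacts = set()
    for lig in ligand:
        for prot in protein:
            if math.dist(lig["xyz"], prot["xyz"]) <= cutoff:
                contacts.add((prot["res_name"], prot["chain"], prot["res_seq"]))
    return sorted(contacts)


atoms = parse_atoms(DEMO_PDB)
contacts = ligand_contacts(atoms)
print(contacts)
```

Real entries need more care (alternate locations, multiple models, modified residues stored as HETATM), so a dedicated structure-parsing library is usually the better choice in practice.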

3. Over 50 years of research

Since its inception in 1971, structural biologists from all over the globe have contributed their experimentally determined protein and nucleic acid structure data. It is because of their efforts that this central, public database has attained this important milestone. 

We say thank you to all contributors and structural biologists who put tremendous effort into solving 3D structures and achieving the milestone of 200,000 structures.



X-Ray Crystallography: Protein Sample Requirements

A precondition for successful crystallographic screening is a pure, stable, and monodisperse protein sample.

During protein production, controls such as SDS gels already give a first indication of the purity of a protein sample. Additionally, during standard concentration determination via UV/VIS, contamination by DNA or RNA can be assessed. In the following section, typical methods for assessing protein quality are described, with a focus on crystallography-grade protein quality.
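
The UV/VIS check can be sketched in a few lines. The sketch assumes absorbance readings at 260 nm and 280 nm and uses the commonly quoted A260/A280 ratio of about 0.57 for nucleic-acid-free protein. The warning threshold of 0.7 and the function names are illustrative assumptions, not fixed rules. The concentration follows from the Beer-Lambert law.

```python
def a260_a280_ratio(a260, a280):
    """Ratio of absorbance at 260 nm to absorbance at 280 nm."""
    if a280 <= 0:
        raise ValueError("A280 must be positive")
    return a260 / a280


def looks_nucleic_acid_free(a260, a280, max_ratio=0.7):
    """Flag samples whose ratio stays near the value expected for pure protein.

    max_ratio is an assumed, illustrative threshold.
    """
    return a260_a280_ratio(a260, a280) <= max_ratio


def protein_conc_mg_per_ml(a280, ext_coeff_per_m_cm, mw_da, path_cm=1.0):
    """Beer-Lambert: A = epsilon * c * l, so c [mol/L] = A / (epsilon * l).

    Multiplying by the molecular weight [g/mol] gives g/L, i.e. mg/mL.
    """
    molar = a280 / (ext_coeff_per_m_cm * path_cm)
    return molar * mw_da


print(looks_nucleic_acid_free(a260=0.285, a280=0.5))  # ratio 0.57
print(looks_nucleic_acid_free(a260=0.60, a280=0.5))   # ratio 1.2
```

The extinction coefficient and molecular weight have to come from the actual protein sequence; the values passed in are the caller's responsibility.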

After protein production and before crystallization, it is essential to check protein quality. An initial assessment via native mass spectrometry can confirm the correct size of the purified protein with more precision than an SDS gel. It can also give a first indication of oligomerization states, very tightly bound cofactors, and the presence of posttranslational modifications. Even if the purified protein has the correct molecular weight, it can still be unclear whether the protein is correctly folded.

Protein folding is a key process in biology, since a protein's fold ultimately determines its biological function. Proteins that are misfolded or not folded at all may lead to unsatisfactory results in subsequent experiments or to significant loss of protein material due to aggregation over time. Ultraviolet circular dichroism (CD) is the method of choice for monitoring changes of protein structure in solution. It provides information on secondary structure and hence on correct folding.

Protein samples need to be not only correctly folded but also stable and monodisperse. A thermal shift assay (TSA) measures the melting temperature of a protein (Tm), which is an indication of protein stability. Several factors such as pH, salt, cofactors, or buffer composition influence protein stability. Ideally, the protein is measured and stored in the buffer in which it is most stable.

Melting curve of T. cruzi FPPS protein: the addition of Mg2+ increases protein stability by about 5°C (Francesca Magari)
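
One common way to read Tm off such a melting curve is to take the temperature at which the signal rises fastest, i.e. the maximum of the first derivative. The following is a minimal sketch under that assumption. The curves are synthetic sigmoids generated in the code, not the measured FPPS data, and the function names are hypothetical.

```python
import math


def estimate_tm(temps, signal):
    """Return the midpoint of the interval with the steepest signal increase."""
    if len(temps) != len(signal) or len(temps) < 2:
        raise ValueError("need two equally long series with at least 2 points")
    best_slope = -math.inf
    best_temp = None
    for i in range(len(temps) - 1):
        slope = (signal[i + 1] - signal[i]) / (temps[i + 1] - temps[i])
        if slope > best_slope:
            best_slope = slope
            best_temp = 0.5 * (temps[i] + temps[i + 1])
    return best_temp


def synthetic_melt(temp, tm, width=1.5):
    """Idealized two-state unfolding curve, normalized to the range 0..1."""
    return 1.0 / (1.0 + math.exp((tm - temp) / width))


# Synthetic example: the same protein without and with a stabilizing additive.
temps = [25.25 + 0.5 * i for i in range(120)]
without_additive = [synthetic_melt(t, tm=50.0) for t in temps]
with_additive = [synthetic_melt(t, tm=55.0) for t in temps]

tm_without = estimate_tm(temps, without_additive)
tm_with = estimate_tm(temps, with_additive)
print(f"Tm shift: {tm_with - tm_without:.1f} °C")
```

Real TSA data are noisy and often show a signal decrease after the transition, so smoothing or fitting a sigmoid to the transition region is usually more robust than a raw finite difference.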

Proteins also exhibit different stability against freeze-thaw cycles. This stability is a very important factor in inter-lab and general process optimization and can be improved via buffer optimization. Proteins that are not stable against freeze-thaw cycles have to be crystallized immediately after purification and cannot be sent on dry ice. The lifetime of such proteins can usually be extended via storage at 4°C or on wet ice but is much shorter than at -80°C. However, there are some rare cases in our experience, in which storage at lower temperatures is disfavored and the protein must be kept at room temperature or above. 

A protein can be stable in a certain buffer without the sample being monodisperse. Monodispersity means that the protein exists in solution as a single oligomeric species, e.g., monomer or dimer, and is free of non-specific oligomers and aggregates. This can be checked with dynamic light scattering (DLS).

In summary, the quality requirements for crystallography-grade protein material are significantly higher than for assay-grade protein. A checklist for crystallography-grade protein samples:

  • free from DNA/RNA
  • stable: Tm > 30 °C
  • monodisperse
  • properly folded
  • resistant to freeze-thaw-cycles
  • high purity according to analytics
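
The checklist above can be captured as a small data structure, for example to record QC results per protein batch. This is only a sketch: the class and field names are hypothetical, and only the Tm threshold of 30 °C is taken from the list.

```python
from dataclasses import dataclass

MIN_TM_CELSIUS = 30.0  # threshold taken from the checklist


@dataclass
class SampleQC:
    """QC results for one protein batch (hypothetical field names)."""
    nucleic_acid_free: bool
    tm_celsius: float
    monodisperse: bool
    properly_folded: bool
    freeze_thaw_stable: bool
    high_purity: bool


def failed_criteria(qc):
    """Return the checklist items that the sample does not meet."""
    checks = {
        "free from DNA/RNA": qc.nucleic_acid_free,
        "stable (Tm > 30 °C)": qc.tm_celsius > MIN_TM_CELSIUS,
        "monodisperse": qc.monodisperse,
        "properly folded": qc.properly_folded,
        "resistant to freeze-thaw cycles": qc.freeze_thaw_stable,
        "high purity": qc.high_purity,
    }
    return [name for name, ok in checks.items() if not ok]


def is_crystallography_grade(qc):
    return not failed_criteria(qc)


batch = SampleQC(True, 28.0, False, True, True, True)
print(failed_criteria(batch))
```

Returning the list of failed items rather than a single yes/no makes it clear which property needs buffer or construct optimization.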


Fragment hit Identification in FBDD

In the field of fragment-based drug discovery (FBDD), biophysical methods such as NMR, native MS, high-concentration biochemical screens (HCS), thermal shift assays (TSA), or surface plasmon resonance (SPR) are usually used as pre-screening techniques to screen fragment libraries to identify the most promising binders.  

However, employing a cascade of these methods prior to X-ray crystallography has proven rather misleading, as they might miss an important fraction of binding fragments that could be observed in crystal structures. In addition, the overlap between hits from biophysical methods and hits from X-ray crystallography has been rather poor.

Venn diagrams show low overlaps between hits identified in X-ray crystallography and other biophysical methods.
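
The overlap shown in such a Venn diagram can be quantified with simple set operations. In this sketch the fragment IDs and hit lists are invented for illustration, and the function name is hypothetical. The Jaccard index (shared hits divided by all hits found by either method) is one possible overlap measure.

```python
def hit_overlap(hits_a, hits_b):
    """Compare two hit lists and summarize shared and method-specific hits."""
    a, b = set(hits_a), set(hits_b)
    shared = a & b
    union = a | b
    return {
        "shared": sorted(shared),
        "only_a": sorted(a - b),
        "only_b": sorted(b - a),
        "jaccard": len(shared) / len(union) if union else 0.0,
    }


# Invented example data: fragment IDs found as hits by each method.
xray_hits = {"F01", "F02", "F03", "F04", "F05", "F06"}
tsa_hits = {"F02", "F07", "F08"}

result = hit_overlap(xray_hits, tsa_hits)
print(result["shared"], result["jaccard"])
```

A low Jaccard index, as in this invented example, means that using one method as a pre-filter for the other would discard most of the hits the second method could have found.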

Why not a direct crystallographic fragment screen?

Among all possible screening methods, X-ray crystallography is not only the most sensitive method for detecting fragment binding. It also reveals the binding geometry as precise three-dimensional positions of atoms in a protein structure. This is essential for investigating the accessible chemical space in a protein-ligand complex when developing the initial hit into a lead candidate. High-resolution crystal structures can also reveal the exact position of water molecules and water networks in the binding site. In this regard, it has been shown that water molecules in fixed positions, and their displacement, can provide important information in structure-based lead discovery.

Remarkably, X-ray protein crystallography is not only able to detect high- but also low-affinity binders that cannot be detected by any other method.  

Only after obtaining precise structural information about the initial hit and exploring the binding pocket can other biophysical methods be used to further characterize the protein-ligand complex and to improve affinity, potency, and binding kinetics in fragment expansion campaigns.

Osborne, J.; Jhoti, H. et al., 2020

High-quality crystallization and high-performance soaking systems

Thanks to third-generation synchrotron sources and the latest methodological improvements in automated crystal mounting systems, data collection, and processing, it has become increasingly feasible to screen entire fragment libraries efficiently. Although automation enables efficient data collection, the most important step before data collection is the development of high-quality crystallization and high-performance soaking systems. This could be compared to assay development for high-throughput screening campaigns. The crystals have to deliver consistent and comparable results in terms of diffraction quality and reproducibility. It’s not about getting one crystal of sufficient quality, but rather hundreds of crystals, which is what makes a screening campaign possible at all. This requires specific expertise, technology and a broad range of experience.

With novel workflows and pipelines, like FastForward, it is possible to utilize the large amount of collected data in a smart way. In fact, starting from the raw data from X-ray protein crystallography, it is possible to process and refine the datasets of fragment library screens into high-quality 3D models and to design higher-affinity compounds in a short period of time.
