X-Ray Crystallography: Protein Sample Requirements

A precondition for successful crystallographic screening is a pure, stable, and monodisperse protein sample.

During protein production, controls such as SDS gels already give an indication of the purity of a protein sample. In addition, contamination by DNA or RNA can be assessed during standard concentration determination via UV/Vis spectroscopy. The following section describes typical methods for assessing protein quality, with a focus on crystallography-grade material.
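
As a minimal sketch of the UV/Vis check, the ratio of absorbance at 260 nm to absorbance at 280 nm can be computed and compared with a cutoff. A ratio of roughly 0.6 is often quoted for nucleic-acid-free protein, and nucleic acids push it higher. The function names and the cutoff of 0.7 are illustrative assumptions, not values from this text.

```python
def a260_a280_ratio(a260: float, a280: float) -> float:
    """Return the A260/A280 absorbance ratio of a sample."""
    if a280 <= 0:
        raise ValueError("A280 must be positive")
    return a260 / a280


def nucleic_acid_suspected(a260: float, a280: float, cutoff: float = 0.7) -> bool:
    """Flag a sample whose A260/A280 ratio exceeds the cutoff.

    The default cutoff of 0.7 is an assumed, illustrative value.
    """
    return a260_a280_ratio(a260, a280) > cutoff


# Made-up absorbance readings for two hypothetical samples.
print(nucleic_acid_suspected(0.29, 0.50))  # ratio 0.58
print(nucleic_acid_suspected(0.60, 0.50))  # ratio 1.20
```

In practice the cutoff would be set per protein, since the expected ratio depends on the aromatic residue content.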

After protein production and before crystallization, it is essential to check protein quality. An initial assessment via native mass spectrometry can confirm the correct mass of the purified protein with more precision than an SDS gel and can also give a first indication of oligomerization states, very tightly bound cofactors, and posttranslational modifications. Even if the purified protein has the correct molecular weight, it can still be unclear whether it is correctly folded.

Protein folding is a key process in biology, since the fold of a protein ultimately determines its biological function. Proteins that are misfolded or not folded at all may lead to unsatisfactory results in subsequent experiments or to significant loss of protein material through aggregation over time. Ultraviolet circular dichroism (CD) is the method of choice for monitoring changes in protein structure in solution; it provides information on secondary structure and hence on correct folding.

Protein samples need to be not only correctly folded but also stable and monodisperse. A thermal shift assay (TSA) measures the melting temperature of a protein (Tm), which is an indication of protein stability. Several factors such as pH, salt, cofactors, and buffer composition influence protein stability. Ideally, the protein is measured and stored in the buffer in which it is most stable.

Melting curve of T. cruzi FPPS protein: the addition of Mg2+ increases protein stability by about 5°C (Francesca Magari)
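
A common way to read Tm off a melting curve is to take the temperature at which the signal rises most steeply. The sketch below does this on synthetic two-state curves. The sigmoid model, the slope parameter, and the Tm values of 50.25 °C and 55.25 °C are hypothetical, chosen only to mimic a shift of about 5 °C like the one in the figure above. They are not measured data.

```python
import math


def boltzmann(temp_c, tm, slope=1.5, low=0.0, high=1.0):
    """Two-state sigmoid used here as a simple model of an unfolding transition."""
    return low + (high - low) / (1.0 + math.exp((tm - temp_c) / slope))


def estimate_tm(temps, signal):
    """Estimate Tm as the midpoint of the interval with the steepest signal rise."""
    best_slope, best_tm = float("-inf"), None
    for i in range(len(temps) - 1):
        slope = (signal[i + 1] - signal[i]) / (temps[i + 1] - temps[i])
        if slope > best_slope:
            best_slope = slope
            best_tm = 0.5 * (temps[i] + temps[i + 1])
    return best_tm


# Synthetic curves sampled every 0.5 °C from 25 to 75 °C (hypothetical values).
temps = [25.0 + 0.5 * i for i in range(101)]
without_additive = [boltzmann(t, tm=50.25) for t in temps]
with_additive = [boltzmann(t, tm=55.25) for t in temps]

shift = estimate_tm(temps, with_additive) - estimate_tm(temps, without_additive)
print(f"Estimated Tm shift: {shift:.1f} °C")
```

Real fluorescence traces are noisy and often decline after the transition, so smoothing or a model fit is usually applied before taking the derivative.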

Proteins also differ in their stability toward freeze-thaw cycles. This stability is a very important factor in transfers between labs and in general process optimization, and it can be improved via buffer optimization. Proteins that are not stable toward freeze-thaw cycles have to be crystallized immediately after purification and cannot be shipped on dry ice. The lifetime of such proteins can usually be extended by storage at 4°C or on wet ice, but it is much shorter than at -80°C. In our experience, however, there are rare cases in which storage at lower temperatures is unfavorable and the protein must be kept at room temperature or above.

A protein can be stable in a certain buffer without the sample being monodisperse. Monodispersity means that the protein exists in solution as a single oligomeric species, e.g., monomer or dimer, and is free of non-specific oligomers and aggregates. This can be checked with dynamic light scattering (DLS).

In summary, the quality requirements for crystallography-grade protein material are significantly higher than those for assay-grade protein. A checklist for crystallography-grade protein samples:

  • free from DNA/RNA
  • stable: Tm > 30 °C
  • monodisperse
  • properly folded
  • resistant to freeze-thaw cycles
  • high purity according to analytics
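
The checklist above can be captured as a small record with one field per criterion, which makes it easy to report which requirements a batch fails. This is only a sketch: the class and field names are hypothetical, and only the Tm threshold of 30 °C comes from the checklist itself.

```python
from dataclasses import dataclass


@dataclass
class SampleQC:
    """Hypothetical record of quality-control results for one protein batch."""
    nucleic_acid_free: bool
    tm_celsius: float
    monodisperse: bool
    properly_folded: bool
    freeze_thaw_resistant: bool
    high_purity: bool


def failed_criteria(qc: SampleQC, min_tm: float = 30.0) -> list:
    """Return the names of the checklist items the sample does not meet."""
    checks = {
        "free from DNA/RNA": qc.nucleic_acid_free,
        f"stable (Tm > {min_tm:g} °C)": qc.tm_celsius > min_tm,
        "monodisperse": qc.monodisperse,
        "properly folded": qc.properly_folded,
        "resistant to freeze-thaw cycles": qc.freeze_thaw_resistant,
        "high purity according to analytics": qc.high_purity,
    }
    return [name for name, ok in checks.items() if not ok]


# Made-up example batch that fails only the stability criterion.
batch = SampleQC(True, 28.0, True, True, True, True)
print(failed_criteria(batch))
```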


Fragment hit identification in FBDD

In the field of fragment-based drug discovery (FBDD), biophysical methods such as NMR, native MS, high-concentration biochemical screens (HCS), thermal shift assays (TSA), or surface plasmon resonance (SPR) are usually used as pre-screening techniques to identify the most promising binders in fragment libraries.

However, running a cascade of these methods before X-ray crystallography has proven rather misleading, as they can miss an important fraction of the binding fragments that are observed in crystal structures. In addition, the overlap between hits from biophysical methods and hits from X-ray crystallography has been rather poor.

Venn diagrams show low overlaps between hits identified in X-ray crystallography and other biophysical methods.
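
The overlap shown in such a Venn diagram can be quantified with plain set operations, for example as the number of shared hits and the Jaccard index (shared hits divided by all distinct hits). The fragment identifiers and hit lists below are invented for illustration and do not reproduce any screening result.

```python
def hit_overlap(hits_a: set, hits_b: set) -> dict:
    """Summarize the overlap between two sets of fragment hit identifiers."""
    shared = hits_a & hits_b
    union = hits_a | hits_b
    return {
        "shared": sorted(shared),
        "only_a": sorted(hits_a - hits_b),
        "only_b": sorted(hits_b - hits_a),
        # Jaccard index: 1.0 means identical hit lists, 0.0 means no overlap.
        "jaccard": len(shared) / len(union) if union else 0.0,
    }


# Hypothetical hit lists from two methods (identifiers are made up).
xray_hits = {"F001", "F004", "F007", "F012", "F015"}
tsa_hits = {"F004", "F009", "F020"}

summary = hit_overlap(xray_hits, tsa_hits)
print(summary["shared"], f"Jaccard = {summary['jaccard']:.2f}")
```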

Why not a direct crystallographic fragment screen?

Among all possible screening methods, X-ray crystallography is not only the most sensitive method for detecting fragment binding; it also reveals the binding geometry as precise three-dimensional positions of atoms in a protein structure. This is essential for investigating the accessible chemical space in a protein-ligand complex when developing the initial hit into a lead candidate. Highly resolved crystal structures can also reveal the exact position of water molecules and water networks in the binding site. In this regard, it has been shown that water molecules in fixed positions, and their displacement, can be important information in structure-based lead discovery.

Remarkably, X-ray protein crystallography can detect not only high-affinity binders but also low-affinity binders that cannot be detected by any other method.

Only once precise structural information about the initial hit is available and the binding pocket has been explored can other biophysical methods be used to further characterize the protein-ligand complex and to improve affinity, potency, and binding kinetics in fragment expansion campaigns.

Osborne, J.; Jhoti, H. et al., 2020

High-quality crystallization and high-performance soaking systems

Thanks to third-generation synchrotron sources and the latest methodological improvements in automated crystal mounting systems, data collection, and processing, it has become increasingly feasible to screen entire fragment libraries efficiently. Although automation enables efficient data collection, the most important step before data collection is the development of a high-quality crystallization system and a high-performance soaking system. This can be compared to assay development for high-throughput screening campaigns. The crystals have to deliver consistent and comparable results in terms of diffraction quality and reproducibility. It is not about obtaining one crystal of sufficient quality, but rather hundreds of them; only this makes a screen possible at all. This requires specific expertise, technology, and a broad range of experience.

With novel workflows and pipelines such as FastForward, it is possible to use the large amount of collected data in a smart way. Starting from the raw X-ray diffraction data, the datasets of a fragment library screen can be processed and refined into high-quality 3D models, which support the design of higher-affinity compounds in a short period of time.
