Project

    Components

    mRNA

    mRNA, or messenger RNA, is a crucial molecule involved in the process of gene expression. It carries genetic information from the DNA in the cell’s nucleus to the ribosomes in the cytoplasm, where it serves as a template for protein synthesis. mRNA is transcribed from DNA through a process called transcription, where an enzyme called RNA polymerase creates a complementary RNA strand based on the DNA template.
    mRNA molecules are single-stranded and composed of a linear sequence of nucleotides. Each nucleotide in mRNA consists of one of four bases: adenine (A), cytosine (C), guanine (G), or uracil (U). The order of these bases in the mRNA sequence determines the order of amino acids during protein synthesis.
    Once synthesised, mRNA undergoes several modifications to ensure its stability and functionality. These modifications include adding a protective cap structure at the 5′ end and a poly(A) tail at the 3′ end. These modifications help in mRNA stability, nuclear export, and efficient translation. To make the mRNA we produce by IVT stable, improve its efficiency, and reduce its immunogenicity, we use modified nucleotides like N1-methyl pseudouridine and cap the mRNA with cap analogues[12].
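
To make the link between base order and amino acid order concrete, here is a minimal sketch in Python. It assumes the standard genetic code and a DNA coding strand as input; the function names `transcribe` and `translate` are our own illustrative choices, not part of any library.

```python
# Standard genetic code, built from the conventional U, C, A, G codon ordering.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def transcribe(coding_strand_dna: str) -> str:
    """Return the mRNA that matches the DNA coding strand (T replaced by U)."""
    return coding_strand_dna.upper().replace("T", "U")


def translate(mrna: str) -> str:
    """Translate from the first AUG up to the first in-frame stop codon."""
    start = mrna.find("AUG")
    if start == -1:
        return ""
    protein = []
    for pos in range(start, len(mrna) - 2, 3):
        residue = CODON_TABLE[mrna[pos:pos + 3]]
        if residue == "*":  # stop codon: translation ends here
            break
        protein.append(residue)
    return "".join(protein)
```

For example, `translate(transcribe("ATGGCCTAA"))` reads the codons AUG, GCC and UAA and returns the two-residue peptide `MA`.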

    LNPs

    Lipid Nanoparticles (LNPs) are a versatile and promising delivery system in the field of medicine and biotechnology. Composed of lipids arranged in a spherical structure, LNPs can encapsulate and protect therapeutic molecules such as drugs or nucleic acids. This protective coating enhances the stability and bioavailability of the cargo molecules, allowing them to be efficiently delivered to target cells or tissues.

    LNPs have shown great potential in various applications, including gene therapy, vaccination, and targeted drug delivery, offering new avenues for treating diseases and advancing medical research.

    Aptides

    Aptides are a new class of high-affinity peptides, introduced by Kim et al. in 2012[10]. They are extremely specific, binding their target molecules with affinities in the nM range, and they have a major advantage over the targeting molecules in conventional use, such as aptamers and antibodies. Antibodies have long been in the limelight of the scientific community for their specific binding and accurate targeting. However, they are dogged by complex intellectual property issues and by limited tissue penetration. Aptides overcome these issues: with a maximum length of about 30 amino acids they are extremely small, which gives them excellent tissue penetration, something that larger molecules like antibodies struggle with.
    They can be generated through combinatorial methods and exhibit high affinity and selectivity towards their intended targets. Aptides have gained attention in various biomedical applications, including targeted drug delivery, diagnostic imaging, and therapeutic interventions. Their unique properties make them a promising tool in developing precision medicine and personalised therapeutics.[10][13] The small size of an aptide makes it well suited to targeted drug delivery, because it can be conjugated easily to the lipids of LNPs. It also reduces the chance that the patient’s immune system will recognise and reject it, although in LNP formulations, which carry many aptide molecules, the pattern recognition machinery of the immune system may still react mildly to them.
    Conjugating aptides to lipids to form targeted liposomes and LNPs has yielded excellent accumulation of the LNPs/liposomes in the target tissue.[14] The aptide we are using binds to the EDB of fibronectin, which has been shown to be overexpressed in endometriotic tissue. This aptide is a mere 26 amino acids long and binds its target with about the same affinity as many full-size antibodies.

    https://colab.research.google.com/github/sokrypton/ColabFold/blob/main/AlphaFold2.ipynb

    O-IL8-15

    This antibody (designed by SGC Karolinska[15]) is in a special format called ‘scFv’, or single-chain variable fragment. scFv antibodies are engineered antibody fragments that retain the antigen-binding capability of full-length antibodies but in a smaller, single-chain format. They are composed of the variable regions of an antibody’s heavy and light chains, connected by a flexible peptide linker. This design allows them to be produced in microbial systems like bacteria or yeast, enabling easier and more cost-effective production than traditional antibody formats. scFv antibodies have many applications, including diagnostics, therapeutics, and research tools. Their small size, high specificity, and modifiability make them valuable in targeting specific molecules and studying protein interactions in various biological systems.


    scFv model generated by AlphaFold2 (ColabFold with templates from PDB100)

    In fact, the small size of this antibody made it optimal for us to use in mRNA-based therapeutics, as larger mRNA sequences are harder to encapsulate in an LNP.

    scFv (red) docked to IL8 (green)  
    Docking generated using Free Docking by GRAMM

    Rationale

    The following questions would immediately come to the reader’s mind when considering our project:

    • Why IL8?
    • Why antibodies/mRNA and not RNAi?
    • Why not full-length antibodies?

    We will analyse these questions one by one.

    Why IL8?

    1. After an extensive literature search, we found that there is more than one potential cytokine target for the treatment of endometriosis. However, for several of these the evidence of efficacy was inconclusive. The following diagram, from Kaori Koga, Immunology of Endometriosis: Pathogenesis and Management (Reproductive Immunology, Academic Press, 2021), depicts the cytokines that can be targeted in endometriosis; the accompanying text there gives the references. We found that the pathway of action and its blockade had been best studied for IL8, and that the most conclusive evidence for the efficacy of targeting was likewise for IL8, as opposed to other cytokines.[16]

    Why antibodies/mRNA and not RNAi?

    1. Considering our delivery system and the first-pass effect, siRNA would block IL8 production only in the liver and spleen cells that take it up. With mRNA, even if the LNPs end up in the liver, the antibody expressed there will circulate throughout the body and can reach the target tissue after some time. (In any case, it is not just the peritoneal fluid of women with endometriosis that has raised levels of IL8; the levels of IL8 in the blood are also significantly higher than normal.) [17]
    2. The chemokines, including IL8, are not produced by just one cell type: peritoneal macrophages, lymphocytes, ectopic endometrial implants, and mesothelial cells of the peritoneum all produce them. So, blocking off just one or two sources will be less effective. [16]

    Why not full-length antibodies?

    1. As per recommendations, longer mRNA constructs have a poorer encapsulation efficiency. This is why we have used the significantly smaller scFv constructs.

    For further questions that may occur to you, feel free to reach out to adityaka@iisc.ac.in or igem.ug@iisc.ac.in. We will answer your questions as best we can!

    Design

    The design of a functional mRNA construct is not a trivial thing! We will go through, in detail, the process of designing just such a construct in the following sections:

    Promoter

    • We have used the T7 promoter in our IVT mRNA constructs. We had the option of using T3 or SP6 promoters, as IVT kits for these are widely available, but the T7 promoter has been used more extensively than the others, so we went ahead with it.
    • The mRNA must be capped to prevent recognition of the 5′ triphosphate by RIG-I-like receptors.
    • The ‘gold standard’ among the mRNA capping reagents on the market at present could very well be the CleanCap®AG reagent. However, to use this reagent, the last 2 bases of the T7 promoter (the first 2 bases transcribed) must be AG and NOT GG, as this cap 1 analogue pairs with the first 2 nucleotides of the mRNA sequence. For this reason, we are also unsure whether ARCA or other such capping reagents would work with the modified T7 promoter, as we have used only CleanCap®AG in our experiments.
    • We have included the entire eINCV01R promoter region in this promoter (as mentioned in the references below) to ensure that the T7 RNAP binds efficiently to the T7 promoter.
    • To further maximise transcription efficiency, we have included the highly conserved sequence GGGAGA downstream of the T7 promoter. We would recommend, however, that users of this part try GGGATAAT instead, as this is reported to give higher activity. [18], [19], [20]
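
The AG requirement above is easy to check in software before ordering a template. The sketch below is only an illustration under stated assumptions: we assume the 17-nt T7 core promoter `TAATACGACTCACTATA` (positions -17 to -1) and treat the two bases immediately after it as the start of the transcript; the function names are hypothetical, and the sequence should be verified against your own kit documentation.

```python
# Assumption: 17-nt T7 core promoter (positions -17 to -1). Check this against
# the documentation of the polymerase/kit you are actually using.
T7_CORE = "TAATACGACTCACTATA"


def transcript_start(template: str, n: int = 2) -> str:
    """Return the first n bases that follow the T7 core promoter."""
    template = template.upper()
    idx = template.find(T7_CORE)
    if idx == -1:
        raise ValueError("T7 core promoter not found in template")
    start = idx + len(T7_CORE)
    return template[start:start + n]


def starts_with_ag(template: str) -> bool:
    """True if the transcript would begin with AG rather than GG."""
    return transcript_start(template) == "AG"
```

A template such as `...TAATACGACTCACTATAAGG...` passes this check, while the wild-type `...TAATACGACTCACTATAGGG...` start does not.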

    UTRs

    We had to contend with the following facts and use suitable UTRs for our sequence.

    • Our protein is not human! Hence, associated UTRs cannot be found on databases.
    • Our mRNA is a therapeutic one, and hence it must be expressed with a high efficiency and for quite some time!
    • As the design of UTRs is a rather lengthy process, for the purpose of this project, we used Moderna UTRs.

    PolyA tail

    PolyA tails are normally included after the 3′ UTR in mRNA, as these tails act as ‘timers’ of sorts on the mRNA, determining when it gets degraded. However, because long tails are difficult to synthesise, it may be prudent to use a construct that already carries the tail and into which different CDSs can be swapped via conventional cloning strategies. As listed under ‘Parts used’, we used such a construct from Genscript. One more very important point: even with such a construct, when a plasmid carrying a massive (>100 residues) polyA tail is cloned in bacteria, the chances of recombination are high. So, we recommend incorporating Pfizer’s ingenious strategy of interspersing the polyA stretches with random nucleotides to prevent this recombination. [21]
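
The idea of breaking up a long polyA tail can be sketched in a few lines of Python. The segment lengths and the linker below are illustrative values chosen for this example only, and the function names are hypothetical; the point is simply that the longest uninterrupted run of A becomes shorter than the total number of A residues.

```python
def segmented_poly_a(segment_lengths, linker):
    """Join poly(A) segments with a short non-A-flanked linker between them."""
    if not linker or linker[0].upper() == "A" or linker[-1].upper() == "A":
        raise ValueError("linker must be non-empty and must not start or end with A")
    return linker.join("A" * n for n in segment_lengths)


def longest_a_run(seq):
    """Length of the longest uninterrupted stretch of A in seq."""
    longest = current = 0
    for base in seq.upper():
        current = current + 1 if base == "A" else 0
        longest = max(longest, current)
    return longest


# Illustrative assumption: two segments and an arbitrary 10-nt linker.
tail = segmented_poly_a([30, 70], "GCTTCGTGCT")
```

Here `tail` carries 100 A residues in total, but `longest_a_run(tail)` is 70 rather than 100, which is the property the interspersed design relies on.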

    CDS

    Signal peptide 6.0

    N-terminal signal sequences direct the placement of new secretory and membrane proteins to the endoplasmic reticulum (ER) through a process that relies on the signal recognition particle (SRP). These sequences have a three-part structure: a hydrophobic core region, with sections on both its ends known as the n- and c-region. The c-region contains a site where signal peptidase (SPase) acts. Typically, these sequences are removed during protein synthesis, resulting in the formation of signal peptides and the main protein. Despite their core function, signal sequences can vary greatly in length and composition. This variation can influence processes like ER targeting, protein placement, and SPase action.

    The standard approach is to use the tPA signal peptide or, if the protein is human, its own signal peptide. The latter clearly would not work for us, as we reiterate that our protein IS NOT human! Here is where we took a gamble: after an extensive literature review, we found SignalP 6.0, open-source software that predicts the probability that a signal peptide is cleaved and the protein secreted from the cell, for either prokaryotes or eukaryotes. After another literature review we compiled a list of commonly used signal peptides, and we found that CD33 performed better than tPA! [22]

    Probability of cleavage and secretion via the Sec/SPI secretion pathway with the CD33 SP: cleavage between positions 16/17, as expected; probability = 0.972596

    Probability of cleavage and secretion via the Sec/SPI secretion pathway with the tPA SP: cleavage between positions 22/23, as expected; probability = 0.949892

    Modifications

    N1-methyl Pseudouridine

    N1-methyl pseudouridine (m^1Ψ) in mRNA offers a distinct advantage due to its inability to be recognised by certain innate immune sensors. Specifically, m^1Ψ-modified mRNA evades detection by Toll-like receptors TLR7 and TLR8. These receptors, when activated, can trigger an immune response against the mRNA, leading to its degradation and reduced effectiveness as a therapeutic or vaccine agent. By incorporating m^1Ψ into mRNA, this unwanted immune activation is minimised, ensuring more efficient delivery and expression of the encoded protein or antigen. In addition to TLR7 and TLR8, m^1Ψ-modified mRNA also avoids recognition by other cellular sensors like RIG-I and MDA5, further enhancing its stability and translational efficacy in the cell.[23]

    Final design:

    All the components we mentioned in our design rationale can be seen in the above diagram.

    Lipid NanoParticles (LNPs)

    RNA is intrinsically unstable, as we have highlighted before. Moreover, cells have no system for taking up RNA that is not suitably packaged. So, we use LNPs, the current gold standard of RNA delivery. Again, we have not had the time or resources to develop lipid compositions different from those used by Pfizer and Moderna. Hence, we went ahead with a composition based on theirs:

    • ALC-0315, mal-PEG 2000, cholesterol and DSPC. ALC-0315 is the ionisable cationic lipid; it is necessary because RNA is negatively charged, and the cationic lipid helps in the uptake of the RNA. DSPC is the neutral lipid. The molar ratio used was ionisable cationic lipid : neutral lipid : cholesterol : PEGylated lipid = 50 : 10 : 38.5 : 1.5.
    • We judged the number of ionisable lipid moieties based on the charge on the mRNA: 3 ionisable lipids for each phosphorus. [24],[25]
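
As a worked example of the arithmetic implied by the molar ratio and the three-lipids-per-phosphorus rule above, the sketch below estimates how many nanomoles of each lipid would accompany a given mass of mRNA. The average nucleotide mass of 330 g/mol is our rough assumption, and the function and dictionary names are hypothetical; treat the output as a back-of-the-envelope estimate, not a validated formulation protocol.

```python
# Molar ratio quoted in the text: ionisable : neutral (DSPC) : cholesterol : PEG-lipid.
MOLAR_RATIO = {"ionisable": 50.0, "DSPC": 10.0, "cholesterol": 38.5, "PEG-lipid": 1.5}
LIPIDS_PER_PHOSPHATE = 3        # from the text
AVG_NT_MASS_G_PER_MOL = 330.0   # assumption: rough average mass of one RNA nucleotide


def lipid_amounts_nmol(mrna_ug: float) -> dict:
    """Estimate nmol of each lipid for mrna_ug micrograms of mRNA."""
    # One phosphate per nucleotide: ug / (g/mol) gives umol, times 1000 gives nmol.
    phosphate_nmol = mrna_ug / AVG_NT_MASS_G_PER_MOL * 1000.0
    ionisable_nmol = LIPIDS_PER_PHOSPHATE * phosphate_nmol
    # Scale the remaining lipids so the overall mixture keeps the molar ratio.
    scale = ionisable_nmol / MOLAR_RATIO["ionisable"]
    return {name: part * scale for name, part in MOLAR_RATIO.items()}
```

Under these assumptions, 33 µg of mRNA corresponds to about 100 nmol of phosphate, hence roughly 300 nmol of ionisable lipid, 60 nmol of DSPC, 231 nmol of cholesterol and 9 nmol of PEGylated lipid.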

    Camelisation

    dAbs: A biological overview

    • Domain antibodies (dAbs) are unique human variable domains (either VH or VL) that have been modified to prevent them from pairing up while keeping their specific antigen-binding capability.
    • This modification often uses a process called “camelisation,” where hydrophobic parts typically seen in the VH/VL interface are replaced with hydrophilic parts like those in camelid VHH, along with an extension in the CDRH3.
    • These molecules, similar to nanobodies in size and structure, have properties like high stability, solubility, and a brief half-life. They are also easily fused with other proteins and produced in large quantities using microbes.
    • While dAbs themselves have some therapeutic uses, they’re mainly explored as fusion proteins combined with other entities, such as full antibodies for dual specificity, an Fc domain, or an anti-albumin dAb, as seen in GSK/Domantis’ AlbudAb®s.[26]

    dAbs: A structural overview

    dAbs are generally engineered antibody fragments, approximately 120 amino acids long. They are essentially the variable regions of the heavy chain of a human antibody with some key amino acid changes, described in the design part. However, in terms of a general structure, they have the following features:

    • In each of the two variable domains of the scFv, there are three distinct regions known as complementarity-determining regions (CDRs) that are interconnected by framework regions (FRs).
    • The CDRs play a key role in binding to antigens, with their structure tailored to match the epitope. On the other hand, the FRs serve primarily as a support structure and show minimal variability compared to the CDRs. Notably, each CDR contributes differently to antigen binding.
    • For example, the heavy chain’s CDR3 is especially vital, contributing 29% of the binding specificity, whereas the CDR2L’s contribution is a mere 4%.
    • dAbs, similar to VH, contain nine beta-strands that create a standard IgV fold. The absence of the VL in nanobodies leads to significant structural differences, particularly in the FR2 region and hypervariable loops. In a standard VH region, the FR2 has four conserved hydrophobic amino acids that help in VL joining. However, in dAbs, these hydrophobic residues are replaced with hydrophilic ones so that a hydrophobic surface is not left exposed to solvent. This change, coupled with the rotation of nearby residues and the protective folding of the CDR3 domain over this interface, enhances the solubility of dAbs compared to VH domains and scFvs.[26],[27]

    The process of camelisation (scFv to Nanobody)

    The things to be kept in mind are:

    • In the conventional VH region, FR2 contains four highly conserved hydrophobic amino acids (Val37, Gly44, Leu45, and Trp47) that, together with Gln39, Gly44, Tyr91, and Trp103, form a conserved hydrophobic interface of ~700 Å² to facilitate VL joining.

    The way to ‘camelise’ would be as follows:

    • Remove the entirety of the sequence except the VH.
    • Replace these four hydrophobic residues (Val37, Gly44, Leu45, and Trp47) with more hydrophilic amino acids (Phe37, Glu44, Arg45, and Gly47) to avoid the exposure of such a sizeable hydrophobic region to solvent.
    • Note that in camelid VHHs, in addition to this substitution, residues adjacent to this interface have rotated their side chains without deforming the Cα backbone, which increases the hydrophilicity of the VHH surface. Furthermore, the CDR3 domain of the VHH folds over this interface to shield the amino acids formerly covered by the VL partner. These alterations explain the greater solubility of VHHs compared with the single VH domain and scFvs.

    This simple change is all it takes to get a preliminary dAb! [26], [27]
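
The four substitutions listed above can be applied programmatically once a VH sequence has been numbered. The sketch below assumes the input is already a mapping from residue position (in the numbering scheme used in the text) to one-letter amino acid code; producing that numbering from a raw sequence is a separate step not shown here, and the names `CAMELISING_SUBSTITUTIONS` and `camelise` are hypothetical.

```python
# Substitutions named in the text: Val37->Phe, Gly44->Glu, Leu45->Arg, Trp47->Gly.
CAMELISING_SUBSTITUTIONS = {
    37: ("V", "F"),
    44: ("G", "E"),
    45: ("L", "R"),
    47: ("W", "G"),
}


def camelise(numbered_vh: dict) -> dict:
    """Return a copy of a numbered VH with the four FR2 substitutions applied."""
    result = dict(numbered_vh)
    for position, (expected, replacement) in CAMELISING_SUBSTITUTIONS.items():
        found = result.get(position)
        if found == replacement:
            continue  # already camelid-like at this position
        if found != expected:
            raise ValueError(
                f"expected {expected} at position {position}, found {found}"
            )
        result[position] = replacement
    return result
```

The function refuses to proceed if a position holds an unexpected residue, since that usually signals a numbering mismatch rather than a sequence that can be safely edited.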

    Full-size Ab to scFv

    • In the conventional VH region, FR2 contains four highly conserved hydrophobic amino acids (Val37, Gly44, Leu45, and Trp47) that, together with Gln39, Gly44, Tyr91, and Trp103, form a conserved hydrophobic interface of ~700 Å² to facilitate VL joining.

    So, to make a preliminary version of such a molecule, do the following:

    • Take the VH and VL regions and go through the sequences. Where the residues at the positions mentioned above differ from those listed, change them to match.
    • Join the 2 sequences with a linker: GGGGSGGGGSGGGGS.

    This simple change is all it takes to get a preliminary scFv!
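
Joining the two domains is a simple concatenation, sketched below. The VH-linker-VL orientation is our assumption (the text only says the two sequences are joined with the linker), and `build_scfv` is a hypothetical helper name; the short input sequences in the example are placeholders, not real variable domains.

```python
LINKER = "GGGGSGGGGSGGGGS"  # the (G4S)3 linker given in the text
STANDARD_RESIDUES = set("ACDEFGHIKLMNPQRSTVWY")


def build_scfv(vh: str, vl: str, linker: str = LINKER) -> str:
    """Concatenate VH, linker and VL into one chain (assumed order: VH-linker-VL)."""
    parts = {"VH": vh.upper(), "linker": linker.upper(), "VL": vl.upper()}
    for name, seq in parts.items():
        unknown = set(seq) - STANDARD_RESIDUES
        if unknown:
            raise ValueError(f"{name} contains non-standard residues: {sorted(unknown)}")
    return parts["VH"] + parts["linker"] + parts["VL"]
```

For instance, `build_scfv("EVQLV", "DIQMT")` gives `EVQLVGGGGSGGGGSGGGGSDIQMT`; the resulting protein sequence can then be back-translated and placed in the CDS of the mRNA construct.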

    RNA sequence optimisation with RiboTree

    As mentioned in a previous section, RNA is not very stable in general, mainly due to hydrolysis! So, how can we make RNA more stable? Prevent hydrolysis! To this end, Prof. Rhiju Das from Stanford developed RiboTree, a program that optimises RNA sequences for greater stability. It does this by minimising the average unpaired probability (AUP) of the bases in the RNA[27]. Why does this help? To understand that, we must first understand the mechanism of RNA sequence cleavage via hydrolysis:

    The mechanism involves:

    • Initiation by Deprotonation: The hydrolytic cleavage of an RNA backbone phosphodiester bond begins with the deprotonation of the 2′-hydroxyl group on the ribose sugar.
    • Transition State Formation: The deprotonated hydroxyl group then attacks the adjacent phosphate group, forming a pentacoordinate transition state. For this state to form, the RNA backbone must adopt a specific conformation where the 2′-hydroxyl group aligns with the departing 5′ oxyanion.
    • Cleavage and Strand Break: The 5′ oxyanion leaves, resulting in the formation of a 2′,3′-cyclic phosphate and a break in the RNA strand. This mechanism is universal to RNA and underlies the action of certain ribozymes and protein-based nucleases.

    Clearly, the hydrolysis mechanism is intrinsic to RNA and is influenced by factors like secondary structure. Structured regions within RNA, such as double-stranded areas, can restrict the molecule’s conformational flexibility, reducing its propensity for hydrolytic cleavage.
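
The average unpaired probability mentioned earlier can be computed from a base-pairing probability matrix. The sketch below is a minimal illustration, not the RiboTree implementation: it assumes a symmetric matrix `bpp`, where `bpp[i][j]` is the probability that bases i and j are paired, has already been obtained from an RNA folding package, and the function name is our own.

```python
def average_unpaired_probability(bpp):
    """Mean over all bases of (1 - total probability that the base is paired).

    bpp is a symmetric N x N matrix (list of lists); bpp[i][j] is the
    probability that bases i and j form a pair.
    """
    n = len(bpp)
    if n == 0:
        raise ValueError("empty base-pair probability matrix")
    unpaired = [1.0 - sum(row) for row in bpp]
    return sum(unpaired) / n


# Toy 4-nt example (made-up numbers): bases 0-3 pair with p = 0.8, bases 1-2 with p = 0.5.
toy_bpp = [
    [0.0, 0.0, 0.0, 0.8],
    [0.0, 0.0, 0.5, 0.0],
    [0.0, 0.5, 0.0, 0.0],
    [0.8, 0.0, 0.0, 0.0],
]
```

For the toy matrix the unpaired probabilities are 0.2, 0.5, 0.5 and 0.2, giving an average of 0.35. Following the argument above that structured regions resist hydrolysis, a lower value indicates that more of the sequence is protected by base pairing.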

    In the results section, you will see a comparison of the pre- and post-optimisation RNA sequences, on a scale. Though we have not produced these sequences and tested them out due to multiple constraints, it is one of our main future goals to do so.
