At the beginning of the new millennium, the Human Genome Project (HGP) released the complete human genome sequence (1,2). This milestone in 21st century biomedical research shifted research interests toward large-scale “omics” approaches, away from the functional study of single genes or proteins and toward system-wide studies of the entire genome or proteome. Many technical advances were made to support these demands, and analytical methodology has developed dramatically. For instance, the initial HGP began in 1990 and took a decade to collect the full sequence of a single human sample; however, next-generation sequencing techniques enable whole genomes to be sequenced within two weeks (3). These unprecedented advances in genomics were facilitated by the intrinsic base-pairing property of DNA/RNA, which is exploited for PCR amplification and massively parallel analysis. By comparison, large-scale proteome-wide analyses face fundamental hurdles: no method is available to amplify specific proteins, and current mass spectrometry (MS)-based profiling technology has a relatively narrow dynamic range. Accordingly, while a huge amount of raw genome sequence data is available, the biological functions of most gene products are not fully understood; whether all of the coding sequences are translated into gene products (proteins) also remains unclear. Furthermore, considering alternative splicing and post-translational modifications (PTM), the system-wide functional annotation of a proteome is a complex task. To address this challenge, proteomics should identify the biological functions of all proteins in a living organism.
Living organisms express various functional proteins and maintain dynamic homeostasis. To reveal the function of each protein in this dynamic system, researchers have adapted two approaches from traditional genetics: i) forward and ii) reverse proteomic approaches (4). In the forward proteomic approach, a test organism is exposed to non-specific shocks that perturb protein levels, generating unique phenotypic changes. The quantitative proteomic signatures from control and test subjects are compared to identify the proteins responsible for each specific phenotype. Because a disease model has a unique phenotype compared with normal samples, mainstream biomarker discovery is a type of forward proteomic analysis. In the reverse proteomic approach, the function of a specific protein is perturbed using genetic knockout, knockdown by RNA interference (RNAi), or small-molecule inhibitors, and the resulting phenotype is then correlated with the function of the protein. In both cases, a global quantitation method is critical. This need drove the development of the first and most commonly used small-molecule probes (Cy3-NHS, Cy5-NHS) for quantitative protein visualization (5,6). The comparative quantitation strategy is called differential gel electrophoresis (DIGE). Two distinct colors of chemically reactive fluorescence probes are used to treat the vehicle and sample, respectively; an N-hydroxy succinimidyl (NHS) ester group reacts with primary amines at any position in a protein. The labeled proteomes are mixed in equal amounts, followed by gel electrophoresis. The fluorescence intensity of each gel spot represents the relative amount of protein. Although DIGE is a straightforward method, it has serious limitations, including poor reproducibility and low throughput (7). Moreover, monitoring proteins that are extremely large, small, acidic, basic, low in abundance, or membrane-bound remains extremely difficult.
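The two-color comparison used in DIGE can be illustrated with a short numerical sketch. The example below is not taken from the cited work: the function name, the spot identifiers, and the intensity values are hypothetical, and median centering is one assumed way to correct for dye or loading bias when most spots are expected to be unchanged.

```python
import math
from statistics import median


def dige_log2_ratios(spots):
    """Return median-centered log2(sample/control) ratios per gel spot.

    spots: dict mapping a spot id to (control_intensity, sample_intensity),
    i.e. the fluorescence of the two dyes measured on the same spot.
    """
    # Raw log2 ratio of the two fluorescence channels for each spot.
    raw = {sid: math.log2(sample / control)
           for sid, (control, sample) in spots.items()}
    # Assumption: most proteins do not change, so the median ratio
    # reflects global dye/loading bias and is subtracted.
    center = median(raw.values())
    return {sid: value - center for sid, value in raw.items()}


# Hypothetical intensities (arbitrary units) for three gel spots.
example_spots = {
    "spot_1": (1000.0, 2000.0),
    "spot_2": (500.0, 1000.0),
    "spot_3": (800.0, 6400.0),
}
dige_ratios = dige_log2_ratios(example_spots)
```

In this toy data set, the sample channel is uniformly twice as bright for two of the spots, which the centering step treats as bias; only the third spot remains as a relative increase.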
To overcome some of these key issues, alternative approaches have been developed, such as protein microarrays (8,9), lab-on-a-chip (10,11), capillary electrophoresis (CE)/MS (12-14), or liquid chromatography (LC)/MS (15,16). Liquid chromatography has benefitted from significantly improved resolving power and robust reproducibility. In addition, coupling LC with high-resolution MS (LC/MS) allows precise structural and sequential analysis within a complex proteome, making LC/MS an indispensable tool for modern proteomics research (15,16). In MS-based quantitative proteomics, another small-molecule probe was introduced by the Aebersold group (17). Unlike DIGE, the authors envisioned that they could combine quantitation and identification during MS analysis using isotope-coded affinity tags (ICAT). ICAT probes come in “heavy” (deuterated) and “light” (non-deuterated) versions of a chemically identical probe, differing only in mass. In principle, ICAT probes covalently bind to specific amino acid side chains, and the labeled peptides are enriched, maximizing the signal-to-noise ratio during quantitation. Inspired by isotope labeling techniques, the Mann group introduced a more general metabolic labeling method called stable isotope labeling by amino acids in cell culture (SILAC) (18). Due to their simplicity and accuracy, these isotope labeling quantitation methods have been extended from cell culture to whole organisms (19) and human tumor tissue (20).
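Isotope-based quantitation with ICAT or SILAC reduces to comparing the MS signal of the heavy and light forms of the same peptide. The sketch below shows one simple way to summarize such pairs at the protein level. It is an illustration under stated assumptions, not the procedure of the cited studies: the function name, the protein identifiers, and the intensities are hypothetical, and the median is an assumed (commonly used) summary statistic.

```python
from collections import defaultdict
from statistics import median


def protein_heavy_light_ratios(peptides):
    """Summarize heavy/light peptide pairs as one ratio per protein.

    peptides: iterable of (protein_id, light_intensity, heavy_intensity),
    one entry per quantified peptide pair.
    """
    ratios_by_protein = defaultdict(list)
    for protein_id, light, heavy in peptides:
        # Skip pairs with a missing channel; a ratio is undefined there.
        if light > 0 and heavy > 0:
            ratios_by_protein[protein_id].append(heavy / light)
    # Assumption: the median of the peptide ratios represents the protein.
    return {protein_id: median(ratios)
            for protein_id, ratios in ratios_by_protein.items()}


# Hypothetical peptide-level measurements (arbitrary intensity units).
example_peptides = [
    ("P1", 100.0, 200.0),
    ("P1", 50.0, 150.0),
    ("P1", 10.0, 40.0),
    ("P2", 100.0, 50.0),
    ("P3", 0.0, 10.0),  # light channel missing, so this pair is skipped
]
protein_ratios = protein_heavy_light_ratios(example_peptides)
```

Using the median rather than the mean keeps a single outlying peptide pair from dominating the protein-level value, which is why it is chosen here.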
While conventional proteomic profiling techniques have focused on monitoring system-wide protein expression levels, directly assessing the functional state of proteins in a living organism requires alternative profiling methods that measure enzymatic activity because this activity is closely correlated with protein function. Small-molecule probes have been used extensively in biological studies. Unlike molecular biology-based tools, such as RNAi or antibody-based probes, small molecules often lack specificity. However, the interaction partners of small molecules are not random. Instead, their structures and functional groups are recognized by specific proteins that share similarities in sequence, 3D structure, enzymatic intermediates, and so on (21,22). To capitalize on these properties of small molecules, Cravatt and Sorensen proposed and demonstrated activity-based protein profiling (ABPP) (23-25). ABPP is a chemical proteomic strategy that utilizes small-molecule probes to form covalent bonds at the active site of an enzyme and thereby profile the functional state of the intact enzyme. Several excellent reviews have described this principle and the early work (26-28). In this review, we will focus on strategies for developing activity-based probes (ABPs) and will highlight recent progress.
STRATEGIES FOR DESIGNING ACTIVITY-BASED PROBES (ABPs)
ABPs are small-molecule probes that form covalent bonds with target proteins and are employed for visualization or enrichment. Since the first rationally designed ABP was reported (29), various functional groups have been exploited to modulate target enzyme classes, generating several categories (27,30). Here, we classify ABP design into three strategies (Fig. 1).
Fig. 1. Three strategies to design activity-based probes and general workflow of activity-based protein profiling.
Enzymatic intermediate inspired ABP development
ABPs inspired by the mechanistic intermediates of enzymes have been extensively studied. This type of ABP utilizes electrophilic functional groups and targets the active-site residue or an adjacent nucleophilic residue within the reaction pocket. Depending on the catalytic reaction of each enzyme, the nucleophilicity of the active-site residues is usually conserved. Accordingly, early efforts focused on assessing the nucleophilicity of numerous enzyme classes. Various functional groups, including bromo/chloro-acetamide, epoxide, electrophilic ketone, fluorophosphonate, phosphonate, 5'-p-fluorosulfonylbenzoyl, and vinyl-sulfone groups, have been used for enzyme superclass-specific ABPs. Representative ABPs are listed in Table 1.
Table 1. ABPs for specific enzyme classes
Hydrolases compose a large class of enzymes, comprising almost 1% of mammalian genomes while participating in numerous physiological processes (31). These enzymes mediate substrate hydrolysis reactions via base-activated serine/cysteine nucleophiles. Drawing on an intermediate step of this mechanism, the Cravatt group prepared a series of ABPs containing fluorophosphonates (FP) and demonstrated their use for visualizing enzymatic activity (29). Notably, these ABPs discriminate between the active and inactive forms of serine hydrolases rather than reporting the total content. In addition, FP-based ABPs selectively labeled their targets in a complex proteome mixture. Several irregularities have persisted, including proteases that exhibit low labeling efficiency. Alternative chemical probes based on peptidic arylphosphonates show enhanced labeling efficiency and alternative target engagements (32).
Kinases are enzymes that mediate phosphate transfer, making them key therapeutic targets; profiling their activity is becoming crucial in drug discovery. More than 518 kinases have been reported, comprising 134 families (33). Nevertheless, all kinases share a common substrate: adenosine triphosphate (ATP). While the protein kinase superfamily has a conserved ATP binding pocket, developing an ABP for kinases remains challenging due to the lack of nucleophilic residues near the active site. Instead of engaging a precise active-site residue, researchers from GlaxoSmithKline and ActivX Biosciences designed ABPs for kinases that target a conserved adjacent lysine residue located within the ATP binding pocket (34,35). One example uses an acyl phosphate functional group, and they demonstrated that it could target at least 75% of known human kinases in a single experiment. These results revealed the power of ABPP compared with conventional antibody-based kinase activity measurements, but a few shortcomings remain. First, kinase ABPs do not use an enzymatic substrate, in contrast to an ideal ABPP process. Therefore, kinase ABPs suffer from diverse cross-reactivity with other proteomic nucleophiles. Efforts to generate a direct kinase substrate ABP were reported recently, but that ABP was far more specific for histidine kinase due to the intrinsic auto-phosphorylation reaction of histidine kinase (36). Second, these probes showed relatively low affinity for the ATP binding site. Generally applicable kinase ABPs are needed for proteome-wide kinase activity profiling.
Small molecule enzyme inhibitor mimetic ABP development
Mechanistic intermediate-inspired approaches have a limited scope of enzyme targets because most enzyme activity does not require nucleophilic addition or a substitution reaction. As an alternative approach, ABP design has also benefited from small-molecule enzyme inhibitors. Although some probes in this class do not produce a reactive intermediate state, all of these ABPs can profile enzyme activities. Like the drug-repositioning strategy, transforming irreversible inhibitors into ABPs is one effortless tactic. Secure introduction of the reporter entity remains the only problem. Biotin is the most popular reporter tag used for enrichment, while fluorophores are used for detection. However, attaching a tag moiety to an inhibitor is not straightforward because covalently attaching the biotin or fluorophore may interfere with the original inhibition activity, or the modified compound may no longer be cell-permeable due to the bulky tag. Accordingly, various bio-orthogonal tagging reactions have been studied to resolve these problems (37,38). The most successful bio-orthogonal reactions involve “click chemistry”, specifically alkyne-azide cycloaddition. Via click ligation, ABPs containing either an azide or an alkyne can be selectively labeled with biotin or a fluorophore.
The Pezacki group designed ABPs for fatty acid synthetase (FASN) using an FDA-approved drug, Orlistat (39). Orlistat, or tetrahydrolipstatin, is an anti-obesity drug that irreversibly inhibits FASN activity via covalent bond formation between a beta-lactone motif and the active site (S2308). Instead of covalently attaching the biotin or fluorescent tag, the Orlistat-based ABP incorporated an alkyne at the end of an alkyl chain. From a FASN activity profile taken during a hepatitis C virus (HCV) infection, the authors discovered a selective increase in FASN activity when both the HCV core and the NS4B protein were present, suggesting that an ample amount of fatty acids may be required for efficient HCV replication; FASN activity might therefore be diagnostic for HCV infections.
Compared to the irreversible inhibitor-based methods, reversible enzyme inhibitors require an additional functional moiety to label the target protein. One prominent tactic is to form a covalent bond using a photoactivatable group, such as benzophenone or diazirine. Photo-crosslinking is advantageous because the labeling event is not biased by the reactivity of the amino acid side chains; the reaction is mediated by a radical intermediate. After the reversible enzyme inhibitor is bound to its target, photoactivation covalently labels the proximal protein. Consequently, ABPs based on reversible inhibitors usually consist of a photoactivatable group, a reporter tag, and an inhibitor. This concept was applied to ABPs designed for histone deacetylase (HDAC) (40) and an adenylating enzyme (41).
The Cravatt group designed ABPs for HDAC based on the scaffold of a class I/II HDAC inhibitor: suberoylanilide hydroxamic acid (SAHA). Because SAHA is a reversible HDAC inhibitor, they modified the structure to include a photoactivatable group (benzophenone) and an alkyne tag. The ABP for HDAC (SAHA-Bpyne) was applied to aggressive and nonaggressive melanoma cells, successfully profiling all HDAC classes. Among the HDACs, SAHA-Bpyne could reveal two proteins (HDAC6, CoREST) that were significantly altered depending on the aggressiveness.
The Aldrich group designed ABPs for an adenylating enzyme in a similar manner. Starting from a reversible inhibitor of the adenylating enzyme (MbtA), 5’-O-[N-(salicyl)sulfamoyl]adenosine (Sal-AMS), the authors introduced a benzophenone group and an alkyne tag, producing Sal-AMS ABP. Sal-AMS ABP was applied to study the mechanism of the anti-tubercular activity displayed by Sal-AMS.
Small-molecule inhibitor mimetic ABPs are hindered by the limited number of enzymes that they engage, which reduces their generality. Notably, medicinal chemists have been keen to discover ultra-selective enzyme inhibitors in recent decades, whereas chemical biologists are currently developing small-molecule ABPs that can engage a broad range of enzyme superfamilies.
ABP discovery from un-biased screening
The last strategy is the discovery of novel ABPs from un-biased screening. Diversity-oriented organic synthesis and high throughput screening (HTS) have led to the discovery of biologically active molecules (4) and the development of imaging probes (42-44). Compared to hypothesis-driven strategies, these strategies rely on chemical diversity. Combinatorial probe designs reveal novel ABPs for various enzymes. Initially, the Cravatt and Sorensen groups prepared a small combinatorial library of sulfonates, addressing the amount of chemical diversity necessary to target diverse enzyme classes (45). Un-biased screening against the whole proteome revealed that a sulfonate functional group could target at least six mechanistically distinct enzyme classes depending on the chemical structure of the probe. Encouraged by these results, the same group evaluated the targeting capacity of the probes by expanding their chemical diversity with a di-peptide library and an α-chloroacetamide (α-CA) group, identifying more than 10 classes of enzymes that could be targeted by α-CA probes (46). Beyond enzyme engagement that depends on the reactive functional group, un-biased screening can also elucidate novel ABPs for specific enzymes. The Bogyo group discovered ABPs for caspases with a positional scanning combinatorial library (PSCL) (47). Using solid-phase chemistry, they prepared a library of nitrophenyl acetate (NP)-capped tetrapeptide acyloxymethyl ketone (AOMK) probes to discover a novel ABP for caspases. This probe and its descendants were used to monitor caspase activity in cell-free systems and in live cells to study the intrinsic death pathway (apoptosis) (47,48).
As ABPP attracts significant attention in many fields, the use of ABPs is no longer limited to enzyme activity profiling. Several research groups have demonstrated clever ideas to resolve significant biological problems. In the next section, we will cover ABPs used for purposes other than measuring activity.
DISCOVERIES OF SPECIFIC PROTEIN-SMALL MOLECULE, PROTEIN-PROTEIN INTERACTION AND MORE
Identifying the functional targets of biologically active molecules is critical for understanding the mode of interaction between a protein and a small molecule (4,62). Traditionally, affinity matrix techniques have been used to identify or deconvolute targets. These in vitro lysate assay conditions lose the intact biological context, leading to false-positive target identifications. The Chang group demonstrated the importance of the choice of target ID method by carefully comparing the affinity matrix method and in situ labeling using a chloroacetamide (CDy2) probe (60). For this small-molecule probe, which fluoresces upon recognizing different muscle cell states, the identified binding partner depended on the target ID technique: tubulin with the affinity matrix method and aldehyde dehydrogenase-2 (ALDH2) with the in situ labeling method. Like other ABPs, CDy2 covalently bound to a cysteine residue in the active site of ALDH2 (63). However, CDy2 exhibited extraordinary selectivity for a single protein, ALDH2, which is one isoform of aldehyde dehydrogenase, in a complex proteome. Studies of ABPs containing a benzyl halide (59) and an α,β-unsaturated ketone (61) likewise revealed ABPs that label a single protein in the whole proteome.
Fig. 2. Enzymatic reaction-mediated bio-imaging applications. Left: Selective direct labeling of target protein; right: Non-selective enzymatic labeling of adjacent proteins.
These ultra-selective small-molecule probes evoke the “magic bullet” concept, as coined by Paul Ehrlich (64). A magic bullet is an ideal drug molecule that targets a disease-causing protein, and the above ABPs fit this criterion. While medicinal chemists have attempted to design drug molecules that interact with single target proteins, most drug molecules suffer from off-target effects, and their binding partners have not been identified. Consequently, the Yao group developed ABPs for known drug molecules to reveal the targets in situ (65,66). They synthesized ABPs with Dasatinib using alkyne and photoactivatable tags; this anti-cancer drug molecule targets protein kinases. Numerous unknown targets were revealed, and the authors demonstrated the power of ABPs for studying drug on/off-target identification.
In addition to interactions between proteins and small molecules, protein-protein interactions are critical for various post-translational modifications (PTM). The Kapoor group designed two peptide-based ABPs using benzophenone and alkyne tags, with or without trimethylation of lysine-4 at the histone H3 N-terminus (H3K4Me3 vs H3K4) (58). A quantitative comparison of the proteins enriched by the two ABPs via SILAC identified novel proteins that interact during histone methylation, including MORC3.
The last application of ABPs is enzymatic reaction-mediated bio-imaging (Fig. 2). Two types of catalytic imaging methods are known. The first is a selective direct labeling method. The Chang group screened 43 chemically reactive fluorescent probes in live cells; one fluorescein derivative containing a benzyl halide group showed selective labeling of glutathione-S-transferase omega-1 (GSTO1) (59). With this fluorescent probe and protein pair, they demonstrated a universal bio-imaging tag: the omega-tag (67). Similar to GFP or the FlAsH/ReAsH tag (68), the omega-tag can visualize proteins in situ, but its fluorescence signal is based on an enzymatic reaction rather than the fluorescence of the tag. The second approach is the non-selective enzymatic labeling of adjacent proteins. The Ting group envisioned that a highly reactive radical species could be utilized for spatially and temporally restricted labeling (69). Ascorbate peroxidase (APEX) oxidizes numerous phenol derivatives to form phenoxyl radicals with short lifetimes (<1 msec) and narrow labeling radii (<20 nm). The caged form of the ABP, which contains a phenol group and an enrichment tag, is activated only after the enzymatic reaction, and the active ABP forms covalent bonds with adjacent proteins, which can then be used for bio-imaging. To validate their technique, the authors expressed APEX in the mitochondrial matrix and demonstrated its selectivity in super-resolution bio-imaging and proteomic profiling.
Defining the biological functions of proteins requires selective tools for monitoring their expression levels and activities. As an alternative to conventional antibody-based methods, methods based on small-molecule probes allow in situ investigation without disrupting living conditions. The advantages of chemical proteomics include its combinatorial target engagement strategies, which can be expanded depending on the chemical structure of the probes. The development of small-molecule probes has seen prominent and rapid progress in recent years. While major research has focused on measuring enzymatic activity for ABPP, numerous challenges remain. First, the targeted enzyme classes for ABPs must be extended. Many ABPs target enzymes based on the nucleophilicity of the active site. After combining ABPs with bio-orthogonal chemistry, a variety of ABPs will be available for all enzymatic classes. Second, the demand for analytical techniques must be fulfilled, integrating the ABP profile with conventional genomic/proteomic/metabolomic data. The vast quantity of biological data from omics studies is easily accessible, and ABPs will provide copious data from another dimension. Because combining information in a single platform is not always straightforward, our analytical tools and techniques must be upgraded for specific probes. Fluorescence imaging and high-content screening (HCS) are some of the platforms that provide next-generation omics data, and the importance of analysis techniques is consequently emphasized (70-73). As the field matures and approaches a fuller understanding of biological processes, small-molecule probes will be employed as universal tools to provide valuable insights.