THE JOURNAL OF BIOLOGICAL CHEMISTRY
© 1990 by The American Society for Biochemistry and Molecular Biology, Inc.
Vol. 265, No. 11, Issue of April 15, pp. 6104-6111, 1990
Printed in U.S.A.
Characterization of the Gene for Human Plasminogen, a Key Proenzyme in the Fibrinolytic System*
(Received for publication, November 27, 1989)
The organization and structure of the gene coding for plasminogen has been determined by a combination of in vitro amplification of leukocyte DNA from normal individuals and isolation of unique clones from three different human genomic libraries. These clones were characterized by restriction mapping, Southern blotting, and DNA sequencing. The gene for human plasminogen spanned about 52.5 kilobases of DNA and consisted of 19 exons separated by 18 introns. DNA sequence analysis revealed that the five kringle structures in plasminogen were each coded by two exons. The nucleotides in the introns at the intron-exon boundaries were GT-AG, analogous to those found in other eukaryotic genes. Three polyadenylation sites for plasminogen mRNA were also identified. When the amino acid sequences deduced from the genomic DNA and cDNAs of plasminogen were compared with that of the plasma protein determined by amino acid sequence analysis, an apparent amino acid polymorphism was observed in several positions of the polypeptide chain. Nucleotide sequence analysis of the amplified genomic DNAs and genomic clones also revealed that the plasminogen gene was very closely related to the genes for several other proteins, including apolipoprotein(a). This protein may have evolved via duplication and exon shuffling of the plasminogen gene. The presence of another plasminogen-related gene(s) in the human genomic library was also observed.
Plasminogen is a glycoprotein that circulates in plasma as a proenzyme. It is converted to plasmin by tissue plasminogen activator (tPA)¹ in the presence of a fibrin clot or by urokinase (1). Plasmin then digests the insoluble fibrin clot into soluble fragments during tissue repair and recanalization. The molecular weight of native Glu-plasminogen is about 93,000, as estimated by sodium dodecyl sulfate-polyacrylamide gel electrophoresis. Plasmin also converts Glu-plasminogen to Lys-plasminogen by releasing an NH2-terminal fragment (Mr 8,000) called a preactivation peptide (2). Lys-plasminogen is more readily activated by plasminogen activators and binds to fibrin with greater affinity than native Glu-plasminogen.
Downloaded from www.jbc.org at INSTITUTE OF MICROBIAL TECHNOLOGY LIBRARY: (CSIR), on July 27, 2011
* This work was supported in part by Research Grant HL 16919 from the National Institutes of Health. The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked “advertisement” in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
The nucleotide sequence(s) reported in this paper has been submitted to the GenBank™/EMBL Data Bank with accession number(s) J05286.
‡ Supported by International Research Fellowship 3 F05 TW03433-01S1 from the John E. Fogarty International Center for Advanced Study in the Health Sciences. Present address: Dept. of Molecular Biology, University of Aarhus, C. F. Mollers Alle 130, DK-8000 Aarhus C, Denmark.
§ Supported by National Research Service Award 5 F32 HL07816 from the National Institutes of Health.
¹ The abbreviations used are: tPA, tissue-type plasminogen activator; kb, kilobase(s); bp, base pair(s).
The primary structure of human plasminogen (791 amino acids) has been established by amino acid sequence analysis (5-7) and cDNA cloning (8,9). It is a single-chain glycoprotein consisting of a preactivation peptide (77 amino acid residues), five tandem structures called “kringle” domains (about 90 residues each), an activation cleavage site (between Arg-561 and Val-562), and a catalytic domain including the serine protease triad of His-603, Asp-646, and Ser-741. The kringle structures are also found in a number of other proteins, such as tPA, urokinase, factor XII, prothrombin, and apolipoprotein(a). The last protein is highly homologous with plasminogen and contains up to 37 tandem repeats of plasminogen kringle 4 (10, 11). The first kringle in plasminogen (12, 13) and the second kringle in tPA (14, 15) function as a binding site for fibrin. The function of the kringles in the other proteins has not been established. Since several cases of plasminogen abnormalities and deficiencies have been identified in association with thrombosis (16), it was important to determine the structure and organization of the normal gene in order to compare it with abnormal genes. Knowledge regarding the gene for plasminogen could also provide some insight as to its regulation as well as its evolution in relation to other closely related genes, such as the gene coding for apolipoprotein(a). In previous studies, cDNAs (8, 9, 17) and several genomic clones (8, 17) coding for plasminogen were isolated and the sequence of the DNA coding for a portion of kringle 4 in the human gene was reported (8). In the present studies, the sequence of the 5’- and 3’-flanking regions, the exons, and the intron-exon boundaries of the entire gene coding for human plasminogen are presented and compared with several closely related proteins.
Restriction endonucleases, nuclease Bal-31, and T4 DNA ligase were purchased from Bethesda Research Laboratories or New England Biolabs. T7 DNA polymerase and sequencing kits were purchased from the United States Biochemical Corp. The Klenow fragment of Escherichia coli DNA polymerase, bacterial alkaline phosphatase, ATP, deoxynucleotides, dideoxynucleotides, M13mp18, M13mp19, pUC18, and pUC19 were supplied by Bethesda Research Laboratories. ³²P-Labeled nucleotides were obtained from Du Pont-New England Nuclear, and [α-³⁵S]dATP was provided by Amersham Corp. Two human genomic libraries cloned into Charon 4A (18) and EMBL3 (19) were kindly provided by Drs. Tom Maniatis and Shinji Yoshitake, respectively. Additional human leukocyte and lung fibroblast genomic libraries were obtained from Clontech and Stratagene, respectively. Oligonucleotides were synthesized using a nucleotide synthesizer (Applied Biosystems Inc.) and kindly provided by Dr. Patrick S. H. Chou, University of Washington.

Two cDNAs of 1.9 kb (8) and 2.7 kb (17) were principally employed. The 2.7-kb cDNA was isolated from a normal liver cDNA library and started with nucleotide 100 (Fig. 2). It contained the same 3′ end as the 1.9-kb cDNA. A third cDNA of 3.4 kb was also employed. It was isolated from a human Hep G2 cDNA library and extended beyond the stop codon in the smaller cDNAs by 750 nucleotides. This 3.4-kb cDNA resulted from the utilization of an alternative polyadenylation site.

The Hep G2 cDNA library was kindly provided by Dr. Fred Hagen, Yim Foon Lee, and Jeff Harris, ZymoGenetics, Inc., Seattle, WA.

Genomic clones containing the gene for human plasminogen were obtained by screening human genomic libraries by the in situ hybridization technique using a partial cDNA or the 5′ and 3′ portions of the cDNA coding for human plasminogen. To obtain genomic clones containing certain exons, appropriate restriction fragments from the cDNA or synthetic oligonucleotides were used for further screening or for identification of isolated clones by Southern blot analysis. Phage DNA was prepared by the liquid culture lysis method (20), followed by centrifugation and banding on a cesium chloride step gradient (21). Genomic DNA inserts were isolated by digestion of the phage DNA with EcoRI or SalI and EcoRI endonuclease followed by subcloning into plasmid pUC18 or pUC19.

Additional restriction fragments from the inserts were also subcloned into M13mp18 or M13mp19 to obtain overlapping sequences. Digestions with nuclease Bal-31 were also performed to generate DNA fragments that provided overlapping sequences with restriction fragments (24). The genomic DNA inserts were sequenced by the dideoxy method (22) employing [α-³⁵S]dATP and buffer gradient gels (23). Oligonucleotides were synthesized as sequencing primers to obtain DNA sequence of the second strand for several regions in the gene. Sequence data were obtained by employing at least two overlapping independent fragments. The DNA sequence was determined two or more times, and approximately 90% of the sequencing was carried out on both strands.

The 5′ and 3′ portions of the gene for plasminogen were also established by in vitro amplification employing the polymerase chain reaction (25). Genomic DNA samples were prepared from the leukocytes of normal individuals by standard techniques (26). One to five µg of genomic DNA was amplified in a 100-µl reaction mixture containing 50 mM KCl, 10 mM Tris-HCl (pH 8.4), 1.5 mM MgCl2, gelatin at 200 µg/ml, the four deoxynucleotide triphosphates each at 200 µM, two oligonucleotide primers each at 1-10 µM, and 2.0 units of Taq DNA polymerase obtained from New England Biolabs or Perkin-Elmer-Cetus. Each sample was placed in a small Eppendorf tube and overlaid with 75 µl of mineral oil to prevent evaporation.

FIG. 1. EcoRI restriction map and location of the exons in the gene for human plasminogen. The 19 exons are shown with wide vertical bars and are numbered with Roman numerals, while the 14 EcoRI restriction sites are shown with narrow vertical bars. The six overlapping λ phage clones with DNA inserts coding for plasminogen are also shown. The 5′ and 3′ portions of the gene (14.5 and 3.5 kb, respectively) were amplified by the polymerase chain reaction, and these fragments are listed as PCR 1-13 for the 5′ end and PCR 14-16 for the 3′ end of the gene.

TABLE I
Nucleotide sequences of the primers employed for amplification of the gene coding for plasminogen
[The Size (kb) column is not recoverable from this copy.]

PCR fragment  Primer  Nucleotide sequence
1             5′      GATCGAATTCCGCAGACATTCCACC
              3′      CACAGAATTCCATGGCATATGTATTTTTACTAC
2             5′      CTGCGAATTCTGGCAACCACTAATCTAC
              3′      GGGTATTCACATAGTCATCCAGAGGCTCTCC
3             5′      GATGAAGCTTGTAGTTTTATTTGAAAAGAAAGGT
              3′      ATTAAAGCTTGTCGAGATATGGTCCACTTCAA
4             5′      TGTAAGCTTCAGAGTGCAAGACTGGGAATGGAAAG
              3′      TTGGAAGGAATGTATCCATGAGCGTGTGGG
5             5′      GGGACCCACTTTCTGGGCACTGCTGGCC
              3′      CCATAAGCTTGTATGCCTAAATGGGTGAATTC
6             5′      AAGCAGCTGGGAGCAGGAAGTAT
              3′      TTTTCAAATAAAACTACATCTCTCATC
7             5′      ATTAAAGCTTACAAGTAGCAAGCAAACGGT
              3′      GTAAAGCTTTCCATTCCCAGTCTTGCACTCTGA
8             5′      ATTTGAATTCATCCATTTCAGTTTTCTTCTTC
              3′      TGTAAGCTTTTGATTTCAAGAACAGGGC
9             5′      GGGACCCACTTTCTGGGCACTGCTGGCC
              3′      GGGTATTCACATAGTCATCCAGAGGCTCTCC
10            5′      GATGAAGCTTGTAGTTTTATTTGAAAAGAAAGGT
              3′      GTAAAGCTTTCCATTCCCAGTCTTGCACTCTGA
11            5′      CATCGAATTCTGCCTTGCTAATAGCAAGC
              3′      TTTACATGTGTAAAAATCACTCAACAGAAT
12            5′      TAGTAAGCTTCTTTATTTATGTCCAAATGCCCG
              3′      TATTAAGCTTACCGTTTGCTTGCTACTTGTAA
13            5′      TGTAAGCTTCAGAGTGCAAGACTGGGAATGGAAAG
              3′      ACACTCAAGAATGTCGCAGTAGTCATATCTC
14            5′      GGTAGTCAAGAGGAGCTTCCTCCCTGCAGC
              3′      ACAGAGTTCGGTGGATTGGACTCTTCCATTCAG
15            5′      GGAAGAGTCCAATCCACCGAACT
              3′      CACAGTCACTTGCAGTTTTGCTTTTCTCTG
16            5′      GGTAGTCAAGAGGAGCTTCCTCCCTGCAGC
              3′      CACAGTCACTTGCAGTTTTGCTTTTCTCTG
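Several of the primers in Table I appear to carry an EcoRI (GAATTC) or HindIII (AAGCTT) recognition sequence near the 5′ end. The text does not say why, so the following is only an illustrative sketch of how the primer list could be screened for such sites. The function names and the choice of three fragments are assumptions made for the example; the recognition sequences are the standard ones for these two enzymes.

```python
# Recognition sequences for two common cloning enzymes (standard values).
# Both sites are palindromic, so scanning one strand is sufficient.
SITES = {"EcoRI": "GAATTC", "HindIII": "AAGCTT"}

# A few primers copied from Table I (PCR fragments 1, 3, and 6).
PRIMERS = {
    "PCR 1, 5'": "GATCGAATTCCGCAGACATTCCACC",
    "PCR 1, 3'": "CACAGAATTCCATGGCATATGTATTTTTACTAC",
    "PCR 3, 5'": "GATGAAGCTTGTAGTTTTATTTGAAAAGAAAGGT",
    "PCR 6, 5'": "AAGCAGCTGGGAGCAGGAAGTAT",
}


def find_sites(seq):
    """Return {enzyme: [0-based start positions]} for every site found in seq."""
    hits = {}
    for enzyme, site in SITES.items():
        positions = []
        start = seq.find(site)
        while start != -1:
            positions.append(start)
            start = seq.find(site, start + 1)
        if positions:
            hits[enzyme] = positions
    return hits


def gc_fraction(seq):
    """Fraction of G and C bases in the primer."""
    return (seq.count("G") + seq.count("C")) / len(seq)


for name, seq in PRIMERS.items():
    print(name, len(seq), round(gc_fraction(seq), 2), find_sites(seq))
```

In this sample the site, when present, starts at the fifth base of the primer, which would be consistent with a short 5′ tail added for subcloning, although that reading is an inference rather than something the paper states.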
FIG. 2. Nucleotide sequence of the 5′- and 3′-flanking regions, the exons, and the intron/exon boundaries for the gene coding for human plasminogen. The amino acid sequence predicted by the coding region of each exon is indicated above the corresponding DNA sequence employing the one-letter amino acid code. The DNA sequence upstream from the cytosine listed as nucleotide 1 is shown in the left margin with negative numbers. The amino acids in the signal sequence are also shown with negative numbers, while those in the mature protein are shown with positive numbers in the left margin. The 5′ and 3′ ends of each exon are enclosed in brackets. The solid vertical arrows indicate the cleavage site for the signal peptide and the cleavage site for the conversion of plasminogen to plasmin. CCAAT boxes and TATAA sequences are underlined, as well as the 3′-noncoding region with sequences that are apparently involved in mRNA processing. The sequences used for the preparation of amplifying primers are underlined or overlined and begin with an asterisk. The sites of polyadenylation at the 3′ end of the gene are shown with a diagonal slash.
The samples were then subjected to 25 or 30 cycles of amplification by heating at 94 °C for 1 min to denature the DNA, cooling to 60-70 °C for 2 min to anneal the primers, and incubating at 72 °C for 3 min to extend the annealed primers. At the end of the last cycle, the samples were incubated at 72 °C for 7 min to ensure the completion of the final extension step. After precipitation with ethanol and resuspension in 100 µl of 10 mM Tris-HCl and 1 mM EDTA buffer (pH 7.5), a 5- or 10-µl aliquot was applied to a 0.8 or 1.5% agarose (IBI) gel containing 0.5 µg/ml of ethidium bromide in 89 mM Tris base, 89 mM boric acid, and 20 mM EDTA buffer (pH 7.8).

To select the correct genomic clones coding for plasminogen and to exclude those for a plasminogen-related gene(s), the isolated phage clones were first amplified by the polymerase chain reaction, as described above. Several portions of the amplified phage DNA were then subjected to DNA sequence analysis, as described. The phage clones that were shown to contain the nucleotide sequences coding for exons that matched the corresponding regions of the cDNAs for plasminogen were employed for further analysis. DNA sequences were analyzed by the Genepro program (Version 4.1, Riverside Scientific Enterprises, Seattle, WA) employing a Tandy 3000 computer.

RESULTS AND DISCUSSION

The Middle Portion of the Gene for Plasminogen-Three clones (λ1, λ2, λ3) containing the gene for human plasminogen were initially isolated from approximately 2 × 10⁶ phage of the AluI/HaeIII genomic library of Lawn et al. (18) using the cDNA of 1.9 kb as a probe (Fig. 1) (8). These three clones were found to be unique by restriction enzyme digestion and Southern blotting analysis. DNA sequence analysis revealed that these genomic clones contained the middle portion of the gene for plasminogen extending from exons VII to XVII (Fig. 1). This corresponded to the central part of the cDNA coding for the polypeptide chain of plasminogen extending from the second half of kringle 2 (Lys-204) to the middle portion of the catalytic chain (Gly-690). Since these three clones did not contain the 5′ and 3′ portions of the gene, appropriate restriction fragments from the cDNA of 1.9 kb (8) or 2.7 kb (17) and synthetic oligonucleotides were used for further screening and isolation of additional clones for identification by Southern blot analysis. A human leukocyte library (18) and a lung fibroblast library (Stratagene) were also screened employing the appropriate 5′ or 3′ region of the cDNA (8, 17) to obtain genomic clones containing the 5′ or 3′ portion of the gene coding for plasminogen. Two more clones obtained from the human fibroblast library (19) and human leukocyte library contained nucleotide sequences that apparently corresponded to the 5′ and 3′ regions of the cDNAs. Preliminary sequence analysis of these clones indicated, however, that they originated from a plasminogen-related gene(s), since there were a number of nucleotide changes and in-frame stop codons in the apparent exons from both the 5′ and 3′ regions when their sequences were compared with the plasminogen cDNA sequence (8, 9, 17).

The 5′ and 3′ Portions of the Gene for Plasminogen-To obtain the correct 5′ and 3′ portions of the gene coding for plasminogen, leukocyte DNA from normal individuals was amplified by the polymerase chain reaction employing oligonucleotide primers (Table I). The sequence of these primers was matched to the appropriate regions of the cDNA (8, 9, 17). Primers were also prepared from the 5′ and 3′ ends of the existing clones (λ1-λ3). In addition, the genomic sequences of the plasminogen-related gene(s) were utilized to design primers for the amplification of the 5′- and 3′-flanking regions. Altogether, 13 overlapping DNA fragments were prepared from the 5′ end (PCR 1-13) and three fragments from the 3′ end (PCR 14-16) of the gene by the polymerase chain reaction (Fig. 1). These fragments covered approximately 14.5 and 3.5 kb of genomic DNA from the 5′ and 3′ portions of the gene, respectively (Fig. 1). DNA sequence analysis of these fragments revealed that the 5′ portion contained the genomic sequence coding from the signal peptide to the first half of kringle 2 (including exons I-VI), and the 3′ portion coding for exons XVIII and XIX. Two additional λ phage clones (λ4 and λ5) were then isolated by rescreening of the genomic library of Lawn et al. (18) using the 5′ portion of the cDNA as a probe, and one more clone (λ6) was obtained by screening a human lung fibroblast library using the 3′ end of the cDNA (Fig. 1). The restriction digestion and mapping of the genomic inserts in these clones with endonuclease EcoRI was consistent with that obtained with the DNA amplified by the polymerase chain reaction. DNA sequence analysis also confirmed that the nucleotide sequences of these genomic clones were identical with those obtained from the amplified DNA generated by the polymerase chain reaction.

FIG. 3. Amino acid sequence for human plasminogen and the location of the 18 introns in the gene coding for plasminogen. The amino acid residues are numbered starting with the amino-terminal glutamic acid residue as number 1 and ending with residue 791. The positions of the introns (A-R) are indicated by solid arrows at or between specific amino acids. The signal peptide (shown in a box with negative numbers) contains 19 amino acids and is cleaved by signal peptidase at the Gly-Glu peptide bond. The preactivation peptide (PAP) is generated primarily by the cleavage between Lys-77 and Lys-78 (shown with an open straight arrow) by plasmin. The conversion of plasminogen to plasmin occurs by the cleavage between Arg-561 and Val-562 (shown by an open curved arrow). K1-K5 refer to kringles 1-5 in the A chain. Carbohydrate attachment sites (Asn-289, Thr-346) are shown by diamonds, while the active site His, Asp, and Ser residues in the B chain are circled.

Nucleotide Sequences of the Exons and Intron-Exon Boundaries-The DNA sequence of 7853 nucleotides coding for human plasminogen and the flanking regions of the gene is shown in Fig. 2. This sequence extended about 960 base pairs upstream from the cytosine which was arbitrarily labeled as nucleotide 1. Comparison of the DNA sequence of the gene with the cDNA sequence (8, 9, 17) indicated that the gene consisted of 19 exons (I-XIX) interrupted by 18 introns (A-R) (Fig. 1). The overlapping clones spanned about 52.5 kb. Thus, the gene for plasminogen is the largest of the known genes for serine proteases involved in blood coagulation and fibrinolysis (31). The exons varied in size ranging from 75 to 387 nucleotides. The average size of the 19 exons was 146 bp, which is similar to the average size of 150 bp found in other eukaryotic genes (30). The first exon contained the 5′-noncoding region and coded for a typical signal peptide including a hydrophobic core. Exon XIX was the largest of the 19 exons and included the coding region for the active site Ser, the COOH terminus of the protein and the 3′-noncoding region of the gene. The sequence of all the intron-exon splice junctions (Table II) agreed with the GT-AG rule of Breathnach et al. (27) and with the consensus sequence of Mount (28). Eight of the splice junctions were type I (introns A, C, E, G, I, K, M, and Q), eight were type II (introns B, D, F, H, J, N, O, and P), and two were type 0 (introns L and R) (29).

Nucleotide Sequences of the 5′- and 3′-Flanking Regions-The DNA sequence analysis revealed that the 5′-flanking region of the gene for plasminogen contained two clusters of regulatory elements for transcription (32), including forward and reverse “CCAAT” boxes and “TATAA” sequences (Fig. 2). However, no “GC” boxes were present. At present, it is not known whether or not either of these sequences functions as a promoter element. Two sequence elements of CTGGGA common to acute-phase reactant genes (33, 34) were found in the 5′-flanking region of the gene for plasminogen. Sequences in the 3′-flanking region are also thought to play a role in polyadenylation and mRNA processing.
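The search for “forward and reverse” promoter elements described above amounts to scanning one strand for a motif and for its reverse complement. The sketch below illustrates that idea only. The demonstration sequence is made up and is not taken from Fig. 2, the helper names are hypothetical, and GGGCGG is used as an assumed stand-in for a “GC” box, since the text does not give the pattern the authors searched for.

```python
# Translation table used to complement a DNA string.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_motif(seq, motif):
    """Return sorted (0-based position, orientation) hits on the given strand.

    A 'reverse' hit means the reverse complement of the motif was found,
    i.e. the motif itself lies on the opposite strand.
    """
    hits = []
    patterns = (("forward", motif), ("reverse", reverse_complement(motif)))
    for orientation, pattern in patterns:
        start = seq.find(pattern)
        while start != -1:
            hits.append((start, orientation))
            start = seq.find(pattern, start + 1)
    return sorted(hits)


# Made-up sequence for illustration only; it is NOT part of the plasminogen gene.
demo = "GGCCAATCCTATAAGGATTGGCCTGGGATT"

for motif in ("CCAAT", "TATAA", "CTGGGA", "GGGCGG"):
    print(motif, find_motif(demo, motif))
```

A palindromic motif would be reported twice at the same position by this sketch; none of the four motifs used here is palindromic.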
A potential CAYTG signal (35) was identified 13 bp downstream from the conserved AATAAA sequence. This sequence was identical to the consensus sequence in four of the five nucleotide positions. An alternative polyadenylation site for the cDNA reported by Forsgren et al. (9) was found 31 nucleotides downstream from the first polyadenylation site. Potential CAYTG signals for this cDNA were found two nucleotides upstream and ten nucleotides downstream from the second polyadenylation site. The third polyadenylation site for the cDNA obtained from a human Hep G2 library that contained an extra 750 nucleotides of 3′-noncoding DNA was also identified in the gene. A potential CAYTG signal for this cDNA was present 26 bp downstream from the alternative poly(A) site. A consensus sequence of YGTGTTYY, which is required for efficient formation of the 3′ terminus of mRNA (36), however, was not present within 50 nucleotides downstream from the first AATAAA sequence. This consensus sequence, however, was present 32 bp downstream from the AATAAA sequence for the third polyadenylation site.

Organization of the Gene for Plasminogen-Intron A was located between the nucleotide sequence coding for the signal sequence and the first half of the preactivation peptide, while the second intron (intron B) in the gene for plasminogen was located in the middle of the preactivation peptide (Fig. 3). Each of the five kringles was coded by two separate exons with a single intron inserted in the middle of each structure. The splice junctions of the introns within the kringles were usually type II, while the introns between the kringles were type I (Table II). These results are consistent with the concept that the kringle-containing proteins as well as other proteins with specific domains have evolved in part by gene duplication and exon shuffling (41), and this shuffling occurs primarily at type I intron-exon splice junction boundaries. This was the same pattern as that found in other genes containing one or more kringle structures, including tPA, urokinase, factor XII, and the first kringle in prothrombin (Fig. 4) (37-40). The second kringle in prothrombin, however, was encoded by a single exon (40), suggesting that the internal intron in this kringle may have been lost during evolution.

FIG. 4. Location of the introns in the genes for five kringle-containing proteins. Solid arrows indicate the location of the introns in plasminogen, tPA (tissue-type plasminogen activator), urokinase-type plasminogen activator (uPA), factor XII, and prothrombin. Data are taken from the following references: tPA (37), urokinase-type plasminogen activator (38), factor XII (39), and prothrombin (40). Gla, γ-carboxyglutamic acid; EGF, epidermal growth factor; PAP, preactivation peptide.

TABLE II
Nucleotide sequence at the splice junctions and size of exons
Interrupted codons by introns are underlined. The exact size of exon I (shown in parentheses) is not known. PAP1, amino-terminal half of the preactivation peptide; PAP2, carboxyl-terminal half of the preactivation peptide; K1a-K5a, amino-terminal half of the kringle; K1b-K5b, carboxyl-terminal half of the kringle; Act, activation cleavage site; Cnt, connecting region to the A chain; Loop, disulfide loop prior to the active-site serine. Junction types are those of Sharp (29); the consensus sequence is that of Mount (28).
[The exon and intron boundary sequence columns are not recoverable from this copy.]

Exon    Region coded           Size (bp)  Intron  Junction type
I       5′-Noncoding           (169)      A       I
II      PAP1                   136        B       II
III     PAP2                   107        C       I
IV      K1a                    115        D       II
V       K1b                    140        E       I
VI      K2a                    121        F       II
VII     K2b                    119        G       I
VIII    K3a                    163        H       II
IX      K3b                    146        I       I
X       K4a                    160        J       II
XI      K4b                    182        K       I
XII     K5a                    149        L       0
XIII    K5b                    94         M       I
XIV     Act                    121        N       II
XV      His                    75         O       II
XVI     Asp                    141        P       II
XVII    Cnt                    107        Q       I
XVIII   Loop                   146        R       0
XIX     Ser + 3′-noncoding     387
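The exon sizes in Table II are enough to reproduce both the size statistics quoted in the text and the junction types of introns B-R, given the type of intron A. The sketch below assumes the usual convention that the junction type equals the number of nucleotides of a codon that lie before the intron (type 0 between codons, type I after the first nucleotide, type II after the second); the variable and function names are illustrative.

```python
# Exon sizes in bp from Table II; the size of exon I (169) is approximate.
EXON_SIZES = [169, 136, 107, 115, 140, 121, 119, 163, 146, 160,
              182, 149, 94, 121, 75, 141, 107, 146, 387]
INTRONS = "ABCDEFGHIJKLMNOPQR"
TYPE_NAMES = {0: "0", 1: "I", 2: "II"}

# Summary statistics quoted in the text.
mean_size = sum(EXON_SIZES) / len(EXON_SIZES)
print(f"exons: {len(EXON_SIZES)}, total {sum(EXON_SIZES)} bp, "
      f"range {min(EXON_SIZES)}-{max(EXON_SIZES)} bp, mean {mean_size:.0f} bp")


def junction_phases(first_phase, internal_exon_sizes):
    """Propagate the reading-frame phase across successive internal exons.

    Assumed convention: the phase is the number of codon nucleotides that
    precede the intron (0 = type 0, 1 = type I, 2 = type II).
    """
    phases = [first_phase]
    for size in internal_exon_sizes:
        phases.append((phases[-1] + size) % 3)
    return phases


# Intron A is type I (Table II); exons II-XVIII lie between introns A and R.
phases = junction_phases(1, EXON_SIZES[1:18])
types = {name: TYPE_NAMES[phase] for name, phase in zip(INTRONS, phases)}
print(types)
```

Under these assumptions the propagated types give eight type I, eight type II, and two type 0 junctions (introns L and R), in agreement with the counts reported in the text, and the mean exon size rounds to 146 bp.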
These results also support the conclusion that the kringle-containing proteins are a family of proteins that have evolved from one or more common ancestral genes. The gene organization for the light chain of plasminogen was also similar to that of other serine proteases, especially tPA, urokinase, and factor XII (37-39), although the positions of the introns relative to the amino acid sequence were slightly different.

The gene coding for plasminogen is also closely related to that of apolipoprotein(a), as previously discussed by McLean et al. (10) and Tomlinson et al. (11). Indeed, apolipoprotein(a) may have evolved via exon shuffling (41) and the deletion of exons II-IX in the plasminogen gene, followed by a recombination event that linked the signal peptide precisely at intron A to kringle 4 precisely at intron I (Figs. 5 and 6). Intron A, exons II-IX, and intron I were inserted between serine at position -4 and alanine at position 347 in the gene for plasminogen. The serine at position -4 and alanine at position -3 are adjacent to each other in the cDNA for apolipoprotein(a) (10). An intron may be present between these 2 amino acid residues in the gene for apolipoprotein(a). Additional evolution of the plasminogen gene would involve multiple duplications of exons X and XI coding for plasminogen kringle 4 generating up to 37 kringles present in apolipoprotein(a) as well as a number of small insertions and deletions. Alternatively, exons II-IX in the plasminogen gene or portions of this DNA may be present in the gene for apolipoprotein(a) as a large intron, and this intron sequence is removed during the processing of the apolipoprotein(a) mRNA.

An exon coding for apolipoprotein(a) (or a very closely related gene) from the region that included the potential active site Asp residue was also amplified by the polymerase chain reaction during these studies and identified by preliminary sequence analysis. The intron/exon boundaries for this genomic DNA fragment were found to occur exactly in the same positions as those of exon XVI in the gene for plasminogen corresponding to amino acids 608 and 654 in the plasminogen polypeptide chain.

The data shown in Fig. 2 are consistent with the concept that the gene described in the present study is the one that expresses plasminogen, since the nucleotide sequence shown in Fig. 2 matches the cDNA prepared from human liver or Hep G2 cells. However, other very closely related genes have been identified during these studies and portions have been subjected to preliminary DNA sequence analysis. One of the related genes differed from the plasminogen cDNA in that it contained in-frame stop codons in the apparent exons, as well as a number of nucleotide substitutions, suggesting that it is a pseudogene. These genes are all apparently localized on chromosome 6, band q26-27, because the results obtained employing the part of the gene for plasminogen containing exon X or XIV (42), the 3′ end of the cDNA for apolipoprotein(a), and a cDNA fragment containing kringles 1-3 of plasminogen (43) are in agreement. Results obtained from linkage studies also support this conclusion (44, 45).

FIG. 5. Comparison of the structures for human plasminogen and apolipoprotein(a). Solid arrows indicate the location of introns in the gene coding for plasminogen, and the open arrows indicate those predicted in apolipoprotein(a) by homology in the amino acid and cDNA sequences of the two proteins. PAP, preactivation peptide.

FIG. 6. An alignment of portions of the gene for plasminogen (PLG) and the cDNA for apolipoprotein(a).

Apparent Polymorphism in the Gene for Plasminogen-A number of minor differences were found when the DNA sequence of the gene for human plasminogen was compared with that of the cDNAs isolated in different laboratories (8, 9, 17) (Table III).

TABLE III
Apparent polymorphisms of the amino acid residues in plasminogen as deduced from the genomic sequence and cDNA sequence (8, 9, 17), as well as the peptide sequence (5-7)
[The rows to which the MaeII and XmnI entries of the RFLP column belong are not recoverable from this copy; one base in the V-272 row is illegible and shown as "?".]

Protein   cDNA(a) (1.9 kb)  cDNA(b) (2.7 kb)  cDNA(c)  Gene     RFLP(d)
E-53                        CAA(Q)            CAA(Q)   CAA(Q)
D-88                        AAT               AAT      AAT
N-91                        AAC               AAC      AAT
C-238                       TGT               TGC      TGC
V-272                       GT?               GTT      GTG
F-295     TTC               TTC               TTT      TTC
E-342     CAA(Q)            CAA(Q)            CAA(Q)   CAA(Q)
N-453     GAT(D)            GAT(D)            GAT(D)   AAT(N)
V-563     GTA               GTA               GTG      GTA
G-743     GGT               GGT               GGT      GGG      AvaII-HaeIII
3′-NC45   GGAAC             GGAAC             GGGAC    GGGAC
3′-NC49   CGAGG             CGTGG             CGTGG    CGTGG

(a) From Malinowski et al. (8).
(b) E. Mulvihill and M. Martzen, unpublished data.
(c) From Forsgren et al. (9).
(d) Restriction fragment length polymorphism.
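Whether a nucleotide difference in Table III changes the encoded residue can be checked by translating the codons reported for each position. The sketch below does this for five rows whose codons are clearly legible in Table III; the codon assignments come from the standard genetic code, and the helper name `classify` is illustrative.

```python
# Minimal codon table covering only the codons used in this example
# (standard genetic code).
CODON_TABLE = {
    "TGT": "C", "TGC": "C",
    "TTC": "F", "TTT": "F",
    "GAT": "D", "AAT": "N",
    "GTA": "V", "GTG": "V",
    "GGT": "G", "GGG": "G",
}

# Codons read from Table III: position -> codons found in the cDNAs and the gene.
OBSERVED = {
    "C-238": ["TGT", "TGC", "TGC"],
    "F-295": ["TTC", "TTC", "TTT", "TTC"],
    "N-453": ["GAT", "GAT", "GAT", "AAT"],
    "V-563": ["GTA", "GTA", "GTG", "GTA"],
    "G-743": ["GGT", "GGT", "GGT", "GGG"],
}


def classify(codons):
    """Return 'identical', 'silent', or 'replacement' for a list of codons."""
    if len(set(codons)) == 1:
        return "identical"
    residues = {CODON_TABLE[codon] for codon in codons}
    return "silent" if len(residues) == 1 else "replacement"


for position, codons in OBSERVED.items():
    print(position, sorted(set(codons)), classify(codons))
```

Of the rows used here, only position 453 (GAT, Asp, in the cDNAs versus AAT, Asn, in the gene) changes the amino acid; the differences at positions 238, 295, 563, and 743 are silent.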
Higuchi. Yoshitake for kindly providing genomic libraries. G.. R. Schwartz. J. (1989) T/work Hhekstas. Note Added in Proof-The primary structure of an additional member of the plasminogen gene family (human hepatocyte growth factor) containing four kringles has been reported (Nakamura.4670-4674 16. Acad. however.149-152 Frank.. J. P. Gannon. M. M. 32... W.. Boast. Gersdorf.. and Man&is. S... Chem.. E..871-881 51. Whitton. (1982) Nucleic Acids Res. 254. 440-443 and Miyazawa. and Yang. T. 79. M. N. (1979) J.. Dykes. R. G. Tomlinson. denet.. Malinowski. 6i. B.5463-5467 23. Swenson for help in the preparation of the manuscript. Degen.. and L..3963-$965 -’ 24.. 50. L.. W. This substitution would be in addition to the well known differences in carbohydrate in plasminogen that also leads to changes in electrophoretic mobility (51).. A. W. Tateno. F. Sci. (1987) Nature 330. (1985) Nucleic Acids Res. R. (1981) Prog.___ Weitkamp. R. Saito. 40. Gibson. T. D. L. Maniatis and S.. S. R. E.. (1981) Biochim. R. (1986) Proc. (1984) Electrophoresis 4. and Enquist... Veerman. E. Nicklen. V. S. Hirsh.338-350 46. Sugimura.. E. A. C. A. H. (1988) Hum. F. P. (1987) J. Comeau. D._ _ Y ” 79. Sparkes.. Fowlkes. A.... M. E. C!. 42. A. 78.. and Shimizu.8772-8776 . Tomlinson. 383 33. H. T. x2-?. H.. M.. R. G. M. pp. and Magnusson.. J. Hobart. S.489-494 6. Sci. Marder. BreathnaiL. K. 167-181 Enzymol.. A. K. A.) 18. J. in the substitution of amino acids that are different from those determined by amino acid sequence analysis of the plasma protein (Table III). F. Biochem.. W. (1984) Proc. B. S. Nakayama. R. Y. von Zonnevelt. Murrav. C.. and Erlich. A. 41. and these DNAs have an additional Hue111 site which does not exist in other DNAs. Rlden. M.. Hagiya. Hornunn. W.‘Lond.. Sakata. Five of the nucleotide substitutions occurring in the coding region had no influence on the amino acid sequence... C. and Fujikawa. (1978) Prog. 
Ile-67 located in the second half of the preactivation peptide region encoded by the three nucleotides (CAA) in the genomic DNA sequence and the cDNA (9) was not identified by amino acid sequence analysis (5). Shinmyozu. Rev. J.. Duba. 4243-4250 9. P. Menzel. 5355-5369 Riccio. 44... G. Larsson. Lerch. J. some of the differences.. Biophys. MacGillivray. Forsgren. 3-13 2. 39. (1984) Nature 309. Rijken.254-260 10.. Acknowledgments-We thank Drs. L. A. G. Kera. L. U. Commun. K. 75. H. G. R.. and Powell. A. S.-O... K. Sottrup-Jensen. K. S. C. Karam. and Blasi. K. R. 17) (Table III). 13... Natl. A. Giblett. 163169 F. U. (1983) Proc. Genet. (1985) Nucleic Acids Res.. Gelfand. and Davie. Motulsky. B. Seki.. and Davie. Nelson. as was the case for two changes in the 3'-noncoding region located 45 and 49 bp downstream from the stop codon in the cDNA (9).179-182 McLauchlan. F. H. I. Lergier.. E. Saiki. Chem. Several differences in the genomic DNA sequence and the cDNAs did result. M. J.. pp.. and Gillessen. Chem. and Schultz. N. D... H. (1987) in Hemostasis and Thrombosis (Colman. (1980) Eur. S.377-387 13. A.. O'Hare.. 78. and Asp and Asn (residue 88) sometimes were difficult to differentiate when phenylthiohydantoin derivatives were separated by two-dimensional chromatography. C. Degen. H. J. 5. S. Takio. W. Buetow. (1989) Biochem.419-423 47. Sci.. A. Nishimukai. J. J.. S. D.. Blake. (1983) Biochemistry S... Biophys. Petersen.. P. Espling for technical assistance. S. Takamatsu. and Utermann. Parker. J. 132. M. Robbins.5759-5763 27. Y. N. 3. G. (1977) Eur. and Yamasawa. S. 12. Natl. 967-973). Solowiejczyk.. P. and Lund.. J.. E. (1987) Am... R. S. McLean. E. Dr. T. (1982) J. J. Acad. an AvaII site in exon XIX (at Gly-743) was not present in some of the genomic DNAs. Zajdel. and Shows. k. Ichinose. 3. U. G. R.. Stoffel. Biol. Hirono. T. Biol. E. Okigaki. J. W. 13. K. Nishizawa. K. H. Naka. (1978) Cell 15. J.. and MacGillivray. D.. Genet.. S. 
(1981) Cell 23.. Lijnen. (1988) Hum. R. E. and Coulson. Davie. Klisak. 365-378 4.. S. Ballantine. Mohandas. 257. Silhavy. T. J.. E. (1979) Ann. S. G.. Maruyama. W.. K. R. Poncz. L. and Sakata...) 49.. K. Korinek. Lippincott. Using the isoelectric focusing technique. D. S. Blake.-J. and Chambon. B. PA 32.. A. 3. G.. G. Takahashi. (1981) Proc. and Polesky. S. Y. D. Arakaki.. D. E. D. J. E. The substitution of a charged residue. Some of these differences may be due to amino acid sequencing artifacts. Also. (1981) Annu. Shimonishi. K. Sadler. E. E.8710-8714 Adrian...459-472 29. S. J. T. 40.. T. Wiman. (1980) Am. 136621.. (1986) J. We also wish to thank Dr. K. Wiman. M. and Surrey. D. U.. and Hone. C. F. 264. (1989) Nature 342.3736-3750 20. M... A. Petersen. E. R. Sci. Res. G. Ichinose. D. 11 (abstr. 43. Y. Anderson.5957-5965 J. H. (1988) Science 239. and Davie. G. L. J. Acad. Natl. A. I.. Genet.. Lawn. C. J.. E. M. Tsubouchi. (1981) Methods 3. A. Fibrinol. J. U. 262. T. F. Sci. and Pannekoek. 107. Biol.. 48. W.. M. Gohda.. (1982) Proc. P.. Philadelphia. E. Mullis.... (1989) Hum.. R. (1984) Proc.. D. Rickli. McLean.... Disteche. Swisshelm. 36. Clemmensen. W. U. such as Asn-453. In addition. C. Cleve. Aoki. J. D. 163. Bowman. T.. P. several variant alleles for plasminogen have been reported in different populations (46-50). H. and Kurachi. 40..2087-2097 22. M. (1989) J. Scanu. H.. O. Biochem. A. Felss. Thrombol. K. Natl. and Lawn.5bl Lindahl. L. W.. 129-137 7... Mount.. Humphries. Fritsch. W. Marcus.. J. Scharf. Thorsen. and Lusis. Fibrinol. E. 2912-2919 5. These apparent restriction fragment-length polymorphisms might be helpful in studying various normal and abnormal genes. and Collen. Biochem..643-646 30. 79. S. Elgh. Espling.. M. J. REFERENCES 1. 34. W. L. may contribute in part to the differential electrophoretic mobility of the gene products and to the heterogeneity of plasminogen. D. J. Biol. T.. S. and Alper. E. F..7-13 14. 
B.. Verde. J. Sebastio.3.. E. C. J. Petersen. B.1347-1368 Ny. Hoylaerts. J.167-175 Berget.. E. 38.6165-6177 Gilbert.. Natl. Daikuhara.. Acta 668.4853-4857 28. S. T. 50. R. A. Chem. M.. L. Natl. W. Donovan. Sci. and Wallén. and Magnusson. (1978) Nature 271. 9.. Hum. J. Genet. Cold Spring Harbor.. Invest. W. S. d. 82. Sci. S. W. F.422-425 49. McLean. 76. (1975) Eur. R. and Davie. (1983) Nature 306. Sadler. 37. 213. Sottrup-Jensen. E.. Breathnach. M. Acad. Kuang. A. M. (1984) Biochem. Martzen.. 81. I. Sadler for assistance in the initial screening of the genomic library. Acad. R... J... 10. J.4298-4302 25. Benoist. 140-141. G. H. Mullis. W. W. D.487-491 26. E. (1984) Experiments with Gene Fusions. 35.. 11.. such as Asp-453 for the uncharged Asn residue (Table III). A. E.. K. Israelsson. 349- 22. T... B. Lawn.... Claeys. Acad. K.. L. B.) 17. Tashiro. S. T. Glu and Gln (residues 53 and 342). resulting in amino acid changes.. P.. (1988) Fibrinolysis 2.1157-1174 19. Chem.. H. For instance. U. H. Raum. 15. Davie. S. C. E. S. Gaffney. and Davie.. eds) 2nd Ed. F. F. Genet. 495 (abstr.535-537 31. E.. H. B. (1986) Gene (Amst. S.. Sanger. n 9. 8d. Natl.. Yoshitake. Acad. M. Hum. 83. E. A. Acad. Chem.. NY 21. and Lawn. 22. Biochem. T. while other genomic DNAs lacked these sites. M. (1978) Proc. M.. R. J. Dyer. and Clements. M. H.80-82 45. Berman. 42. (1985) Biochemistry 24. Chung for helpful discussions. Cold Spring Harbor Laboratory. I. and Hedén. 191-209 8. (1984) Biochemistry 23. Eaton. Some of the nucleotide substitutions described above were also confirmed by restriction digestion of amplified genomic DNAs from normal individuals showing that apparent polymorphisms exist in the gene for plasminogen. and Rutter. Sci. Castellino. Eddy. Tomlinson. 74. 417-420 50. (1978) Prog... J.. M. Bell. F. Thrombol. R. K. ic. 81. M. W. Chen. D. 1. (1977) Proc.. J. 80. Clin. 
A. Wallén.. Sakiyama. and Castellino. M. Martzen..2759-2771 Cool. 242-267. Takahashi. Horn. 5'. and Crabtree. Biaein. Hum. Hayes. N.. (1981) Vox Sang. and Dr. Ichinose. Guttormsen. Alternatively. (1987) Biochemistry 26. and Kitamura. Hagen for a Hep G2 cDNA library. T. Sharp. Natl. Grimaldi. Genet. A.. G. may be the result of polymorphisms in the normal human population. A MaeII site in exon VII (at Cys-238) and a XmnI site in exon VIII (at Phe-295) were found in some of the amplified genomic DNAs. J. D. B. Fibrinol. Foster.. K. and Salzman. Schach.. (1987) FEBS Lett. T. and Chambon.