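Genomics Course Notes
1. What are SNPs and SSLPs?
SNP: single nucleotide polymorphism, a polymorphism in a nucleic acid sequence caused by a change of a single nucleotide at an allelic position in the genome.
SSLP: simple sequence length polymorphism, an array of repeat sequences that varies in length; includes satellite DNA, minisatellites and microsatellites (STRs).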
2. Summary of key points:
I. Introduction to genomes
1, Gene: A DNA segment containing biological information and hence coding for an RNA and/or polypeptide molecule.
Genome: The entire genetic complement of a living organism.
- Prokaryote
- Eukaryote: nuclear genome + organelle (chloroplast, mitochondrion) genome
2, Transcriptome: Coding RNA; the product of genome expression
3, Proteome: All the proteins present in a cell at a particular time.
The proteome consists of all the proteins made from the transcriptome.
4, Development and current state of genomics research
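II. Genome mapping
What is the experimental basis of genetic mapping? Linkage analysis.
1, Purpose of genome mapping: the shotgun method has difficulty with large DNA molecules that contain repeats. ① Shotgun sequencing breaks the DNA into fragments, sequences them and then reassembles the reads; this is hard for large genomes, the human genome in particular, because the data analysis becomes more and more complex as the number of fragments grows. ② Errors arise in repetitive regions: part of a repeat region may be left out, or two fragments from the same chromosome or from different chromosomes may be joined incorrectly. In short, a map should be built before sequencing; by marking the positions of genes and other distinctive features it guides the sequencing.
2, Types of genome map: genetic maps and physical maps
3, Genetic map: a map built with genetic techniques that shows the positions of genes and other sequence features in the genome. The genetic techniques include cross-breeding experiments. Linkage analysis is the basis of genetic mapping.
4, Physical map:
5, Comparison of genetic and physical maps: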
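Genetic mapping, which produces a linkage map
Mapping method: linkage analysis, including cross-breeding experiments and pedigree analysis. The relative distances between markers are calculated from the results of genetic experiments.
Markers: traits, genes or DNA markers.
Map unit: the centimorgan (cM); 1 cM corresponds to a recombination frequency of 1%.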
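The definition of the centimorgan can be turned into a small calculation. The sketch below assumes a simple test cross in which every offspring can be scored as parental or recombinant for two markers; the function name and the counts are hypothetical, and no correction for multiple crossovers is applied.

```python
def map_distance_cm(recombinant: int, total: int) -> float:
    """Map distance in centimorgans from offspring counts.

    Uses the definition 1 cM = 1% recombination frequency; no mapping
    function (e.g. for double crossovers) is applied in this sketch.
    """
    if total <= 0 or not 0 <= recombinant <= total:
        raise ValueError("need total > 0 and 0 <= recombinant <= total")
    return 100.0 * recombinant / total


# Hypothetical test cross: 170 recombinant offspring out of 1000 scored
print(map_distance_cm(170, 1000))  # 17.0
```

Because double crossovers go unseen, the raw frequency underestimates long distances, which is one source of the limited accuracy of genetic maps.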
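Physical mapping
Mapping method: molecular biology techniques measure the absolute distance between markers and place DNA markers, genes or clones directly at their actual positions in the genome.
Map unit: depends on the method. Radiation hybrid mapping uses the centiRay (cR); restriction mapping and clone-based mapping use the length of the DNA in base pairs (bp).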
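6, DNA markers used in genetic mapping:
① Restriction fragment length polymorphism (RFLP): typed by Southern hybridization or by PCR;
② Simple sequence length polymorphism (SSLP): includes satellite DNA, minisatellite DNA and microsatellite DNA (also called STRs).
Microsatellites are better markers than minisatellites for two reasons: minisatellites are not evenly distributed and are often found in the telomeric regions at chromosome ends; and microsatellites are easier to type by PCR and are more polymorphic.
They are usually typed by PCR combined with capillary electrophoresis;
③ Single nucleotide polymorphism (SNP): the most densely spaced DNA marker. Most SNPs are biallelic. SNPs are mostly studied by oligonucleotide hybridization analysis. Screening strategies include DNA chips and solution hybridization (fluorescence quenching).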
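7, Satellite DNA: a class of highly repetitive sequence. When DNA is centrifuged in a caesium chloride density gradient (at speeds of up to tens of thousands of revolutions per minute), each DNA fragment settles where the density of the caesium chloride equals its own buoyant density, which depends on base composition (GC content) rather than on fragment size. Viewed from outside the tube, the DNA forms separate bands. Analysis of the band intensities shows one main band plus one or more small satellite bands. The DNA in the satellite bands is called satellite DNA; its GC content is generally lower than that of the main-band DNA, so its buoyant density is lower as well.
Minisatellite DNA, also called variable number of tandem repeats (VNTR): the repeat unit is a few tens of nucleotides long; usually located at telomeres and in subtelomeric regions.
Microsatellite DNA, or simple tandem repeats (STR): the repeat unit is 1-4 nucleotides long, arrays consist of 10-50 tandem copies, and they are scattered across the whole genome.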
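The microsatellite definition (units of 1-4 nucleotides, 10 or more tandem copies) is concrete enough to search for in a sequence. The following is a minimal sketch using a regular expression with a backreference; the function name, the thresholds as default parameters and the test sequence are illustrative assumptions, not part of the course material.

```python
import re


def find_microsatellites(seq: str, min_repeats: int = 10, max_unit: int = 4):
    """Return (start, repeat_unit, copies) for each perfect tandem repeat.

    The lazy quantifier makes the regex try the shortest unit first, so
    'AAAA...' is reported with unit 'A' rather than 'AA'.
    """
    pattern = re.compile(
        r"([ACGT]{1,%d}?)\1{%d,}" % (max_unit, min_repeats - 1)
    )
    hits = []
    for m in pattern.finditer(seq.upper()):
        unit = m.group(1)
        hits.append((m.start(), unit, len(m.group(0)) // len(unit)))
    return hits


# Hypothetical sequence with a (CA)12 microsatellite
print(find_microsatellites("GGT" + "CA" * 12 + "TTG"))  # [(3, 'CA', 12)]
```

Only perfect repeats are found; real STR typing tolerates interruptions and compares array lengths between individuals, which this sketch does not attempt.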
8, Linkage analysis in different model organisms:
Fruit flies, mice and similar species: planned breeding experiments;
Humans: pedigree analysis;
Bacteria, which do not undergo meiosis: conjugation, transduction and transformation.
9, Deficiencies of genetic maps:
- Limited resolution
- Limited accuracy
  - Recombination hot spots
  - Differences in crossover frequency between the sexes
  - Multiple crossovers between two loci
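III. Physical maps
Questions: How does restriction mapping differ from RFLP? What role does FISH play in physical mapping?
Restriction mapping is a physical mapping method; it gives the physical distance (kb) between two restriction sites. An RFLP is a polymorphic marker in the genomes of individuals: a base change within a restriction site produces restriction fragments of different lengths. Linkage analysis of how often these markers recombine between parents and offspring gives the genetic distance between RFLPs. FISH hybridizes fluorescently labelled DNA fragments to chromosomes so that the positions of, and physical distances between, different DNA fragments on the chromosome can be observed.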
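The link between a restriction map and an RFLP can be illustrated with an in-silico digest. This is a sketch under stated assumptions: the enzyme is EcoRI (site GAATTC, cutting after the first G), the two alleles are invented sequences, and the function name is hypothetical.

```python
def digest(seq: str, site: str = "GAATTC", cut_offset: int = 1):
    """Cut seq at every occurrence of site and return the fragment lengths.

    cut_offset is the position of the cut within the site (1 for EcoRI,
    which cuts G^AATTC).
    """
    cuts = []
    pos = seq.find(site)
    while pos != -1:
        cuts.append(pos + cut_offset)
        pos = seq.find(site, pos + 1)
    bounds = [0] + cuts + [len(seq)]
    return [right - left for left, right in zip(bounds, bounds[1:])]


# Two hypothetical alleles; allele_b has a point mutation in the second site
allele_a = "A" * 300 + "GAATTC" + "C" * 200 + "GAATTC" + "T" * 100
allele_b = "A" * 300 + "GAATTC" + "C" * 200 + "GAGTTC" + "T" * 100

print(digest(allele_a))  # [301, 206, 105]
print(digest(allele_b))  # [301, 311]
```

The fragment lengths of one allele are the restriction map (physical distances); the difference between the two alleles is the polymorphism that is followed through crosses or pedigrees as an RFLP marker.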
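1, Physical map: identifies the exact location of a DNA sequence in the genome
2, Principles (types) of physical mapping:
Principles for physical mapping (p88):
- The earliest physical map: the cytogenetic map (10 Mb)
- Restriction mapping: restriction map (kb). Its scale is limited by the size of the restriction fragments; methods are electrophoresis and optical mapping (gel stretching and molecular combing);
- STS mapping: any unique DNA sequence can serve as an STS.
  Sources of STSs: expressed sequence tags (ESTs), SSLPs and random genomic sequences
  - Clone-based mapping (flow cytometry can be used, e.g. to sort individual chromosomes)
  - RH (radiation hybrid) mapping (1 Mb)
- FISH (fluorescent in situ hybridization)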
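3, Cloning vectors for large fragments in physical mapping:
- Plasmid: 10 kb
- λ phage: 15 kb
- Cosmid: 50 kb
- P1 phage: up to 125 kb
- PAC (P1-derived artificial chromosome): up to 300 kb
- YAC (yeast artificial chromosome): 200~2000 kb, e.g. a human genome YAC library of 32,000 clones with 1 Mb inserts.
- BAC (bacterial artificial chromosome): 100~300 kb, e.g. 300,000 clones with 300 kb inserts cover the human genome about 30-fold.
BAC is the standard large-insert cloning vector used throughout the HGP.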
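The "30-fold" figure for the BAC library follows from simple arithmetic: number of clones times insert size, divided by genome size. The sketch below assumes a human genome of 3.0 × 10^9 bp (a round value chosen for illustration, not stated in the notes); the function name is hypothetical.

```python
def library_coverage(n_clones: int, insert_bp: float, genome_bp: float) -> float:
    """Fold coverage of a clone library: total cloned DNA / genome size."""
    if genome_bp <= 0:
        raise ValueError("genome size must be positive")
    return n_clones * insert_bp / genome_bp


HUMAN_GENOME_BP = 3.0e9  # assumed round figure

# BAC library from the list above: 300,000 clones with 300 kb inserts
print(library_coverage(300_000, 300_000, HUMAN_GENOME_BP))  # 30.0

# YAC library from the list above: 32,000 clones with 1 Mb inserts
print(round(library_coverage(32_000, 1_000_000, HUMAN_GENOME_BP), 1))  # 10.7
```

Coverage well above 1 is needed because clones are sampled at random, so some regions are hit many times while others would otherwise be missed.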
4, Restriction mapping procedure: extract DNA → cut with a rare-cutter restriction enzyme → separate and identify the DNA fragments
(optical methods: gel stretching and molecular combing)
5, Fluorescent in situ hybridization (FISH): (P94)
The position at which the probe hybridizes to the chromosomal DNA is visualized by detecting the fluorescent signal emitted by the labeled DNA.
Flow of FISH:
① Probe:
- ~100 kb (from a BAC clone of the human genome)
- tagged directly with fluorophores, with targets for antibodies or with biotin (by nick translation or PCR using tagged nucleotides).
② Interphase or metaphase chromosomes attached to a glass slide
③ Blocking the repetitive DNA
④ Hybridization
⑤ Detection with a fluorescence microscope
Development of FISH:
- Radioactively labeled in situ hybridization
  - Sensitivity
  - Resolution
- Fluorescence in situ hybridization
  - Repetitive DNA sequences
  - Mechanically stretched chromosomes (resolution reaches 200~300 kb)
  - Non-metaphase chromosomes (resolution down to 25 kb)
Application of FISH:
- Medical application
  - Discovering cytogenetic variation: deletions and translocations on chromosomes.
  - Detecting pathogens in samples of patient tissue.
- Academic research
  - Genome mapping
  - Genome comparison
6, Principle of STS (sequence tagged site) mapping:
STSs are short sequences that are operationally unique in the genome and are used to generate mapping reagents.
Principle of STS mapping (p96):
Assemble a collection of overlapping DNA fragments, then determine how often two STSs are separated by a break: the closer two STSs lie, the more often they are found on the same fragment.
The most common sources of STSs: (P98)
- ESTs (expressed sequence tags)
- SSLPs
- Random genomic sequences
7, Radiation hybrid (RH) mapping: Radiation hybrid (RH) map: A genome map in which STSs are positioned relative to one another on the basis of the frequency with which they are separated by radiation-induced breaks. The frequency is assayed by analysing a panel of human-hamster hybrid cell lines, each produced by lethally irradiating human cells and fusing them with recipient hamster cells such that each carries a collection of human chromosomal fragments. The unit of distance is the centiRay (cR), denoting a 1% chance of a break occurring between two loci. (p98)
Workflow of radiation hybrid mapping:
Generate the radiation hybrid cell lines (the panel) → choose the STSs → set up the PCR system and reaction conditions → process the PCR results → build the RH map
Map unit: the centiRay (cR_N), a 1% frequency of breakage between two markers after exposure of the DNA to an X-ray dose of N rad.
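The "process the PCR results" step can be sketched for a single pair of STSs. Each hybrid line is scored 1 (STS amplified) or 0 (absent); markers that are rarely separated by a break show the same pattern across the panel. The normalisation by retention frequencies and the conversion θ → −ln(1 − θ) used below are assumptions of this sketch (a commonly used two-point estimator), not something the notes specify, and the score vectors are invented.

```python
import math


def rh_two_point(scores_a: str, scores_b: str):
    """Estimate breakage frequency and distance (cR) between two STSs.

    scores_*: one character per hybrid cell line, '1' = retained, '0' = absent.
    """
    if len(scores_a) != len(scores_b) or not scores_a:
        raise ValueError("score vectors must be non-empty and of equal length")
    n = len(scores_a)
    discordant = sum(a != b for a, b in zip(scores_a, scores_b))
    r_a = scores_a.count("1") / n  # retention frequency of marker A
    r_b = scores_b.count("1") / n  # retention frequency of marker B
    # Discordance expected if the markers were always separated by a break
    expected = r_a + r_b - 2 * r_a * r_b
    theta = min(discordant / (n * expected), 1.0) if expected > 0 else 0.0
    distance_cr = -100 * math.log(1 - theta) if theta < 1 else math.inf
    return theta, distance_cr


# Hypothetical panel of 10 hybrid lines, discordant in 2 of them
theta, distance = rh_two_point("1100110010", "1100100011")
print(round(theta, 2), round(distance, 1))
```

A real panel has around a hundred lines and the map is built from all marker pairs at once; the two-point value only shows where the cR unit comes from.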
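8, Clone libraries versus radiation hybrid cell lines as STS mapping reagents:
                                                   Clone library    Radiation hybrid cell line
Foreign fragments per clone / cell line            one              many
Number of clones / lines needed                    many             few
Usable directly for sequencing or clone contigs    yes              no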
IV. Genome sequencing and data mining
1, Strategies for genome sequencing:
- Clone contig method: top-down
- Whole-genome shotgun method: bottom-up
2, Species suited to whole-genome shotgun sequencing: small genomes, including prokaryotes and viruses;
Limiting factors:
3, Difficulties in genome sequencing: ① Repeats: tandem repeats; genome-wide repeats; ② Gaps
4, DNA sequencing methods:
① Chain termination DNA sequencing (Sanger et al, 1977): the sequence of a single-stranded DNA molecule is determined by enzymatic synthesis of complementary polynucleotide chains, these chains terminating at specific nucleotide positions; detected by polyacrylamide gel electrophoresis
② Chemical degradation method (Maxam and Gilbert, 1977): the sequence of a double-stranded DNA molecule is determined by treatment with chemicals that cut the molecule at specific nucleotide positions; detected by polyacrylamide gel electrophoresis
③ Pyrosequencing: can be used to determine very short sequences rapidly; no electrophoresis needed
5, Assembling a contiguous DNA sequence:
① Assembly by the whole-genome shotgun method:
Advantages: sequencing is fast and works even when no genetic or physical map exists. (Key features: clone libraries are made in at least two different types of vector, and at least one library must contain inserts longer than the longest repeat in the genome under study.)
② Assembly by the clone contig method:
A clone contig can be built by chromosome walking (slow and laborious); the alternative is clone fingerprinting: restriction patterns; repetitive DNA fingerprints; repetitive DNA PCR; STS content mapping
6, Clone contig: A collection of clones whose DNA fragments overlap.
How to order clones into contigs:
a) Chromosome walking: A technique that can be used to construct a clone contig by identifying overlapping fragments of cloned DNA.
b) Clone fingerprinting: Any one of several techniques that compare cloned DNA fragments in order to identify ones that overlap.
7, A scaffold is a portion of the genome sequence reconstructed from end-sequenced whole-genome shotgun clones. Scaffolds are composed of contigs and gaps.
A contig is a contiguous length of genomic sequence in which the order of bases is known to a high confidence level.
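How overlapping reads become a contig can be shown with a toy greedy assembler: repeatedly merge the two reads with the longest suffix-prefix overlap. This is an illustrative sketch with invented reads and hypothetical function names; real assemblers handle sequencing errors, both strands and repeats, none of which is attempted here.

```python
def overlap(a: str, b: str, min_len: int = 3) -> int:
    """Length of the longest suffix of a that is a prefix of b (0 if < min_len)."""
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(reads, min_len: int = 3):
    """Merge reads pairwise by best overlap until no overlap remains."""
    reads = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while True:
        best = (0, None, None)
        for i, a in enumerate(reads):
            for j, b in enumerate(reads):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best[0]:
                        best = (k, i, j)
        k, i, j = best
        if k == 0:
            return reads  # each remaining string is one contig
        merged = reads[i] + reads[j][k:]
        reads = [r for idx, r in enumerate(reads) if idx not in (i, j)] + [merged]


# Three hypothetical shotgun reads that tile a 17-bp region
print(greedy_assemble(["ATGGCGTAC", "GTACGTTAG", "TTAGCCA"]))
# ['ATGGCGTACGTTAGCCA']
```

If two reads end in the same repeat, the greedy choice can join fragments that are not adjacent in the genome, which is the repeat problem described above and the reason long-insert libraries are needed.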
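8, Sequence gaps and physical gaps: P118; how to close them
9, Improvements for large-scale automated sequencing:
- Thermal cycle sequencing
- Fluorescent primers are the basis of automated sequence reading
- Capillary electrophoresis (CE) instead of polyacrylamide gel electrophoresis (PAGE)
New, unconventional sequencing methods:
- Pyrosequencing (p115)
  - Sequencing-by-synthesis
  - Ultra-high-throughput sequencing
  Principle: Step 1: the sequencing primer is hybridized to a PCR-amplified, single-stranded DNA template and incubated with the enzymes DNA polymerase, ATP sulfurylase, luciferase and apyrase, and with the substrates adenosine 5′ phosphosulfate (APS) and luciferin.
  Step 2: one of the four dNTPs (dATPαS, dTTP, dCTP, dGTP) is added. If it pairs with the template (A-T, C-G), it is joined covalently to the end of the primer and its pyrophosphate group (PPi) is released.
  Step 3: in the presence of APS, ATP sulfurylase converts the pyrophosphate into ATP; the ATP drives the luciferase-mediated conversion of luciferin to oxyluciferin, which emits visible light in proportion to the amount of ATP.
  Step 4: apyrase degrades the ATP and the unincorporated dNTP, which quenches the light signal and regenerates the reaction mixture.
  Step 5: the next dNTP is added.
- DNA chip: based on DNA hybridization
- Solexa and GS20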
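The five pyrosequencing steps amount to a loop over nucleotide dispensations, with a light signal proportional to the number of bases incorporated at each step. The simulation below is an idealised sketch: the dispensation order ACGT, the function names and the example read are assumptions, and signal noise and incomplete extension are ignored.

```python
def pyrogram(read: str, dispensation: str = "ACGT", max_cycles: int = 50):
    """Simulate the light signal for each nucleotide dispensation.

    read: the strand being synthesised (5'->3'). Returns a list of
    (nucleotide, signal) pairs, where signal is the number of bases
    incorporated, i.e. the length of the homopolymer run.
    """
    pos, signals = 0, []
    for _ in range(max_cycles):
        for nt in dispensation:
            run = 0
            while pos < len(read) and read[pos] == nt:
                run += 1  # one PPi released per incorporated base
                pos += 1
            signals.append((nt, run))
        if pos == len(read):
            break
    return signals


def call_bases(signals) -> str:
    """Read the sequence back from the signal heights."""
    return "".join(nt * height for nt, height in signals)


flow = pyrogram("GGATTC")
print(flow[:4])          # [('A', 0), ('C', 0), ('G', 2), ('T', 0)]
print(call_bases(flow))  # GGATTC
```

The homopolymer run GG gives a single peak of height 2; distinguishing long runs from the peak height alone is a known weak point of this chemistry.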
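V. Genomes of different model organisms:
【1】Microbial genomes
1.1 Viral genomes
1, Kinds of virus: true viruses, prions and subviral agents (virusoids and viroids);
2, Hypotheses for the origin of viruses:
- Regressive hypothesis: viruses may once have been small cells living as parasites inside larger cells. Over time, the genes not needed for a parasitic life were gradually lost.
- Cellular origin (escape) hypothesis: some viruses may have evolved from DNA or RNA that "escaped" from the genes of larger organisms.
- Coevolution hypothesis: viruses may have evolved from complexes of protein and nucleic acid that appeared on the early Earth at the same time as cells and have depended on cellular life ever since.
3, Diversity of viral genomes:
Nucleic acid: DNA; RNA (the virus carries its own RNA replicase)
Shape of genome: linear; circular; segmented
Strands of nucleic acid: double-stranded; single-stranded; partially double-stranded
Polarity:
- Sense (+): can be translated or transcribed directly by the host cell
- Antisense (−)
- Ambisense (+/−)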
4, Features of viral genomes:
- One kind of nucleic acid: DNA (commonly double-stranded) or RNA (commonly single-stranded)
- The size of viral genomes varies greatly (from 3×10^3 bp encoding 4 proteins to 1.2×10^6 bp encoding 100 proteins). The genome of a dsDNA virus is generally bigger than that of an RNA virus.
- Overlapping genes
- Viral genes are generally single-copy.
- Most of the genome is coding sequence. Phage genes are continuous, whereas genes of eukaryotic viruses can be discontinuous (genes with introns).
Features of DNA virus genomes:
- Size:
  - dsDNA genomes (herpesvirus, poxvirus etc.) are bigger (120~280 kb);
  - ssDNA genomes (parvovirus) are smaller (5 kb);
- High coding efficiency in small DNA viruses:
  - Overlapping genes
  - Both strands are used for coding.
- The inverted terminal repeat (ITR) in the genome of a DNA virus is important for replication initiation through hairpin formation.
- End replication problem for linear DNA genomes:
  - Terminal protein priming (adenovirus)
  - Site-specific nicking priming (poxvirus)
Features of RNA virus genomes:
- dsRNA genomes are segmented (e.g. reovirus).
- Many positive-strand ssRNA virus genomes (for example SARS coronavirus) have a 5' cap and a 3' poly(A) tail.
- Most ssRNA genomes are a single molecule, but there are exceptions (for example influenza virus).
- Overlapping genes, alternative splicing and frameshifting are common in RNA viruses.
Overlapping genes (OGs) are defined as adjacent genes whose coding sequences partially or entirely overlap. Many OGs have been identified in the genomes of prokaryotes, viruses, and mitochondria. Overlapping gene pairs can be divided into three types: unidirectional, convergent, and divergent.
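The three types of overlapping gene pair can be told apart from coordinates and strand alone. The sketch below assumes the usual reading of the terms (unidirectional: same strand; convergent: the 3' ends overlap, → ←; divergent: the 5' ends overlap, ← →); the `Gene` record and the coordinates are hypothetical, and fully nested genes on opposite strands are simply classified by the leftmost gene.

```python
from typing import NamedTuple


class Gene(NamedTuple):
    start: int   # 1-based, inclusive, with start < end on the reference
    end: int
    strand: str  # "+" or "-"


def classify_overlap(g1: Gene, g2: Gene) -> str:
    """Classify a gene pair as unidirectional, convergent or divergent."""
    if min(g1.end, g2.end) < max(g1.start, g2.start):
        return "no overlap"
    if g1.strand == g2.strand:
        return "unidirectional"
    left = g1 if g1.start <= g2.start else g2
    # Left gene on "+" runs towards the right gene on "-": 3' ends meet
    return "convergent" if left.strand == "+" else "divergent"


print(classify_overlap(Gene(100, 500, "+"), Gene(450, 900, "+")))  # unidirectional
print(classify_overlap(Gene(100, 500, "+"), Gene(450, 900, "-")))  # convergent
print(classify_overlap(Gene(100, 500, "-"), Gene(450, 900, "+")))  # divergent
```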
5, Problems in viral genome replication and the variety of solutions:
- Replication models
  ① Circular dsDNA virus: θ replication; σ (rolling-circle) replication
  ② Circular ssDNA virus
  ③ Linear dsDNA virus
- Primers and priming
- Integrity of the 5' end
6, Variation of virus genomes:
- Antigenic drift: point mutations (SNPs)
- Antigenic shift
- Reassortment of segmented genomes
- Genetic recombination
7, Diversity of expression strategies
- Timing control: cascade regulation of viral infection
- Protein biosynthesis of eukaryotic RNA viruses
  - Segmented genes
  - Splicing and assembly of peptides
  - IRES: picornavirus genomes contain an internal ribosome entry site (IRES)
  - Nested subgenomic RNAs
  - Discontinuous mRNA
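1.2 Prokaryotic genomes
1. Genome features:
① Size: generally less than 5 Mb, with exceptions, e.g. 30 Mb for Bacillus megaterium.
② Most prokaryotic genomes are circular, but some are linear.
③ Compact genome organization. Few non-coding sequences. Both strands carry coding sequences.
④ The operon is the characteristic structure of prokaryotic genomes.
⑤ Structural genes are generally single-copy, with some exceptions, e.g. the rrn genes coding for rRNA.
⑥ The genome of E. coli is replicated bidirectionally from a single origin, identified as the genetic locus oriC.
⑦ Lateral gene transfer: transfer of a gene from one species to another
2, Physical structure of the E. coli genome: circular; replicated bidirectionally; contains operons; genes are continuous, without introns;
3, Minimal genome: at least 265~350 genes are needed. It must contain at least the functional and regulatory genes required to sustain life, plus the genes needed for reproduction.
4, Deciphering prokaryotic genome sequences has made the concept of classifying bacterial species more complicated: prokaryotes can exchange genes in several ways, even though their biochemical and physiological properties place them in different species. Gene flow is central to the species concept, but the concept does not fit prokaryotes. Different strains of a single species can have very different genome sequences and even strain-specific genes.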
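【2】Eukaryotic genomes
2.1 Nuclear genome
1, Features of the nuclear genome:
1. Size: varies widely, 10^7~10^11 bp
2. Ploidy level: generally diploid
3. Each eukaryotic chromosome contains many replicons
4. Monocistronic mRNA
5. Repeat sequences
6. Discontinuous genes: exons, introns, splicing and alternative splicing
7. Non-coding regions: 90%
8. Gene density: gene deserts vs. gene islands
9. Overlapping genes and genes within genes are rare
2, Pseudogenes:
- Inactivated, non-functional copies of genes, usually denoted ψ.
- Types and how they arise:
① Conventional pseudogene: arises by DNA duplication and mutation; usually located near the functional copy of the homologous gene.
a, nonsense mutation: a stop codon appears inside the gene; b, inactivating promoter mutation; c, defective splicing signal; d, occasionally reactivated by a favourable mutation
② Processed pseudogene: formed when the mRNA of a functional gene is reverse transcribed into cDNA that inserts into the genome.
a, no introns.
b, no promoter. Pseudogenes derived from RNA polymerase III transcripts are an exception, because their promoters lie inside the transcribed sequence, e.g. Alu sequences.
3, Why do chromosome banding patterns and the isochore model suggest that genes are not evenly distributed on eukaryotic chromosomes? P207
4, Repetitive sequences in eukaryotes: P219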
5, Tandem repeats:
Type              Repeat unit length    Cluster length    Location and role
Satellite         <5 bp ~ >200 bp       ~Mb               centromere
Minisatellite     <25 bp                <20 kb            telomere
Microsatellite    1~4 bp, <13 bp        <150 bp           genome-wide
Origin: replication slippage; accumulation of mutations in saltatory replication
6, Two mechanisms of DNA transposition: replicative transposition; nonreplicative transposition
7, DNA transposons of prokaryotes:
- Insertion sequences (IS)
- Composite transposons: flanked by a pair of IS elements at the two ends, with one or more genes in between, often antibiotic resistance genes. Composite transposons rely on the IS transposase and transpose conservatively.
- Tn3-type transposons: carry their own transposase gene and need no IS sequences to transpose; Tn3 elements transpose replicatively.
- Transposable phages: bacterial viruses for which replicative transposition is a normal part of the life cycle. They can excise after insertion.
8, LTR elements:
- Retroviruses
- Endogenous retroviruses (ERVs) are retroviral genomes integrated into vertebrate chromosomes. Some are still active, but most are decayed relics.
- Retrotransposons have sequences similar to ERVs but are features of non-vertebrate eukaryotic genomes
9, How LTR elements are formed:
- The LTR (long terminal repeat) contains transcriptional promoter and enhancer sequences: U3 (contains a strong promoter)-R (direct repeat)-U5 (involved in transcription termination and polyadenylation)
- Formation of cDNA with direct LTRs during retrotransposition
- A 4-nt direct repeat is formed at the integration site in the genome
10, Retroposons
- LINE (long interspersed nuclear element): contains a reverse transcriptase gene.
  Example of a LINE in the human genome: L1
- SINE (short interspersed nuclear element): its transposition depends on reverse transcriptase provided by other autonomous retroelements.
  Example of a SINE in the human genome: Alu
11, Transposition mechanisms of LINEs and SINEs:
- LINE: a full-length LINE contains DNA endonuclease and reverse transcriptase genes.
  - Cutting the DNA at the target site provides the primer end.
  - The RNA of the retrotransposon serves as the template for cDNA synthesis.
- SINE: transposed by "borrowing" enzymes from other autonomous retroelements
12. C-value paradox: genome size does not track organismal complexity; large eukaryotic genomes have more repetitive sequence, more intergenic sequence and larger genes. (p211)
13. CpG islands:
- CpG islands are stretches (>200 bp) of unmethylated DNA with a GC content above 50% and a higher frequency of CpG dinucleotides than the genome as a whole.
- Most housekeeping genes have CpG islands at the 5' end of the transcript.
- More than 30,000 CpG islands are estimated for the human genome.
- CpG island methylation is correlated with gene inactivation and has been shown to be important during gene imprinting and tissue-specific gene expression
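The CpG island criteria can be checked numerically for a candidate sequence. The sketch below uses the length (>200 bp) and GC content (>50%) thresholds given above; the observed/expected CpG ratio threshold of 0.6 is an additional, commonly used criterion that is an assumption here, as are the function names and test sequences.

```python
def cpg_stats(seq: str):
    """Return (GC content, observed/expected CpG ratio) for a sequence."""
    seq = seq.upper()
    n = len(seq)
    c, g = seq.count("C"), seq.count("G")
    cpg = seq.count("CG")  # CpG dinucleotides on this strand
    gc_content = (c + g) / n if n else 0.0
    # Expected CpG count if C and G were placed independently: c * g / n
    obs_exp = cpg * n / (c * g) if c and g else 0.0
    return gc_content, obs_exp


def looks_like_cpg_island(seq: str, min_len: int = 200,
                          min_gc: float = 0.5, min_obs_exp: float = 0.6) -> bool:
    gc_content, obs_exp = cpg_stats(seq)
    return len(seq) > min_len and gc_content > min_gc and obs_exp > min_obs_exp


# Hypothetical extremes: a pure CG repeat versus an AT repeat
print(looks_like_cpg_island("CG" * 150))  # True
print(looks_like_cpg_island("AT" * 150))  # False
```

In practice the test is applied in a sliding window along the genome; methylation status cannot be seen from the sequence and has to be measured separately.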
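2.2. Organelle genomes
1. Physical properties:
- The organelle genome is usually circular, but there is a great deal of variability among organisms.
- Copy number:
  - Human: about 800 mitochondria per cell × about 10 genomes each = 8000
  - Yeast: about 65 mitochondria per cell × about 100 genomes each = 6500
- Mitochondrial genome sizes are variable and are unrelated to the complexity of the organism
2. Features of the two types of mitochondrial genome:
- Human
  - Small genome (16.6 kb)
  - Compact, with very little spacer sequence
  - Contains a few overlapping genes
3. Features of chloroplast genomes
- Size: varies little between species; similar composition and similar size (100~200 kb), with about 200 genes, such as genes for rRNA, tRNA, ribosomal proteins and photosynthesis.
- Copy number
  - about 1000 copies in green algae
  - about 200 copies per cell in higher plants
- Feature: two large inverted repeats (IR regions) that encode rRNA; they can prevent intramolecular recombination and keep the composition stable.
4, The origins of organelle genomes
- Endosymbiont theory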
VI. Gene discovery and functional studies
1, At which steps is gene expression regulated:
2, ESTs (expressed sequence tags): a clone is picked at random from an existing cDNA library and the cDNA insert is sequenced in a single automated pass from its 5' or 3' end, giving a cDNA sequence of about 60-500 bp.
3, Transcription map:
- Markers: ESTs and complementary DNA
- Expressed (cDNA-represented) sequences total less than 3% of the whole genome. Most sequences, including most repeats, introns, pseudogenes and intergenic spacers, are not represented.
- Static vs. dynamic transcription maps.
- The transcription map is the bridge between structural genomics and functional genomics.
- Disadvantage: the function of regulatory sequences cannot be discovered from cDNA.
4, How a transcription map is made (flow chart of large-scale EST sequencing):
1) Construction of the cDNA library
2) Random single-pass sequencing
3) Quality checks on library and sequences
4) Clustering and contig analysis
5) ORF search
6) Functional classification and annotation (Gene Ontology)
7) Expression profiling
8) Alternative splicing analysis
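Step 5 of the flow chart, the ORF search, can be sketched as a scan of all six reading frames for the longest stretch from ATG to a stop codon. The function names and the example sequence are hypothetical, the standard genetic code is assumed, and ORFs that run off the end of an incomplete EST contig are ignored.

```python
STOPS = {"TAA", "TAG", "TGA"}


def revcomp(seq: str) -> str:
    """Reverse complement of a DNA sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def longest_orf(seq: str):
    """Return (strand, frame, start, end, length_nt) of the longest ORF.

    Coordinates are 0-based, end-exclusive, on the scanned strand; the
    length includes the stop codon. Returns None if no ATG...stop exists.
    """
    seq = seq.upper()
    best = None
    for strand, s in (("+", seq), ("-", revcomp(seq))):
        for frame in range(3):
            start = None
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon == "ATG":
                    start = i
                elif start is not None and codon in STOPS:
                    length = i + 3 - start
                    if best is None or length > best[4]:
                        best = (strand, frame, start, i + 3, length)
                    start = None
    return best


# Hypothetical contig: ATG AAA TTT GGG TAA in frame 2 of the forward strand
print(longest_orf("CCATGAAATTTGGGTAACC"))  # ('+', 2, 2, 17, 15)
```

5'-ESTs are the more useful input for this step, since they are more likely to contain the start of the coding region.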
5, Significance of the transcription map (significance of EST research):
- Construction of gene maps (gene expression profiles)
- Isolation and identification of new genes
- Comparative analysis of gene expression
- Discovery of new SNPs
- e-hybridization and e-PCR
- Alternative splicing
6, Features of 5' and 3' ESTs:
- 5'-EST:
  - Short 5' UTR (~300 bp); the coding region is highly conserved, which is convenient for finding ORFs and new genes
  - More regulatory information
  - Convenient for clustering and assembly of ESTs
- 3'-EST:
  - The 20~200 bp poly(A) tail of mRNA is convenient for first-strand cDNA synthesis with an oligo(dT) primer
  - The 3' UTR has long, specific non-coding sequences (~770 bp on average) with low conservation
  - 10% of mRNAs have repeats at the 3' end that can serve as SSR markers;
  - High specificity between organisms and high polymorphism between individuals
7, From EST to full-length cDNA (principle of RACE): (P194)
Rapid amplification of cDNA ends (RACE): a PCR-based technique for mapping the end of an mRNA molecule.
- 3'-RACE for a 5'-EST (P145)
- 5'-RACE for a 3'-EST
8, Methods for studying differences in gene expression:
Large-scale analysis of gene expression differences
- SSH (suppression subtractive hybridization); know the workflow!
- cDNA microarray (p172)
- SAGE (serial analysis of gene expression) (P171)
9, From traditional to large-scale techniques:
Based on hybridization: gene chip / cDNA microarray
SSH + cDNA microarray
Based on direct sequencing of a small fragment of representative cDNA: SAGE (serial analysis of gene expression)
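Comparison of EST, SSH, SAGE and microarray:
                            EST    SSH    SAGE   Microarray
Large-scale sequencing      yes    yes    yes    yes
Discovers new genes         yes    yes    yes    no
Sequence information        yes    yes    no     no
Principle
SSH: a simple, rapid method for isolating differentially expressed genes that combines subtractive hybridization with PCR. It relies on:
1, hybridization kinetics: during annealing, abundant single-stranded DNA finds its homologous partner faster than rare single-stranded DNA, so single-stranded DNAs of different abundance are equalized;
2, suppression PCR: intramolecular annealing is favoured over intermolecular annealing, so the inverted repeats at the two ends of non-target fragments form a hairpin-like structure on annealing and cannot pair with primers as template. Amplification of non-target fragments is selectively suppressed, and the target genes are enriched and isolated.
SAGE: a short oligonucleotide (9-11 bp) from a defined position within a transcript carries enough information to identify that transcript and can serve as its tag;
the tags are joined in series into long concatemers, each cloned concatemer is sequenced and analysed with SAGE software. This identifies which genes are expressed, gives their expression abundance from the frequency of each tag, and can reveal new genes.
Method (SSH)
1, Extract mRNA from the experimental (tester) and control (driver) groups, synthesize double-stranded cDNA and cut it with a restriction enzyme that recognizes a 4-base site
2, Divide the tester cDNA into two equal portions and ligate each to a different adaptor
3, Carry out two rounds of subtractive hybridization and suppression PCR
4, Obtain the enriched target genes
Advantages (SSH)
Two rounds of subtractive hybridization and two rounds of PCR give high specificity (the false-positive rate can fall to 6%); hybridization equalizes genes of different abundance, so low-abundance differentially expressed genes are recovered; the procedure is relatively simple and is currently the main method for isolating new genes.
Disadvantages
SSH: the starting material must be microgram (μg) quantities of mRNA; the subtracted clones are short, so obtaining the full-length cDNA sequence is somewhat difficult.
SAGE: tags differ in how efficiently they are amplified and ligated; cDNA synthesis may stop short of the AE restriction site; a tag that lies in a conserved sequence cannot be assigned to a particular gene; tag frequency is not strictly proportional to abundance.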
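SAGE workflow: ① First-strand cDNA synthesis is primed with biotinylated oligo(dT), followed by second-strand synthesis. The double-stranded cDNA is digested with an anchoring enzyme (AE) that recognizes a 4-bp site, e.g. NlaIII (recognition site CATG); this releases the 5' portion, while the biotinylated 3' end stays bound to streptavidin-coated magnetic beads;
② The bead-bound cDNA fragments carrying the 3' poly(A) tail are isolated and ligated to linkers (A and B) that contain the site of the tagging enzyme (TE), a type IIS restriction enzyme that cuts about 20 bp downstream of its recognition site. Digestion with the tagging enzyme then releases the SAGE tags with their linkers;
③ The linker-tags are blunt-ended with DNA polymerase (Klenow) and ligated to give ditags flanked by the two linkers. The ditags are amplified by PCR and digested with the anchoring enzyme, which yields tail-to-tail SAGE ditags with anchoring-enzyme sites at both ends;
④ The linker-free ditags are ligated to one another into concatemers of varying length; after electrophoresis, fragments of suitable size are collected and cloned into a high-copy plasmid vector to form the SAGE library.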
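The logic of SAGE, one short tag per transcript taken next to the 3'-most anchoring-enzyme site and counted to estimate abundance, can be mimicked in silico. The sketch assumes NlaIII (CATG) as anchoring enzyme and a 10-bp tag (within the 9-11 bp range quoted above); the transcripts and function names are invented for illustration.

```python
from collections import Counter


def sage_tag(cdna: str, anchor: str = "CATG", tag_len: int = 10):
    """Return the tag immediately 3' of the last anchoring-enzyme site.

    Returns None if the transcript has no site or the tag would be truncated.
    """
    cdna = cdna.upper()
    pos = cdna.rfind(anchor)  # 3'-most site, the one left on the beads
    if pos == -1:
        return None
    tag = cdna[pos + len(anchor): pos + len(anchor) + tag_len]
    return tag if len(tag) == tag_len else None


def tag_counts(transcripts):
    """Count tags across a pool of transcripts (a proxy for expression level)."""
    return Counter(t for t in map(sage_tag, transcripts) if t)


# Hypothetical mRNA pool: gene X present three times, gene Y once
gene_x = "AAACATGGGGCATGACGTACGTACTTTT"
gene_y = "GGGCATGTTTTGGGGCCCCAAAA"
print(sage_tag(gene_x))                              # ACGTACGTAC
print(tag_counts([gene_x, gene_x, gene_y, gene_x]))
```

The two failure cases handled by returning None correspond to disadvantages listed for SAGE: transcripts without an anchoring-enzyme site are invisible, and two genes sharing the same tag cannot be told apart.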
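10, Gene chip: an array of oligonucleotides on a solid support
Principle:
Methods: Spotted Microarrays
In Situ Oligo Synthesis
Microfluidics
Integrated Chips
Applications: gene expression analysis; SNP detection; screening/identification of specific sequences, etc.
Question: Is a gene chip the same as a microarray? What is the difference?
In the broad sense, gene chip means any oligonucleotide microarray; in the narrow sense it means a microarray of oligonucleotides synthesized in situ, used mainly to detect SNPs. Microarray usually refers to a cDNA microarray made by spotting and used to measure gene expression profiles.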
VII. Gene Cloning and Function Research
1, Strategies for cloning a target gene and representative methods:
① Functional cloning: Using information about the function of a known protein that could be involved in a genetic disease. This approach has very limited application.
② Phenotype cloning: large-scale mutagenesis by transposon tagging
Analysis of gene expression differences
③ Positional cloning: Using only information about the gene's approximate chromosomal location obtained from gene mapping
④ Positional candidate cloning: Using information from map position and the gene's possible function, homology, and expression pattern. This approach has been quite successful and will dominate other strategies.
2, What is positional cloning?
- The core problem of positional cloning: gene localization.
- Describing a gene's position on a chromosome:
  - Cytogenetic location: describes the rough position on the chromosome.
  - Molecular location: a gene's molecular address pinpoints the location of that gene in terms of base pairs.
- Methods for localizing a gene
Cytogenetic analysis; genome scan using molecular markers
Workflow of positional cloning:
4, How to localize a gene:
- Cytogenetic abnormality
- Genome scanning: looking for the markers closest to the disease gene. (Polymorphic DNA markers spaced at wide intervals are typed across the whole genome in a large number of samples, pedigrees or sib pairs; linkage or association analysis places the gene in certain chromosomal regions; high-density markers are then chosen in those regions for fine mapping to narrow the interval; finally all genes within the interval are examined and likely candidate genes are screened for variants.)
3 factors in gene localization by genome scan:
- Samples
- Polymorphic DNA markers
  - RFLP markers
  - Microsatellite or STR markers
  - Single nucleotide polymorphisms (SNPs)
- Statistical methods
  - Linkage analysis
  - Association analysis
5, Linkage disequilibrium (LD): a term used in population genetics for the non-random association of alleles at two or more loci, not necessarily on the same chromosome.
- P(disease & M) ≠ P(disease) × P(M)
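The inequality above is usually quantified by the coefficient D = P(disease & M) − P(disease) × P(M) and its normalised form r². The sketch below computes both from haplotype and allele frequencies; the function name and the example frequencies are assumptions chosen for illustration.

```python
def ld_stats(p_ab: float, p_a: float, p_b: float):
    """Return (D, r^2) for two biallelic loci.

    p_ab: frequency of haplotypes carrying both alleles (e.g. disease
    allele and marker allele M); p_a, p_b: the two allele frequencies.
    """
    d = p_ab - p_a * p_b  # zero when the alleles associate at random
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    r2 = d * d / denom if denom > 0 else 0.0
    return d, r2


# Hypothetical frequencies: both alleles at 0.5, found together in 40% of haplotypes
d, r2 = ld_stats(0.4, 0.5, 0.5)
print(round(d, 3), round(r2, 3))  # 0.15 0.36

# Linkage equilibrium: P(AB) equals P(A) * P(B)
print(ld_stats(0.25, 0.5, 0.5))   # (0.0, 0.0)
```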
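Linkage analysis: uses the recombination fraction seen in pedigree data to calculate the map distance between two loci. Depending on whether a suitable mode of inheritance is known for the disease, parametric or non-parametric analysis is used.
Parametric analysis: requires a specified mode of inheritance, gene frequency and penetrance, and calculates the logarithm of odds (LOD) score. It finds markers linked to a disease gene efficiently, but a mis-specified model can lead to wrong conclusions. Mainly suited to mapping genes for monogenic diseases with a known mode of inheritance.
Non-parametric analysis: for pairs of affected relatives in disease pedigrees, compare how often the two share, at the same locus, an allele inherited from a common ancestor. If the sharing differs significantly from the frequency expected under independent Mendelian segregation, the marker is taken to be linked to the disease gene. Suited to polygenic diseases and able to find several linked loci, but it does not give the map distance to the disease gene.
6, Genome-wide association analysis: tests whether alleles at a marker locus are in linkage disequilibrium (LD) with the disease gene. The closer the marker lies to the disease gene, the lower the mutation rate and the higher the heterozygosity, the greater the chance that the marker will detect the disease locus.
7, Difference between linkage analysis and association analysis:
- Association analysis compares marker allele frequencies with disease status across samples to judge whether linkage disequilibrium exists and how strong the association is.
- Linkage analysis follows how marker alleles and the disease gene are inherited within pedigrees to judge whether they are linked and how tightly (map distance).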
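8, LOD score:
- The LOD score is the base-10 logarithm of the ratio between the likelihood that two loci are linked at a given recombination fraction θ (θ < 0.5) and the likelihood that they are unlinked (θ = 0.5):
  $\mathrm{LOD}(\theta) = \log_{10}\dfrac{L(\theta)}{L(\theta = 0.5)}, \quad \theta < 0.5$
- Statistical significance of the LOD score:
  - LOD score > 3: evidence of linkage
  - 2 < LOD score < 3: suggestive evidence of linkage
  - -2 < LOD score < 2: uninformative
  - LOD score < -2: exclusion of linkage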
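For the simplest case the LOD score can be computed by hand. The sketch below assumes phase-known, fully informative meioses, so that the likelihood is L(θ) = θ^k (1 − θ)^(n − k) for k recombinants among n meioses; real pedigree programs handle unknown phase, penetrance and missing data, and the function names and counts here are hypothetical.

```python
import math


def lod_score(recombinants: int, meioses: int, theta: float) -> float:
    """LOD = log10[L(theta) / L(0.5)] with L(theta) = theta^k (1-theta)^(n-k)."""
    k, n = recombinants, meioses
    if not 0 <= k <= n or n == 0:
        raise ValueError("need 0 <= recombinants <= meioses and meioses > 0")
    if not 0.0 <= theta <= 0.5 or (theta == 0.0 and k > 0):
        raise ValueError("theta must be in [0, 0.5], and > 0 if recombinants exist")
    log_l_theta = (k * math.log10(theta) if k else 0.0) \
        + (n - k) * math.log10(1 - theta)
    log_l_null = n * math.log10(0.5)
    return log_l_theta - log_l_null


def best_theta(recombinants: int, meioses: int) -> float:
    """Recombination fraction (on a 0.01 grid) that maximises the LOD score."""
    grid = [t / 100 for t in range(0 if recombinants == 0 else 1, 51)]
    return max(grid, key=lambda t: lod_score(recombinants, meioses, t))


# Hypothetical family data: 10 informative meioses
print(round(lod_score(0, 10, 0.0), 2))  # 3.01, just above the threshold of 3
print(best_theta(1, 10))                # 0.1
print(round(lod_score(1, 10, 0.1), 2))  # 1.6, not yet suggestive
```

LOD scores from independent families are added together, which is how a threshold of 3 is reached in practice.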
9, Transcript identification (p188~195)
ORF scanning: computer methods;
Hybridization tests: Northern blotting; zoo blotting
cDNA sequencing (P192);
Exon trapping (P195);
10, Two types of homologous sequences: paralogs and orthologs
Homologous genes are ones that share a common evolutionary ancestor, revealed by sequence similarities between the genes.
Orthologous genes are those homologs that are present in different organisms and whose common ancestor predates the split between the species.
Paralogous genes are present in the same organism, often as members of a recognized multigene family; their common ancestor may or may not predate the species in which the genes are now found.
11, Strategies and methods for verifying gene function experimentally:
- Expression patterns
  RNA expression assayed by Northern blot or by PCR amplification of cDNA with primers specific to the candidate transcript
  - Look for misexpression (no expression, underexpression, overexpression)
- Sequence differences
  - Missense mutations identified by sequencing the coding region of the candidate gene from normal and abnormal individuals
- Artificially interfering with the expression of the gene
12, Assigning gene function by experimental analysis:
Gene knock-out; RNAi; gene trap; gene overexpression
13, RNAi: the process in which dsRNA inside cells triggers specific degradation of the homologous mRNA and so suppresses expression of the corresponding gene.
- A form of post-transcriptional gene silencing
- A mechanism found widely in living organisms that suppresses harmful RNA entering from outside
  - Virus RNA
  - Retroelements
  - Transgenic dsRNA
14, Gene transfer methodology:
1) Physical methods
- Microinjection
- Electroporation
- Particle bombardment (gene gun)
- Electrofusion
2) Biological methods
- Retrovirus-mediated gene transfer
3) Chemical methods
- Lipofection
  - TransMessenger, RNAiFect, HiPerFect (Qiagen, for siRNA)
- Non-liposomal lipids: FuGENE 6 (Roche)
- Diethylaminoethyl (DEAE)-dextran
- Calcium phosphate coprecipitation
4) Other methods
- Nuclear transplantation for embryo cells or ES cells
15, Gene targeting and transgenic mice
- Mechanism: homologous recombination
- Steps
- Selection markers
- Conditional knock-out: Cre-loxP / specific promoter
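VIII. Comparative Genomics and Genome Evolution (2h)
1, The basic mechanisms in population evolution:
- Variation
- Selection
  - Natural selection
  - Neutral drift / random drift
- Reproductive isolation
2, Main points of the neutral theory:
(1) For each kind of biological macromolecule, as long as the tertiary structure and function remain essentially unchanged, the rate of evolution, expressed as substitutions per site per year, is roughly constant in every lineage.
(2) Functionally less important molecules or parts of molecules evolve faster than functionally more important ones.
(3) During molecular evolution, mutations that disrupt the existing structure and function less are substituted at a higher rate than more disruptive ones.
(4) Gene duplication usually precedes the appearance of a gene with a new function.
(5) Selective removal of clearly deleterious mutations and random fixation of selectively neutral or slightly deleterious mutations are more frequent than positive Darwinian selection of clearly advantageous mutations.
3, Genetic drift:
Hardy-Weinberg law of population genetics: in an infinitely large, randomly mating population without mutation, migration or selection, allele frequencies and genotype frequencies stay constant from generation to generation. (1908)
Neutral mutations do not affect survival or reproduction, so natural selection does not act on them. Their persistence, spread and loss in a population are entirely random, which makes the frequency of an allele fluctuate considerably from one generation to the next.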
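The contrast between the Hardy-Weinberg expectation (infinite population, constant frequencies) and drift (finite population, random fluctuation) can be made concrete. The sketch below computes the Hardy-Weinberg genotype frequencies p², 2pq and q², and simulates drift with a simple Wright-Fisher style resampling of 2N gene copies per generation; the model choice, function names and parameter values are assumptions for illustration.

```python
import random


def hardy_weinberg(p: float):
    """Genotype frequencies (AA, Aa, aa) for allele frequency p of A."""
    q = 1 - p
    return p * p, 2 * p * q, q * q


def drift(p0: float, pop_size: int, generations: int, seed: int = 0):
    """Allele frequency over time in a finite diploid population.

    Each generation draws 2 * pop_size gene copies at random from the
    previous generation; a neutral allele, so no selection term.
    """
    rng = random.Random(seed)
    p, path = p0, [p0]
    for _ in range(generations):
        copies = sum(rng.random() < p for _ in range(2 * pop_size))
        p = copies / (2 * pop_size)
        path.append(p)
    return path


print(hardy_weinberg(0.5))  # (0.25, 0.5, 0.25)

# The smaller the population, the larger the random fluctuation
for n in (10, 1000):
    path = drift(0.5, n, generations=20, seed=1)
    print(n, round(min(path), 2), round(max(path), 2))
```

Once the frequency reaches 0 or 1 it stays there: the allele has been lost or fixed purely by chance, which is the random fixation the neutral theory refers to.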
4, The molecular basis for variation and evolution:
- DNA duplication (p465~473)
  - By duplication of the entire genome;
  - By duplication of a single chromosome or part of a chromosome;
  - By duplication of a single gene or group of genes.
- Mutation (from errors in DNA replication)
- Recombination
  - Homologous recombination
  - Transposition
- Horizontal gene transfer
5, Synteny:
- Meaning
  - conservation of gene order between different genomes
  - may be seen in corresponding segments of different genomes
  - may also be seen at different chromosomal positions within one genome
- Significance: the degree of synteny between two species can serve as a measure of the evolutionary distance between them. Highly conserved and highly variable segments should be avoided in the analysis.
6, Model organisms of the Human Genome Project:
- Escherichia coli
- Budding yeast (Saccharomyces cerevisiae)
- Fruit fly (Drosophila melanogaster)
- Nematode (Caenorhabditis elegans)
- Mouse (Mus musculus)
- Thale cress (Arabidopsis thaliana)