
[Development Tools] xindi data analysis notes

1. Run quality-control checks on the data with FastQC

fastqc -t 16 -o ${dir}/fastqc_report/ ${dir}/clean_data/*.fq.gz
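
FastQC writes its reports into the directory given with -o but does not create that directory, so it has to exist before the command above runs. A minimal sketch, assuming `dir` points at the project root; `prepare_fastqc_dir` is a hypothetical helper name, not part of FastQC:

```shell
# Hypothetical helper: create <base>/fastqc_report if needed and print its path.
prepare_fastqc_dir() {
  mkdir -p "$1/fastqc_report" && printf '%s\n' "$1/fastqc_report"
}

# Usage (assumes ${dir} is set and fastqc is on PATH):
#   out=$(prepare_fastqc_dir "${dir}")
#   fastqc -t 16 -o "${out}" "${dir}"/clean_data/*.fq.gz
```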

2. Trim the three datasets with Trim Galore, discarding reads shorter than 20 bp

1) Process the HeLa cell data

trim_galore -q 20 --phred33 --stringency 3 --length 20 -e 0.1 -j 16 --paired /Data/lizexing/projects/xindi/Data/new/Data/CleanData/T_HeLa_T_PAR_CLIP_Clean_Data1.fq.gz /Data/lizexing/projects/xindi/Data/new/Data/CleanData/T_HeLa_T_PAR_CLIP_Clean_Data2.fq.gz

trim_galore -q 20 --phred33 --stringency 3 --length 20 -e 0.1 -j 16 --paired /Data/lizexing/projects/xindi/2021_11_16/CleanData/GFP_HeLa_Clean_Data1.fq.gz /Data/lizexing/projects/xindi/2021_11_16/CleanData/GFP_HeLa_Clean_Data2.fq.gz

trim_galore -q 20 --phred33 --stringency 3 --length 20 -e 0.1 -j 16 --paired /Data/lizexing/projects/xindi/2021_11_16/CleanData/Input_HeLa_Clean_Data1.fq.gz /Data/lizexing/projects/xindi/2021_11_16/CleanData/Input_HeLa_Clean_Data2.fq.gz

2) Process the HCT116 cell data

trim_galore -q 20 --phred33 --stringency 3 --length 20 -e 0.1 -j 16 --paired /Data/lizexing/projects/xindi/Data/new/Data/CleanData/T_HCT116_T_PAR_CLIP_Clean_Data1.fq.gz /Data/lizexing/projects/xindi/Data/new/Data/CleanData/T_HCT116_T_PAR_CLIP_Clean_Data2.fq.gz

trim_galore -q 20 --phred33 --stringency 3 --length 20 -e 0.1 -j 16 --paired /Data/lizexing/projects/xindi/2021_11_16/CleanData/GFP_HCT116_Clean_Data1.fq.gz /Data/lizexing/projects/xindi/2021_11_16/CleanData/GFP_HCT116_Clean_Data2.fq.gz

trim_galore -q 20 --phred33 --stringency 3 --length 20 -e 0.1 -j 16 --paired /Data/lizexing/projects/xindi/2021_11_16/CleanData/Input_HCT116_Clean_Data1.fq.gz /Data/lizexing/projects/xindi/2021_11_16/CleanData/Input_HCT116_Clean_Data2.fq.gz

3) Process the 293T cell data

trim_galore -q 20 --phred33 --stringency 3 --length 20 -e 0.1 --paired /Data/lizexing/projects/xindi/Data/new/Data/CleanData/T_293T_T_PAR_CLIP_Clean_Data1.fq.gz /Data/lizexing/projects/xindi/Data/new/Data/CleanData/T_293T_T_PAR_CLIP_Clean_Data2.fq.gz

trim_galore -q 20 --phred33 --stringency 3 --length 20 -e 0.1 -j 16 --paired /Data/lizexing/projects/xindi/2021_11_16/CleanData/GFP_293T_Clean_Data1.fq.gz /Data/lizexing/projects/xindi/2021_11_16/CleanData/GFP_293T_Clean_Data2.fq.gz

trim_galore -q 20 --phred33 --stringency 3 --length 20 -e 0.1 -j 16 --paired /Data/lizexing/projects/xindi/2021_11_16/CleanData/Input_293T_Clean_Data1.fq.gz /Data/lizexing/projects/xindi/2021_11_16/CleanData/Input_293T_Clean_Data2.fq.gz
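
The nine commands above differ only in the input paths, so they can be generated from the R1 file name. The sketch below is a dry run that prints commands instead of executing them. It assumes the R2 file differs from R1 only by `Data1` becoming `Data2`, and it adds `-o` so the trimmed files land next to the inputs, which is where the later STAR commands read them from. `trim_cmd` and `val_names` are hypothetical helper names.

```shell
# Dry run: print the trim_galore command for one sample, given its R1 file.
# Assumes the mate file differs only by "Data1" -> "Data2" in the name.
trim_cmd() (
  r1=$1
  r2="${r1%Data1.fq.gz}Data2.fq.gz"
  outdir=$(dirname "$r1")
  printf 'trim_galore -q 20 --phred33 --stringency 3 --length 20 -e 0.1 -j 16 --paired -o %s %s %s\n' \
    "$outdir" "$r1" "$r2"
)

# Print the two trimmed file names (_val_1 / _val_2) that paired-end
# Trim Galore produces and that the STAR alignment step takes as input.
val_names() (
  r1=$1
  stem="${r1%1.fq.gz}"   # e.g. .../GFP_HeLa_Clean_Data
  printf '%s1_val_1.fq.gz %s2_val_2.fq.gz\n' "$stem" "$stem"
)

trim_cmd /Data/lizexing/projects/xindi/2021_11_16/CleanData/GFP_HeLa_Clean_Data1.fq.gz
```

Piping the printed line to `sh` would run it; printing first makes it easy to check the paths for all samples before starting long jobs.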

3. Convert the 45S and 5S rRNA GFF3 annotation files to GTF format with gffread 0.12.1

Reference article: gffcompare and gffread

Usage: gffread <input_gff> [-g <genomic_seqs_fasta> | <dir>][-s <seq_info.fsize>]
 [-o <outfile>] [-t <trackname>] [-r [[<strand>]<chr>:]<start>..<end> [-R]]
 [-CTVNJMKQAFPGUBHZWTOLE] [-w <exons.fa>] [-x <cds.fa>] [-y <tr_cds.fa>]
 [-i <maxintron>] [--stream] [--bed] [--table <attrlist>] [--sort-by <ref.lst>]

(base) lizexing@bio:~/reference/h_45S_rDNA$ gffread U13369.1.gff3 -T -o U13369.1.gtf
(base) lizexing@bio:~/reference/h_5S_rDNA$ gffread NR_023363.1.gff3 -T -o NR_023363.1.gtf
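
The featureCounts step later in these notes counts `-t exon` features grouped by `-g gene_id`, so it may be worth confirming that the converted GTF actually contains exon rows carrying a gene_id. A minimal sketch; `count_exon_rows` is a hypothetical helper, and the check assumes the usual tab-separated nine-column GTF layout:

```shell
# Hypothetical helper: count GTF rows whose feature type (column 3) is "exon"
# and whose attribute column (column 9) carries a gene_id.
count_exon_rows() {
  awk -F'\t' '$3 == "exon" && $9 ~ /gene_id "/ { n++ } END { print n + 0 }' "$1"
}

# Usage (file name taken from the gffread command above):
#   count_exon_rows U13369.1.gtf
```

A result of 0 would suggest that featureCounts has nothing to count with those settings.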

4. Build STAR indexes for the 45S and 5S rRNA references and for GRCh38.dna.primary_assembly, GRCh38.ncRNA, and GRCh38.cds.all

Reference article: Using the STAR aligner

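# Parameter notes
--runThreadN: number of threads (CPU cores) to run with;
--genomeDir: directory where the generated index files are written;
--genomeFastaFiles: path to the genome FASTA file(s) to index;
--limitGenomeGenerateRAM 43749387189: maximum RAM, in bytes, that index generation may use. STAR needs a lot of memory at this step, so the limit is set explicitly to keep the run from failing. Thanks to Sun Xiaoyu (孙小雨) for the help.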

(base) lizexing@bio:~$ STAR  --runMode genomeGenerate --runThreadN 16 --genomeDir /Data/lizexing/reference/h_45S_rDNA/ --genomeFastaFiles /Data/lizexing/reference/h_45S_rDNA/U13369.1.fasta
Sep 05 14:14:23 ..... started STAR run
Sep 05 14:14:23 ... starting to generate Genome files
!!!!! WARNING: --genomeSAindexNbases 14 is too large for the genome size=42999, which may cause seg-fault at the mapping step. Re-run genome generation with recommended --genomeSAindexNbases 6
Sep 05 14:14:23 ... starting to sort Suffix Array. This may take a long time...
Sep 05 14:14:23 ... sorting Suffix Array chunks and saving them to disk...
Sep 05 14:14:23 ... loading chunks from disk, packing SA...
Sep 05 14:14:23 ... finished generating suffix array
Sep 05 14:14:23 ... generating Suffix Array index
Sep 05 14:14:26 ... completed Suffix Array index
Sep 05 14:14:26 ... writing Genome to disk ...
Sep 05 14:14:26 ... writing Suffix Array to disk ...
Sep 05 14:14:26 ... writing SAindex to disk
Sep 05 14:14:28 ..... finished successfully

(base) lizexing@bio:~$ STAR  --runMode genomeGenerate --runThreadN 16 --genomeDir /Data/lizexing/reference/h_5S_rDNA/ --genomeFastaFiles /Data/lizexing/reference/h_5S_rDNA/NR_023363.1.fasta
Dec 15 19:47:24 ..... started STAR run
Dec 15 19:47:24 ... starting to generate Genome files
!!!!! WARNING: --genomeSAindexNbases 14 is too large for the genome size=121, which may cause seg-fault at the mapping step. Re-run genome generation with recommended --genomeSAindexNbases 2
Dec 15 19:47:24 ... starting to sort Suffix Array. This may take a long time...
Dec 15 19:47:24 ... sorting Suffix Array chunks and saving them to disk...
Dec 15 19:47:24 ... loading chunks from disk, packing SA...
Dec 15 19:47:24 ... finished generating suffix array
Dec 15 19:47:24 ... generating Suffix Array index
Dec 15 19:47:27 ... completed Suffix Array index
Dec 15 19:47:27 ... writing Genome to disk ...
Dec 15 19:47:27 ... writing Suffix Array to disk ...
Dec 15 19:47:27 ... writing SAindex to disk
Dec 15 19:47:31 ..... finished successfully

(base) lizexing@bio:~/reference/Ensembl_GRCh38$ STAR  --runMode genomeGenerate --runThreadN 40 --limitGenomeGenerateRAM 82424365322 --genomeDir /Data/lizexing/reference/Ensembl_GRCh38/star_dna_primary_assembly_index --genomeFastaFiles /Data/lizexing/reference/Ensembl_GRCh38/Homo_sapiens.GRCh38.dna.primary_assembly.fa
Mar 06 14:29:42 ..... started STAR run
Mar 06 14:29:42 ... starting to generate Genome files
Mar 06 14:30:58 ... starting to sort Suffix Array. This may take a long time...
Mar 06 14:31:18 ... sorting Suffix Array chunks and saving them to disk...
Mar 06 14:44:13 ... loading chunks from disk, packing SA...
Mar 06 14:45:46 ... finished generating suffix array
Mar 06 14:45:46 ... generating Suffix Array index
Mar 06 14:49:53 ... completed Suffix Array index
Mar 06 14:49:53 ... writing Genome to disk ...
Mar 06 14:49:55 ... writing Suffix Array to disk ...
Mar 06 14:50:18 ... writing SAindex to disk
Mar 06 14:50:20 ..... finished successfully

(base) lizexing@bio:~/reference/Ensembl_GRCh38$ STAR  --runMode genomeGenerate --runThreadN 16 --limitGenomeGenerateRAM 82424365322 --genomeDir /Data/lizexing/reference/Ensembl_GRCh38/star_cds_all_index --genomeFastaFiles /Data/lizexing/reference/Ensembl_GRCh38/Homo_sapiens.GRCh38.cds.all.fa
Mar 05 10:59:02 ..... started STAR run
Mar 05 10:59:03 ... starting to generate Genome files
!!!!! WARNING: --genomeSAindexNbases 14 is too large for the genome size=137654284, which may cause seg-fault at the mapping step. Re-run genome generation with recommended --genomeSAindexNbases 12
Mar 05 11:00:53 ... starting to sort Suffix Array. This may take a long time...
Mar 05 11:02:49 ... sorting Suffix Array chunks and saving them to disk...
Mar 05 11:04:45 ... loading chunks from disk, packing SA...
Mar 05 11:05:50 ... finished generating suffix array
Mar 05 11:05:50 ... generating Suffix Array index
Mar 05 11:06:41 ... completed Suffix Array index
Mar 05 11:06:41 ... writing Genome to disk ...
Mar 05 11:07:17 ... writing Suffix Array to disk ...
Mar 05 11:07:18 ... writing SAindex to disk
Mar 05 11:07:19 ..... finished successfully

(base) lizexing@bio:~/reference/Ensembl_GRCh38$ STAR  --runMode genomeGenerate --runThreadN 16 --limitGenomeGenerateRAM 82424365322 --genomeDir /Data/lizexing/reference/Ensembl_GRCh38/star_ncrna_index --genomeFastaFiles /Data/lizexing/reference/Ensembl_GRCh38/Homo_sapiens.GRCh38.ncrna.fa
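
Three of the logs above warn that the default `--genomeSAindexNbases 14` is too large and recommend 6, 2 and 12. The STAR manual gives the rule for small genomes as min(14, log2(GenomeLength)/2 - 1), and the sketch below reproduces those three recommendations from the genome sizes printed in the warnings. `sa_index_nbases` is a hypothetical helper name.

```shell
# Suggested --genomeSAindexNbases for a genome of the given total length (bases):
# min(14, floor(log2(length) / 2 - 1)).
sa_index_nbases() {
  awk -v n="$1" 'BEGIN { v = int(log(n) / log(2) / 2 - 1); if (v > 14) v = 14; print v }'
}

sa_index_nbases 42999       # 45S rDNA (U13369.1)   -> 6
sa_index_nbases 121         # 5S rDNA (NR_023363.1) -> 2
sa_index_nbases 137654284   # GRCh38.cds.all        -> 12
```

If the small indexes cause problems at the mapping step, as the warnings suggest they may, re-running genome generation with the matching `--genomeSAindexNbases` value is the fix STAR itself proposes.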

5. STAR alignment usage and output files

Usage: STAR  [options]... --genomeDir /path/to/genome/index/   --readFilesIn R1.fq R2.fq
Options, one per line (join them on a single command line to run):
--runThreadN 40    # number of threads
--runMode alignReads    # alignment mode
--readFilesCommand zcat    # the FASTQ files are gzip-compressed (names end in .gz); without this option STAR reports an error
--quantMode TranscriptomeSAM GeneCounts    # TranscriptomeSAM also writes the alignments in transcript coordinates; GeneCounts writes per-gene read counts
--sjdbGTFfile /Data/lizexing/reference/h_45S_rDNA/U13369.1.gtf    # the matching annotation file
--twopassMode Basic    # the first pass aligns against the index; splice junctions newly found in that pass are then added to the index for a second pass. This gives more accurate alignments but costs more time and memory.
--outSAMtype BAM Unsorted    # write an unsorted BAM file; without this line only a SAM file is written
--outSAMunmapped None
--genomeDir /gpfs/home/fangy04/downloads/STAR_index/GRCh38/    # index directory
--readFilesIn /gpfs/home/fangy04/downloads/SRR8112732_1.fastq.gz /gpfs/home/fangy04/downloads/SRR8112732_2.fastq.gz    # the two FASTQ files
--outFileNamePrefix DRB_TT_seq_SRR8112732    # output file prefix
--outReadsUnmapped    # output of unmapped and partially mapped (i.e. mapped only one mate of a paired end read) reads in separate file(s). Fastx ... output in separate fasta/fastq files, Unmapped.out.mate1/2
--outSAMunmapped    # output of unmapped reads in the SAM format

Output files:
9216920116 Jun 28 17:06 DRB_TT_seq_SRR8112732Aligned.out.bam    # the most important file; used downstream for duplicate removal and sorting
1166235552 Jun 28 17:06 DRB_TT_seq_SRR8112732Aligned.toTranscriptome.out.bam    # BAM file of the reads aligned to transcripts
2034 Jun 28 17:06 DRB_TT_seq_SRR8112732Log.final.out
20188 Jun 28 17:06 DRB_TT_seq_SRR8112732Log.out
2571 Jun 28 17:06 DRB_TT_seq_SRR8112732Log.progress.out
1585521 Jun 28 17:06 DRB_TT_seq_SRR8112732ReadsPerGene.out.tab
6732305 Jun 28 17:06 DRB_TT_seq_SRR8112732SJ.out.tab    # splice junction information
8192 Jun 28 16:51 DRB_TT_seq_SRR8112732_STARgenome
8192 Jun 28 16:51 DRB_TT_seq_SRR8112732_STARpass1
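
Of the files listed above, Log.final.out holds the mapping summary, and it is usually the first thing to check after a run. A minimal sketch for pulling out the unique-mapping rate, assuming the usual `label | value` layout of that file; `uniquely_mapped_pct` is a hypothetical helper name:

```shell
# Hypothetical helper: print the "Uniquely mapped reads %" value
# from a STAR Log.final.out file.
uniquely_mapped_pct() {
  awk -F'|' '/Uniquely mapped reads %/ { gsub(/[ \t]/, "", $2); print $2 }' "$1"
}

# Usage (file name from the listing above):
#   uniquely_mapped_pct DRB_TT_seq_SRR8112732Log.final.out
```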

6. Align the three datasets to the 45S rRNA reference with STAR

1) Align the HeLa sequencing data

STAR --runThreadN 40 --runMode alignReads --readFilesCommand zcat --quantMode TranscriptomeSAM GeneCounts --sjdbGTFfile /Data/lizexing/reference/h_45S_rDNA/U13369.1.gtf --twopassMode Basic --outSAMtype BAM Unsorted --genomeDir /Data/lizexing/reference/h_45S_rDNA/ --readFilesIn /Data/lizexing/projects/xindi/Data/new/Data/CleanData/T_HeLa_T_PAR_CLIP_Clean_Data1_val_1.fq.gz /Data/lizexing/projects/xindi/Data/new/Data/CleanData/T_HeLa_T_PAR_CLIP_Clean_Data2_val_2.fq.gz --outFileNamePrefix HeLa-val --outReadsUnmapped Fastx

STAR --runThreadN 40 --runMode alignReads --readFilesCommand zcat --quantMode TranscriptomeSAM GeneCounts --sjdbGTFfile /Data/lizexing/reference/h_45S_rDNA/U13369.1.gtf --twopassMode Basic --outSAMtype BAM Unsorted --genomeDir /Data/lizexing/reference/h_45S_rDNA/ --readFilesIn /Data/lizexing/projects/xindi/2021_11_16/CleanData/GFP_HeLa_Clean_Data1_val_1.fq.gz /Data/lizexing/projects/xindi/2021_11_16/CleanData/GFP_HeLa_Clean_Data2_val_2.fq.gz --outFileNamePrefix GFP_HeLa_val --outReadsUnmapped Fastx

STAR --runThreadN 40 --runMode alignReads --readFilesCommand zcat --quantMode TranscriptomeSAM GeneCounts --sjdbGTFfile /Data/lizexing/reference/h_45S_rDNA/U13369.1.gtf --twopassMode Basic --outSAMtype BAM Unsorted --genomeDir /Data/lizexing/reference/h_45S_rDNA/ --readFilesIn /Data/lizexing/projects/xindi/2021_11_16/CleanData/Input_HeLa_Clean_Data1_val_1.fq.gz /Data/lizexing/projects/xindi/2021_11_16/CleanData/Input_HeLa_Clean_Data2_val_2.fq.gz --outFileNamePrefix Input_HeLa_val --outReadsUnmapped Fastx

2) Align the HCT116 sequencing data

STAR --runThreadN 40 --runMode alignReads --readFilesCommand zcat --quantMode TranscriptomeSAM GeneCounts --sjdbGTFfile /Data/lizexing/reference/h_45S_rDNA/U13369.1.gtf --twopassMode Basic --outSAMtype BAM Unsorted --genomeDir /Data/lizexing/reference/h_45S_rDNA/ --readFilesIn /Data/lizexing/projects/xindi/Data/new/Data/CleanData/T_HCT116_T_PAR_CLIP_Clean_Data1_val_1.fq.gz /Data/lizexing/projects/xindi/Data/new/Data/CleanData/T_HCT116_T_PAR_CLIP_Clean_Data2_val_2.fq.gz --outFileNamePrefix HCT116-val --outReadsUnmapped Fastx

STAR --runThreadN 40 --runMode alignReads --readFilesCommand zcat --quantMode TranscriptomeSAM GeneCounts --sjdbGTFfile /Data/lizexing/reference/h_45S_rDNA/U13369.1.gtf --twopassMode Basic --outSAMtype BAM Unsorted --genomeDir /Data/lizexing/reference/h_45S_rDNA/ --readFilesIn /Data/lizexing/projects/xindi/2021_11_16/CleanData/GFP_HCT116_Clean_Data1_val_1.fq.gz /Data/lizexing/projects/xindi/2021_11_16/CleanData/GFP_HCT116_Clean_Data2_val_2.fq.gz --outFileNamePrefix GFP_HCT116_val --outReadsUnmapped Fastx

STAR --runThreadN 40 --runMode alignReads --readFilesCommand zcat --quantMode TranscriptomeSAM GeneCounts --sjdbGTFfile /Data/lizexing/reference/h_45S_rDNA/U13369.1.gtf --twopassMode Basic --outSAMtype BAM Unsorted --genomeDir /Data/lizexing/reference/h_45S_rDNA/ --readFilesIn /Data/lizexing/projects/xindi/2022_03_05/TreatData/Input/CleanData/Input_HCT116_Clean_Data1_val_1.fq.gz /Data/lizexing/projects/xindi/2022_03_05/TreatData/Input/CleanData/Input_HCT116_Clean_Data2_val_2.fq.gz --outFileNamePrefix Input_HCT116_val --outReadsUnmapped Fastx

3) Align the 293T sequencing data

STAR --runThreadN 40 --runMode alignReads --readFilesCommand zcat --quantMode TranscriptomeSAM GeneCounts --sjdbGTFfile /Data/lizexing/reference/h_45S_rDNA/U13369.1.gtf --twopassMode Basic --outSAMtype BAM Unsorted --genomeDir /Data/lizexing/reference/h_45S_rDNA/ --readFilesIn /Data/lizexing/projects/xindi/Data/new/Data/CleanData/T_293T_T_PAR_CLIP_Clean_Data1_val_1.fq.gz /Data/lizexing/projects/xindi/Data/new/Data/CleanData/T_293T_T_PAR_CLIP_Clean_Data2_val_2.fq.gz --outFileNamePrefix 293T-val --outReadsUnmapped Fastx

STAR --runThreadN 40 --runMode alignReads --readFilesCommand zcat --quantMode TranscriptomeSAM GeneCounts --sjdbGTFfile /Data/lizexing/reference/h_45S_rDNA/U13369.1.gtf --twopassMode Basic --outSAMtype BAM Unsorted --genomeDir /Data/lizexing/reference/h_45S_rDNA/ --readFilesIn /Data/lizexing/projects/xindi/2021_11_16/CleanData/GFP_293T_Clean_Data1_val_1.fq.gz /Data/lizexing/projects/xindi/2021_11_16/CleanData/GFP_293T_Clean_Data2_val_2.fq.gz --outFileNamePrefix GFP_293T_val --outReadsUnmapped Fastx

STAR --runThreadN 40 --runMode alignReads --readFilesCommand zcat --quantMode TranscriptomeSAM GeneCounts --sjdbGTFfile /Data/lizexing/reference/h_45S_rDNA/U13369.1.gtf --twopassMode Basic --outSAMtype BAM Unsorted --genomeDir /Data/lizexing/reference/h_45S_rDNA/ --readFilesIn /Data/lizexing/projects/xindi/2021_11_16/CleanData/Input_293T_Clean_Data1_val_1.fq.gz /Data/lizexing/projects/xindi/2021_11_16/CleanData/Input_293T_Clean_Data2_val_2.fq.gz --outFileNamePrefix Input_293T_val --outReadsUnmapped Fastx

8. Align the unmapped reads from the three datasets to GRCh38.ncrna with STAR

1) Align the HeLa sequencing data

STAR --runThreadN 40 --runMode alignReads --twopassMode Basic --outSAMtype BAM Unsorted --genomeDir /Data/lizexing/reference/Ensembl_GRCh38/star_ncrna_index/ --readFilesIn /Data/lizexing/projects/xindi/2022_03_05/TreatData/TopBP/HeLa/45SRNA/HeLa-valUnmapped.out.mate1 /Data/lizexing/projects/xindi/2022_03_05/TreatData/TopBP/HeLa/45SRNA/HeLa-valUnmapped.out.mate2 --outFileNamePrefix HeLa_ncrna_val --outReadsUnmapped Fastx

STAR --runThreadN 40 --runMode alignReads --twopassMode Basic --outSAMtype BAM Unsorted --genomeDir /Data/lizexing/reference/Ensembl_GRCh38/star_ncrna_index/ --readFilesIn /Data/lizexing/projects/xindi/2022_03_05/TreatData/GFP/HeLa/45SRNA/GFP_HeLa_valUnmapped.out.mate1 /Data/lizexing/projects/xindi/2022_03_05/TreatData/GFP/HeLa/45SRNA/GFP_HeLa_valUnmapped.out.mate2 --outFileNamePrefix HeLa_ncrna_val --outReadsUnmapped Fastx

2) Align the HCT116 sequencing data

STAR --runThreadN 40 --runMode alignReads --twopassMode Basic --outSAMtype BAM Unsorted --genomeDir /Data/lizexing/reference/Ensembl_GRCh38/star_ncrna_index/ --readFilesIn /Data/lizexing/projects/xindi/2022_03_05/TreatData/TopBP/HCT116/45SRNA/HCT116-valUnmapped.out.mate1 /Data/lizexing/projects/xindi/2022_03_05/TreatData/TopBP/HCT116/45SRNA/HCT116-valUnmapped.out.mate2 --outFileNamePrefix HCT116_ncrna_val --outReadsUnmapped Fastx

STAR --runThreadN 40 --runMode alignReads --twopassMode Basic --outSAMtype BAM Unsorted --genomeDir /Data/lizexing/reference/Ensembl_GRCh38/star_ncrna_index/ --readFilesIn /Data/lizexing/projects/xindi/2022_03_05/TreatData/GFP/HCT116/45SRNA/GFP_HCT116_valUnmapped.out.mate1 /Data/lizexing/projects/xindi/2022_03_05/TreatData/GFP/HCT116/45SRNA/GFP_HCT116_valUnmapped.out.mate2 --outFileNamePrefix HCT116_ncrna_val --outReadsUnmapped Fastx

STAR --runThreadN 40 --runMode alignReads --twopassMode Basic --outSAMtype BAM Unsorted --genomeDir /Data/lizexing/reference/Ensembl_GRCh38/star_ncrna_index/ --readFilesIn /Data/lizexing/projects/xindi/2022_03_05/TreatData/Input/HCT116/45SRNA/Input_HCT116_valUnmapped.out.mate1 /Data/lizexing/projects/xindi/2022_03_05/TreatData/Input/HCT116/45SRNA/Input_HCT116_valUnmapped.out.mate2 --outFileNamePrefix HCT116_ncrna_val --outReadsUnmapped Fastx

3) Align the 293T sequencing data

STAR --runThreadN 40 --runMode alignReads --twopassMode Basic --outSAMtype BAM Unsorted --genomeDir /Data/lizexing/reference/Ensembl_GRCh38/star_ncrna_index/ --readFilesIn /Data/lizexing/projects/xindi/2022_03_05/TreatData/TopBP/293T/45SRNA/293T-valUnmapped.out.mate1 /Data/lizexing/projects/xindi/2022_03_05/TreatData/TopBP/293T/45SRNA/293T-valUnmapped.out.mate2 --outFileNamePrefix 293T_ncrna_val --outReadsUnmapped Fastx

STAR --runThreadN 40 --runMode alignReads --twopassMode Basic --outSAMtype BAM Unsorted --genomeDir /Data/lizexing/reference/Ensembl_GRCh38/star_ncrna_index/ --readFilesIn /Data/lizexing/projects/xindi/2022_03_05/TreatData/GFP/293T/45SRNA/GFP_293T_valUnmapped.out.mate1 /Data/lizexing/projects/xindi/2022_03_05/TreatData/GFP/293T/45SRNA/GFP_293T_valUnmapped.out.mate2 --outFileNamePrefix 293T_ncrna_val --outReadsUnmapped Fastx

STAR --runThreadN 40 --runMode alignReads --twopassMode Basic --outSAMtype BAM Unsorted --genomeDir /Data/lizexing/reference/Ensembl_GRCh38/star_ncrna_index/ --readFilesIn /Data/lizexing/projects/xindi/2022_03_05/TreatData/Input/293T/45SRNA/Input_293T_valUnmapped.out.mate1 /Data/lizexing/projects/xindi/2022_03_05/TreatData/Input/293T/45SRNA/Input_293T_valUnmapped.out.mate2 --outFileNamePrefix 293T_ncrna_val --outReadsUnmapped Fastx
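
Several commands in this step share one `--outFileNamePrefix` per cell line (for example, the TopBP and GFP HeLa runs both use `HeLa_ncrna_val`). If they are launched from the same working directory, the later run overwrites the earlier output; whether the original runs used separate directories is not recorded here. One way to sidestep the question is to derive the prefix from the input file name. The sketch below is a dry run that prints the command; `ncrna_prefix` and `star_ncrna_cmd` are hypothetical helper names, and the `_ncrna_` suffix is an arbitrary choice.

```shell
# Hypothetical helper: derive a per-sample prefix from an Unmapped.out.mate1 path,
# e.g. .../GFP_HeLa_valUnmapped.out.mate1 -> GFP_HeLa_val_ncrna_
ncrna_prefix() (
  base=$(basename "$1")
  printf '%s_ncrna_\n' "${base%Unmapped.out.mate1}"
)

# Dry run: print the STAR command for one sample instead of executing it.
# The mate2 path is assumed to sit next to mate1 with the same stem.
star_ncrna_cmd() (
  m1=$1
  m2="${m1%mate1}mate2"
  printf 'STAR --runThreadN 40 --runMode alignReads --twopassMode Basic --outSAMtype BAM Unsorted --genomeDir %s --readFilesIn %s %s --outFileNamePrefix %s --outReadsUnmapped Fastx\n' \
    /Data/lizexing/reference/Ensembl_GRCh38/star_ncrna_index/ "$m1" "$m2" "$(ncrna_prefix "$m1")"
)

star_ncrna_cmd /Data/lizexing/projects/xindi/2022_03_05/TreatData/GFP/HeLa/45SRNA/GFP_HeLa_valUnmapped.out.mate1
```

The same idea carries over to the GRCh38.cds.all alignments by swapping the index directory and using a `_cds_` suffix.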

9. Align the unmapped reads from the three datasets to GRCh38.cds.all with STAR

1) Align the HeLa sequencing data

STAR --runThreadN 40 --runMode alignReads --twopassMode Basic --outSAMtype BAM Unsorted --genomeDir /Data/lizexing/reference/Ensembl_GRCh38/star_cds_all_index/ --readFilesIn /Data/lizexing/projects/xindi/2022_03_05/TreatData/TopBP/HeLa/45SRNA/HeLa-valUnmapped.out.mate1 /Data/lizexing/projects/xindi/2022_03_05/TreatData/TopBP/HeLa/45SRNA/HeLa-valUnmapped.out.mate2 --outFileNamePrefix HeLa_cds_val --outReadsUnmapped Fastx

STAR --runThreadN 40 --runMode alignReads --twopassMode Basic --outSAMtype BAM Unsorted --genomeDir /Data/lizexing/reference/Ensembl_GRCh38/star_cds_all_index/ --readFilesIn /Data/lizexing/projects/xindi/2022_03_05/TreatData/GFP/HeLa/45SRNA/GFP_HeLa_valUnmapped.out.mate1 /Data/lizexing/projects/xindi/2022_03_05/TreatData/GFP/HeLa/45SRNA/GFP_HeLa_valUnmapped.out.mate2 --outFileNamePrefix HeLa_cds_val --outReadsUnmapped Fastx

2) Align the HCT116 sequencing data

STAR --runThreadN 40 --runMode alignReads --twopassMode Basic --outSAMtype BAM Unsorted --genomeDir /Data/lizexing/reference/Ensembl_GRCh38/star_cds_all_index/ --readFilesIn /Data/lizexing/projects/xindi/2022_03_05/TreatData/TopBP/HCT116/45SRNA/HCT116-valUnmapped.out.mate1 /Data/lizexing/projects/xindi/2022_03_05/TreatData/TopBP/HCT116/45SRNA/HCT116-valUnmapped.out.mate2 --outFileNamePrefix HCT116_cds_val --outReadsUnmapped Fastx

STAR --runThreadN 40 --runMode alignReads --twopassMode Basic --outSAMtype BAM Unsorted --genomeDir /Data/lizexing/reference/Ensembl_GRCh38/star_cds_all_index/ --readFilesIn /Data/lizexing/projects/xindi/2022_03_05/TreatData/GFP/HCT116/45SRNA/GFP_HCT116_valUnmapped.out.mate1 /Data/lizexing/projects/xindi/2022_03_05/TreatData/GFP/HCT116/45SRNA/GFP_HCT116_valUnmapped.out.mate2 --outFileNamePrefix HCT116_cds_val --outReadsUnmapped Fastx

STAR --runThreadN 40 --runMode alignReads --twopassMode Basic --outSAMtype BAM Unsorted --genomeDir /Data/lizexing/reference/Ensembl_GRCh38/star_cds_all_index/ --readFilesIn /Data/lizexing/projects/xindi/2022_03_05/TreatData/Input/HCT116/45SRNA/Input_HCT116_valUnmapped.out.mate1 /Data/lizexing/projects/xindi/2022_03_05/TreatData/Input/HCT116/45SRNA/Input_HCT116_valUnmapped.out.mate2 --outFileNamePrefix HCT116_cds_val --outReadsUnmapped Fastx

3) Align the 293T sequencing data

STAR --runThreadN 40 --runMode alignReads --twopassMode Basic --outSAMtype BAM Unsorted --genomeDir /Data/lizexing/reference/Ensembl_GRCh38/star_cds_all_index/ --readFilesIn /Data/lizexing/projects/xindi/2022_03_05/TreatData/TopBP/293T/45SRNA/293T-valUnmapped.out.mate1 /Data/lizexing/projects/xindi/2022_03_05/TreatData/TopBP/293T/45SRNA/293T-valUnmapped.out.mate2 --outFileNamePrefix 293T_cds_val --outReadsUnmapped Fastx

STAR --runThreadN 40 --runMode alignReads --twopassMode Basic --outSAMtype BAM Unsorted --genomeDir /Data/lizexing/reference/Ensembl_GRCh38/star_cds_all_index/ --readFilesIn /Data/lizexing/projects/xindi/2022_03_05/TreatData/GFP/293T/45SRNA/GFP_293T_valUnmapped.out.mate1 /Data/lizexing/projects/xindi/2022_03_05/TreatData/GFP/293T/45SRNA/GFP_293T_valUnmapped.out.mate2 --outFileNamePrefix 293T_cds_val --outReadsUnmapped Fastx

STAR --runThreadN 40 --runMode alignReads --twopassMode Basic --outSAMtype BAM Unsorted --genomeDir /Data/lizexing/reference/Ensembl_GRCh38/star_cds_all_index/ --readFilesIn /Data/lizexing/projects/xindi/2022_03_05/TreatData/Input/293T/45SRNA/Input_293T_valUnmapped.out.mate1 /Data/lizexing/projects/xindi/2022_03_05/TreatData/Input/293T/45SRNA/Input_293T_valUnmapped.out.mate2 --outFileNamePrefix 293T_cds_val --outReadsUnmapped Fastx

10. Summarize reads for the three datasets with featureCounts

featureCounts -T 32 -a /Data/lizexing/reference/h_45S_rDNA/U13369.1.gtf -p -B -C -f -t exon -g gene_id \
-o /Data/lizexing/projects/xindi/2022_03_05/TreatData/GFP/HeLa/45SRNA/HeLA_val.read.count /Data/lizexing/projects/xindi/2022_03_05/TreatData/GFP/HeLa/45SRNA/GFP_HeLa_valAligned.out.bam.sort
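
featureCounts writes a tab-separated table to the `-o` file: a `#` comment line, a header row starting with `Geneid`, then one row per feature with the counts in the columns after `Length`. A minimal sketch for extracting just the identifier and the count, assuming a single input BAM so that the count sits in column 7; `counts_only` is a hypothetical helper name:

```shell
# Hypothetical helper: print "feature id <TAB> count" from a featureCounts table.
# Skips the leading "#" comment line and the header row; assumes one input BAM,
# so the count is in column 7.
counts_only() {
  awk -F'\t' '!/^#/ && $1 != "Geneid" { print $1 "\t" $7 }' "$1"
}

# Usage (output file name from the command above):
#   counts_only HeLA_val.read.count
```

Depending on the featureCounts version, `-p` alone may only declare the input as paired-end, with `--countReadPairs` needed to count fragments rather than reads; the program's help text for the installed version is the place to confirm this.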






Added: 2022-03-08 22:45:36  Updated: 2022-03-08 22:46:58
 