Abstract:
Since the beginning of the Human Genome Project on October 1, 1990, genomics has unraveled the fundamental make-up of an organism, i.e., the complete set of DNA within a single cell. Many –omics branches, such as metagenomics and transcriptomics, have emerged since then, and these –omics approaches have brought with them the challenge of quality control of the data generated during the production of sequencing reads on different Next Generation Sequencing platforms. The main focus of the current study was the analysis of quality filtering measures in genomic and metagenomic datasets. This work discusses adapter filtration analysis on a metagenomic dataset and the comparison of the filtered data with the raw dataset. A comparison was also made among the Ap1 immune-compromised mice dataset, a non-immune-compromised mice dataset, and a reference human dataset. Then, two quality control tools, PRINSEQ and FAQs, were compared, and selected features of these tools were integrated into a pipeline. The pipeline was then validated on genomic and metagenomic datasets.