Nantes Université

Commit 36c8902d authored by Adrien Leger
update readme and docstrings

parent 600e2f27
@@ -20,7 +20,7 @@ class FastqSeq (object):
     """
     Simple Representation of a fastq file. The object supports slicing and addition operations
     The quality score is a numpy array to facilitate further data manipulation
-    Only works with illumina 1.8+ Phred 33 quality encoding
+    Only works with illumina 1.8+ Phred +33 quality encoding
     """
     #~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~#
@@ -42,15 +42,12 @@ class FastqSeq (object):
         if type(qual) == str:
             self.qual = np.array([ord(x)-33 for x in qual])
-            print ("Str type")
         elif type(qual) == np.ndarray:
             self.qual = qual
-            print ("ndarray type")
         elif type(qual) == list:
             self.qual = np.array(qual)
-            print ("list type")
         else:
             raise TypeError("qual is not a valid type : str, numpy.ndarray or list of int")
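
The branch above accepts three input types for `qual` and stores an integer array in every case. As a minimal standalone sketch of that dispatch (the helper name `normalize_qual` is hypothetical and not part of the repository, and it uses `isinstance` rather than the exact `type(...) ==` comparison shown in the diff, so subclasses are accepted as well):

```python
import numpy as np

def normalize_qual(qual):
    """Hypothetical helper: return Phred scores as a numpy integer array."""
    if isinstance(qual, str):
        # Phred +33 string: subtract the ASCII offset from each character
        return np.array([ord(c) - 33 for c in qual])
    if isinstance(qual, np.ndarray):
        # Already an array of Phred scores: returned unchanged (no copy)
        return qual
    if isinstance(qual, list):
        # List of integer Phred scores: converted to an array
        return np.array(qual)
    raise TypeError("qual is not a valid type : str, numpy.ndarray or list of int")

# Example call with a short Phred +33 quality string
scores = normalize_qual("II5!")
print(scores.tolist())
```

Because the array branch returns the caller's object without copying, later in-place changes to that array would also be visible through the stored quality; calling `qual.copy()` there would avoid this, at the cost of an extra allocation.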
@@ -12,11 +12,18 @@ When the file is empty the generator raise a StopIteration exception indicating
 Any part of the sequence name following a blank space will be removed
 ## FastqSeq : Simple object representing a Fastq sequence
-FastqSeq is a simple python object class generating a object representing a fastq sequence. The object is initialised with a name, a DNA sequence, an **illumina 1.8+ Phred +33** encoded quality sequence (same size than the DNA sequence) and eventually a short text description. After creation the object has the following fields:
+FastqSeq is a simple Python class representing a Fastq sequence. The object is initialized with:
+* A name for the sequence, without @
+* A DNA sequence as a string
+* A quality score, given as an **illumina 1.8+ Phred +33** encoded quality string, or as a numpy.ndarray or Python list of integer Phred scores
+* Optionally, a short text description
+The DNA sequence and the quality score must have the same length, otherwise an assertion error is raised
+After creation the object has the following fields:
 * name = name of the sequence without @.
 * seq = The DNA sequence of the fastq sequence, stored as a simple string.
-* qual = An numpy integer array representing the Phred Quality of bases (support all np.array methods)
+* qual = A numpy integer ndarray representing the Phred quality of bases (supports all numpy.ndarray methods)
 * descr = A description of the fastq sequence.
 * qualstr = A string of characters corresponding to the sequence quality in illumina 1.8+ Phred +33 encoding
 * fastqstr = The sequence formatted as a fastq string; the field "descr" is included in the output fastq sequence name after a space if present
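
To illustrate how the `qualstr` and `fastqstr` fields relate to `name`, `seq`, `qual` and `descr`, here is a minimal sketch. The function names `qual_to_str` and `to_fastq_str` are hypothetical, and the four-line record layout (`@name`, sequence, `+`, quality) is assumed from the usual fastq convention rather than taken from the repository's implementation:

```python
import numpy as np

def qual_to_str(qual):
    """Encode integer Phred scores as a Phred +33 quality string."""
    return "".join(chr(int(q) + 33) for q in qual)

def to_fastq_str(name, seq, qual, descr=""):
    """Hypothetical formatter: build a four-line fastq record."""
    # The sequence and its quality scores must have the same length
    assert len(seq) == len(qual), "seq and qual must have the same length"
    # The description, if present, follows the name after a single space
    header = "@" + name + (" " + descr if descr else "")
    return "{}\n{}\n+\n{}\n".format(header, seq, qual_to_str(qual))

record = to_fastq_str("read1", "ACGT", np.array([40, 40, 20, 0]), descr="sample A")
print(record)
```

Keeping the quality as an integer array and encoding it only on output means numeric operations (mean quality, trimming thresholds) work directly on `qual` without repeated string conversion.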