grelu.transforms.seq_transforms

Classes to assign each sequence a score based on its content.

Classes

PatternScore

A class that returns a weighted score based on the number of occurrences of given subsequences.

MotifScore

A scorer that returns a weighted score based on the number of occurrences (binding sites) of given motifs.

Module Contents

class grelu.transforms.seq_transforms.PatternScore(patterns: List[str], weights: List[float])

A class that returns a weighted score based on the number of occurrences of given subsequences.

Parameters:
  • patterns – List of subsequences

  • weights – List of weights for each subsequence. If None, all patterns will receive a weight of 1.

patterns
weights
forward(seqs: List[str]) → List[float]

Compute scores.

Parameters:

seqs – A list of input sequences as strings.
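The weighted count described above can be sketched in plain Python. This is only an illustration under stated assumptions, not the library's implementation: the function name `pattern_score_sketch` is hypothetical, and the use of `str.count` (which counts non-overlapping matches) is an assumption, since this page does not say how overlapping occurrences are handled.

```python
from typing import List, Optional


def pattern_score_sketch(
    seqs: List[str],
    patterns: List[str],
    weights: Optional[List[float]] = None,
) -> List[float]:
    """Hypothetical stand-in for a weighted pattern count; not the gReLU code."""
    # Mirror the documented default: no weight list means a weight of 1 per pattern.
    if weights is None:
        weights = [1.0] * len(patterns)
    if len(weights) != len(patterns):
        raise ValueError("patterns and weights must have the same length")

    scores = []
    for seq in seqs:
        # Assumption: str.count counts non-overlapping matches only.
        total = sum(w * seq.count(p) for p, w in zip(patterns, weights))
        scores.append(float(total))
    return scores


# "AATAAGC" contains "AA" twice and "GC" once: 2 * 1.0 + 1 * 0.5 = 2.5
# "GCGC" contains "GC" twice and no "AA":      2 * 0.5 = 1.0
print(pattern_score_sketch(["AATAAGC", "GCGC"], ["AA", "GC"], [1.0, 0.5]))
```

A negative weight penalizes a pattern, so the same scheme can reward some subsequences while discouraging others.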

__call__(seqs: List[str]) → List[float]

class grelu.transforms.seq_transforms.MotifScore(motifs: str | Dict[str, numpy.ndarray] = None, names: List[str] | None = None, weights: List[float] | None = None, pthresh: float = 0.001, rc: bool = True)

A scorer that returns a weighted score based on the number of occurrences (binding sites) of given motifs.

Parameters:
  • motifs – Either the path to a MEME file, or a dictionary whose values are numpy arrays of shape (4, L).

  • names – List of names of motifs to read from the MEME file. If None, all motifs will be read from the file.

  • weights – List of weights for each motif. If None, all motifs will receive a weight of 1.

  • pthresh – p-value cutoff used to define binding sites.

  • rc – Whether to also scan the reverse complement of each sequence.

motifs
pthresh
rc
forward(seqs: List[str]) → List[float]

Compute scores.

Parameters:

seqs – A list of input sequences as strings.
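How `weights` and `rc` combine can be illustrated with a deliberately simplified sketch. Real motif scanning scores each window against a position weight matrix and keeps sites that pass `pthresh`; the code below replaces that step with exact matching of a consensus string, which is an assumption made purely for illustration. All function names here are hypothetical, and whether the library counts a palindromic site once or twice is not stated on this page.

```python
from typing import Dict, List, Optional

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def count_sites(seq: str, consensus: str) -> int:
    """Count exact matches of consensus in seq, including overlapping ones."""
    k = len(consensus)
    return sum(seq[i:i + k] == consensus for i in range(len(seq) - k + 1))


def motif_score_sketch(
    seqs: List[str],
    consensus_by_name: Dict[str, str],
    weights: Optional[List[float]] = None,
    rc: bool = True,
) -> List[float]:
    """Hypothetical stand-in: weighted site counts, optionally on both strands."""
    names = list(consensus_by_name)
    # Mirror the documented default: no weight list means a weight of 1 per motif.
    if weights is None:
        weights = [1.0] * len(names)

    scores = []
    for seq in seqs:
        total = 0.0
        for name, weight in zip(names, weights):
            consensus = consensus_by_name[name]
            n_sites = count_sites(seq, consensus)
            if rc:
                # Sites on the opposite strand show up in the reverse complement.
                n_sites += count_sites(reverse_complement(seq), consensus)
            total += weight * n_sites
        scores.append(total)
    return scores


seq = "GATAAGTTATC"  # "GATAA" on the forward strand, "TTATC" is its reverse complement
print(motif_score_sketch([seq], {"example_motif": "GATAA"}, rc=True))   # both strands
print(motif_score_sketch([seq], {"example_motif": "GATAA"}, rc=False))  # forward only
```

With `rc=True` the sketch finds one site on each strand; with `rc=False` only the forward-strand site contributes to the score.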

__call__(seqs: List[str]) → List[float]