
About

Documentation

MotifScope is a web server application built with Flask and written in Python. The source code is freely available on our GitHub page.
The application can also be cloned and run on your local computer.
MotifScope is under active development. Please do not hesitate to reach out to us with questions, comments, feedback, or collaboration opportunities.

The MotifScope preprint is available here. Go and check it out!

Instructions:

  • Please upload a FASTA file containing the sequences to be analyzed. The web server limits the input to 1500 sequences per FASTA file, a maximum length of 15 kb per sequence, and a maximum file size of 4 MB. If you need to perform larger analyses, please install the local version.
  • Samples can be annotated with a class label or population label by uploading a TSV file.
    • For the TSV file, the first column should contain the sample name and the second column should contain the class label, e.g. HG002\tCEU.
    • If the first line contains field names, the name of the first column must be 'sample'. The name of the second column is used for annotation in the figure. When no header is given, the default annotation label is 'population'.
    • For the FASTA file, the header should be in the format >sample_name#hap_number#comment, e.g. >HG002#1#chr1:100-200.
  • Clustering of reads can be disabled by selecting "no" for "Cluster sequences". This also removes the dendrogram from the output, which can be useful when only one or two sequences are analyzed.
  • To run multiple sequence alignment on the compressed representation of the sequence, select "on motifs" for "multiple sequence alignment" to align based on motifs, or select "on nucleotides" to align based on nucleotides.
  • To run the algorithm with a set of known motifs (e.g. when only a single sample is analyzed), upload a file containing the reference motifs separated by tabs.
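
Before uploading, it can help to check your input files locally against the limits and formats described above. The following is a minimal sketch and is not part of MotifScope: the function names (parse_fasta_header, check_fasta, read_labels) are hypothetical, and it assumes that 15 kb means 15,000 bases, that 4 MB means 4 × 1024 × 1024 bytes, and that a TSV header line is recognized by its first column being 'sample'.

```python
import csv
import io

# Limits quoted in the instructions; the exact byte/base conversions are assumptions.
MAX_SEQUENCES = 1500
MAX_SEQ_LEN = 15_000               # 15 kb, assuming 1 kb = 1000 bases
MAX_FILE_BYTES = 4 * 1024 * 1024   # 4 MB, assuming binary megabytes


def parse_fasta_header(line):
    """Split '>sample_name#hap_number#comment' into its three parts."""
    parts = line[1:].strip().split("#", 2)
    if len(parts) != 3 or not parts[0]:
        raise ValueError(f"header not in sample#hap#comment format: {line!r}")
    return tuple(parts)


def check_fasta(text):
    """Return a list of (header_parts, sequence_length); raise on any violation."""
    if len(text.encode("utf-8")) > MAX_FILE_BYTES:
        raise ValueError("file is larger than 4 MB")
    records = []  # each entry is [header_parts, length]
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            records.append([parse_fasta_header(line), 0])
        elif not records:
            raise ValueError("sequence data found before the first header")
        else:
            records[-1][1] += len(line)
    if len(records) > MAX_SEQUENCES:
        raise ValueError("more than 1500 sequences")
    for header, length in records:
        if length > MAX_SEQ_LEN:
            raise ValueError(f"sequence {header[0]} is longer than 15 kb")
    return [(header, length) for header, length in records]


def read_labels(text):
    """Return (annotation_name, {sample: label}) from a two-column TSV."""
    rows = [r for r in csv.reader(io.StringIO(text), delimiter="\t") if r]
    if any(len(r) < 2 for r in rows):
        raise ValueError("every line needs a sample name and a class label")
    annotation = "population"  # default when no header line is given
    if rows and rows[0][0] == "sample":
        annotation = rows[0][1]
        rows = rows[1:]
    return annotation, {r[0]: r[1] for r in rows}
```

For example, check_fasta(">HG002#1#chr1:100-200\nACGT\n") accepts the record, and read_labels("HG002\tCEU\n") falls back to the default annotation name because no header line is present.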

Log history:
