The name of the alphabet symbol.

The frequency of the alphabet symbol as defined by the background model.

SpaMo outputs a tab-separated values (TSV) file ('spamo.tsv') that contains one line for each motif found to be significantly enriched. The lines are grouped by secondary motif and sorted in order of decreasing statistical significance. The first line in the file contains the (tab-separated) names of the fields. Your command line is given at the end of the file in a comment line starting with the character '#'. The names and meanings of each of the fields are described in the table below.

field name contents
1 prim_db The name of a file of motifs ("motif database file") that contains the primary motif.
2 prim_id The name of the primary motif, which is unique in the motif database file.
3 prim_alt An alternate name for the primary motif that may be provided in the motif database file.
4 prim_cons A consensus sequence computed from the primary motif (as described below).
5 sec_db The name of a file of motifs ("motif database file") that contains the secondary motif.
6 sec_id The name of the secondary motif, which is unique in the motif database file.
7 sec_alt An alternate name for the secondary motif that may be provided in the motif database file.
8 sec_cons A consensus sequence computed from the secondary motif (as described below).
9 trim_left Number of positions trimmed from left of secondary motif.
10 trim_right Number of positions trimmed from right of secondary motif.
If the next three fields are not blank, the motif is redundant with a more significant ('parent') motif.
11 red_db The name of a file of motifs ("motif database file") that contains the parent motif.
12 red_id The name of the parent motif, which is unique in the motif database file.
13 red_alt An alternate name for the parent motif that may be provided in the motif database file.
14 E-value The expected number of motifs that would have at least one spacing as enriched as the best spacing for this secondary motif. The E-value is the p-value of the best spacing multiplied by the number of motifs in the input database(s).
15 gap The distance between the edge of the primary and the (trimmed) secondary motif.
16 orient The combination of quadrants for which occurrences of this spacing are combined.
17 count The number of occurrences of the secondary motif with the given spacing and orientation to the primary motif.
18 total The total number of occurrences of the secondary motif within the margins around the best primary motif occurrence.
19 adj_p-value The p-value of the gap and orientation, adjusted for nine combinations of quadrants times the number of gaps tested (as controlled by the -range option).
20 p-value The p-value of the gap and orientation adjusted only for the number of gaps tested.
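
As a rough illustration of the file layout described above (a header line of tab-separated field names, one line per motif, and a trailing comment line starting with '#'), the following Python sketch reads such a file into dictionaries. The helper name `read_spamo_tsv` and the two-column sample text are made up for this example; a real 'spamo.tsv' carries all 20 fields.

```python
import csv
import io

def read_spamo_tsv(text):
    """Return one dict per data line, keyed by the header field names."""
    # Drop blank lines and '#' comment lines (such as the command line at the end).
    lines = [ln for ln in text.splitlines() if ln.strip() and not ln.startswith("#")]
    reader = csv.DictReader(io.StringIO("\n".join(lines)), delimiter="\t")
    return list(reader)

# Made-up excerpt for illustration only; the values are not real SpaMo output.
sample = (
    "sec_id\tE-value\n"
    "motifA\t1.5e-05\n"
    "motifB\t3.0e-02\n"
    "# spamo <hypothetical arguments>\n"
)

rows = read_spamo_tsv(sample)
# Keep the secondary motifs whose E-value is below an arbitrary cutoff.
significant = [r["sec_id"] for r in rows if float(r["E-value"]) < 1e-3]
print(significant)
```

To read an actual results file, the same function could be called on the contents of 'spamo.tsv', for example `read_spamo_tsv(open("spamo.tsv").read())`.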

A consensus sequence is constructed from each column in a motif's frequency matrix using the "50% rule" as follows:

  1. The letter frequencies in the column are sorted in decreasing order.
  2. Letters with frequency less than 50% of the maximum are discarded.
  3. The letter used in this position in the consensus sequence is determined by the first rule below that applies:
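
Steps 1 and 2 of the 50% rule can be sketched in Python as follows. This is only an illustration: the function name `candidate_letters` and the example frequencies are invented, and step 3 is not implemented because its rules are not listed here.

```python
def candidate_letters(column):
    """Apply steps 1 and 2 of the 50% rule to one motif column.

    `column` maps each letter to its frequency in that column.
    """
    # Step 1: sort the letters by decreasing frequency.
    ranked = sorted(column.items(), key=lambda item: item[1], reverse=True)
    max_freq = ranked[0][1]
    # Step 2: discard letters whose frequency is less than 50% of the maximum.
    return [letter for letter, freq in ranked if freq >= 0.5 * max_freq]

# Invented column: A is the most frequent letter and C is above half of A's frequency.
print(candidate_letters({"A": 0.50, "C": 0.30, "G": 0.15, "T": 0.05}))
```

The surviving letters are the input to step 3, which picks the single consensus letter for the position.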

The database of the primary motif.

The ID of the primary motif followed by the alternate ID in brackets if it has one.

The logo of the primary motif.

Sections of the motif with a gray background have been trimmed and were not used for scanning.

The number of secondary motifs found that had significant spacings in the tested region.

The list of secondary motifs found that had significant spacings in the tested region.

The name of the sequence database.

The last modified date of the sequence database.

The number of sequences in the sequence database.

The number of sequences in the sequence database which were excluded because they were shorter than twice the margin plus the primary motif length.

The number of sequences in the sequence database which were excluded because they contained large runs of ambiguous symbols (normally wildcard masking) that could bias the results.

The number of sequences in the sequence database which were excluded because no match to the primary motif could be found at a distance to the edges larger than the margin.

The number of sequences in the sequence database which were excluded because they were largely identical to other sequences when aligned on the primary motif site.

The number of sequences which were scanned with the secondary motifs.

The name of the motif database derived from the file name.

The date that the motif database was last modified.

The number of motifs loaded from the motif database. Some motifs may have been excluded.

The number of motifs with significant E-values whose significant spacings were not considered too similar to those of another motif.

The number of motifs that, while having significant spacings, were less significant than another motif that matched most of the same sites.

This checkbox ensures the row stays visible after a filter operation that would normally hide it.

The ID of the secondary motif.

The alternate name of the secondary motif.

The ID of the secondary motif followed by the alternate ID in brackets if it has one.

The name of the cluster to which this secondary motif belongs. SpaMo assigns each secondary motif to a cluster, and names the cluster after the motif in it with the most significant spacing. SpaMo assigns two secondary motifs to the same cluster if the matches in their most significant spacings (from the primary motif) overlap substantially. Clustering is controlled by the -joint and -overlap options.

The E-value is the lowest p-value of any spacing of the secondary motif times the number of secondary motifs. It estimates the expected number of random secondary motifs that would have the observed minimum p-value or less.
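
The calculation described above can be checked against the values reported on this page. The sketch below is an illustration under assumptions, not SpaMo's own code: the function name `secondary_e_value` is made up, the p-values are the four spacings listed for secondary motif 6-CTGCTCTTCCT, and the motif count assumes that all 3 + 6 motifs loaded from the two motif databases are counted.

```python
def secondary_e_value(spacing_p_values, num_secondary_motifs):
    """Lowest spacing p-value times the number of secondary motifs."""
    return min(spacing_p_values) * num_secondary_motifs

# p-values of the significant spacings reported for 6-CTGCTCTTCCT (STREME-6).
p_values = [2.71e-14, 3.37e-12, 3.37e-12, 4.07e-10]

# Assumption: 3 motifs from meme.xml plus 6 motifs from streme.xml.
e_value = secondary_e_value(p_values, 3 + 6)
print(f"{e_value:.2e}")  # 2.44e-13
```

Under these assumptions the result agrees with the E-value of 2.44e-13 reported for this secondary motif.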

The gap between the primary and secondary motifs for the most significant spacing.

The strand and position of the secondary motif relative to the primary motif for the most significant spacing.

The minimum score accepted as a match to either the primary or secondary motif. This value can greatly affect the results of SpaMo. If it is too high, there will be no matches to the primary motif. If too low, sequences with non-significant matches to the primary and/or secondary motif will reduce the effectiveness of the spacing analysis.

The distance on either side of the primary motif site that makes up the region that can contain the secondary motif site. It is also the minimum gap between the primary motif site and the edge of the sequence. These constraints mean that input sequences shorter than the trimmed length of the primary motif plus two times the margin size cannot be used by SpaMo.
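
The length constraint above amounts to a one-line calculation. In the sketch below, the function names are hypothetical, the margin of 150 is the value listed under Settings, and the trimmed primary motif length of 15 is an assumption (the width of the consensus SYTYWAATCCCAGCA with no positions trimmed).

```python
def min_usable_length(trimmed_primary_length, margin):
    """Shortest sequence that leaves a full margin on both sides of the primary site."""
    return trimmed_primary_length + 2 * margin

def is_long_enough(sequence_length, trimmed_primary_length, margin):
    """True if a sequence of this length is not excluded as too short."""
    return sequence_length >= min_usable_length(trimmed_primary_length, margin)

# Assumed values: trimmed primary length 15, margin 150.
print(min_usable_length(15, 150))    # 315
print(is_long_enough(314, 15, 150))  # False
print(is_long_enough(315, 15, 150))  # True
```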

A histogram showing the counts for the orientation with the best spacing.

The significant spacings are highlighted in red.

The primary motif is used as the reference point for all spacing calculations.

Sections of the motif with a gray background have been trimmed and were not used for scanning.

The secondary motif occurs at the spacings relative to the primary shown in the histogram below.

Sections of the motif with a gray background have been trimmed and were not used for scanning.

The regions matching the secondary motif in the sequences with the given spacing are used to construct a motif. The logo for this "inferred" motif is shown aligned with that of the actual secondary motif.

The inferred secondary motif logo should closely resemble that of the secondary motif. If it does not, the observed spacing may actually be due to the enrichment of a motif that differs from the secondary motif.

You can download the inferred secondary motif by moving the mouse cursor over the logo and clicking "Download as MEME motif". You can then use this downloaded motif as an input to Tomtom to see what other known motifs it may resemble.

These are the sequence logos created by aligning all of the sequences with the significant motif spacing. Alignments are centered on the match to the primary motif and done separately for each of the quadrants that contribute to the significant spacing. The logos extend in both directions (up to) 10 positions past the maximum region considered in the significance tests.

Note 1: If you don't see the complete logo(s), you can use the scroll bar underneath the Alignment window. If you don't see a scroll bar and are on a Mac, you can turn on scroll bars by clicking on the Apple icon at the top left of your screen and selecting: System Preferences/General/Show scroll bars/Always.

Note 2: These logos are useful for detecting cases where highly similar regions (such as DNA repeats) are present among the sequences with the significant motif spacing. Such cases may indicate that the spacing is due to recent duplication events rather than to a functional biological relationship between the primary and secondary motifs. Ideally, the regions around the primary and secondary motifs should have low information content and their logos in the alignment should closely match their motifs.

This table shows the details of the significant spacings between the primary motif and the secondary motif currently selected in the "Secondaries" section, below. Click on a row in this table to select a particular spacing for detailed analysis.

Gap
is the space between the primary and secondary motifs, where a value of zero means there is no space between them. Note that if a motif has had low information content areas trimmed off, this is the gap to the first untrimmed position.
Orientation
is the combination of quadrants used. Possible values are: individual quadrants (up+, up-, dn+, dn-) which are important when neither motif is palindromic; the diagonally combined quadrants (up+/dn-, up-/dn+) which are important when only the primary motif is palindromic; the vertically combined quadrants (up+/up-, dn+/dn-) which are important when only the secondary motif is palindromic; and all quadrants combined together (all) which is important when both motifs are palindromes.
P-value
is the probability of the observed number (or more) of sequences having the observed spacing between the primary and secondary motif, adjusted for multiple tests. The number of multiple tests is the number of spacing bins (the number of bars in one quadrant of the histogram) times the number of combinations of quadrants (nine) tested for significance.
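
The orientation and test-count descriptions above can be summarized in a small Python sketch. The names used here (`informative_orientations`, `number_of_tests`) are hypothetical, and the bin count of 150 in the usage lines is an assumed value for illustration, not one taken from the program.

```python
# Quadrant combinations named in the Orientation description above.
INDIVIDUAL = ["up+", "up-", "dn+", "dn-"]
DIAGONAL = ["up+/dn-", "up-/dn+"]
VERTICAL = ["up+/up-", "dn+/dn-"]
ALL = ["all"]

def informative_orientations(primary_palindromic, secondary_palindromic):
    """Return the quadrant combinations described as important for this case."""
    if primary_palindromic and secondary_palindromic:
        return ALL
    if primary_palindromic:
        return DIAGONAL
    if secondary_palindromic:
        return VERTICAL
    return INDIVIDUAL

def number_of_tests(spacing_bins):
    """Spacing bins times the nine combinations of quadrants."""
    combinations = len(INDIVIDUAL) + len(DIAGONAL) + len(VERTICAL) + len(ALL)
    return spacing_bins * combinations

print(informative_orientations(True, False))  # ['up+/dn-', 'up-/dn+']
print(number_of_tests(150))                   # 1350, for an assumed 150 bins
```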

The histogram below shows the frequency of spacings from the primary motif to the secondary motif.

The two quadrants on the left show spacings where the secondary motif is upstream of the primary motif and the two quadrants on the right show spacings where the secondary motif is downstream of the primary motif.

The two quadrants on the top show spacings where the secondary motif is on the same strand as the primary motif and the two quadrants on the bottom show spacings where the secondary motif is on the opposite strand to the primary motif.

Histogram bars highlighted pink are part of one of the listed significant spacings. This feature can be disabled by unchecking the "highlight all" option under the spacings.

Histogram bars highlighted red are part of the currently selected significant spacing. This feature can be disabled by unchecking the "highlight selected" option under the spacings.

The selected orientation graph shows the combined quadrants from the selected spacing with a zoomed view that only shows the portion of the graph for which significance testing was performed.

Histogram bars highlighted pink are one of the listed significant spacings for this orientation. This feature can be disabled by unchecking the "highlight all" option under the spacings.

The histogram bar highlighted red is the currently selected significant spacing. This feature can be disabled by unchecking the "highlight selected" option under the spacings.

This causes a file named spamo_contr_seqs.txt or spamo_contr_seqs.bed to be downloaded. The file contains the contributing sequence IDs for each significant spacing.

Each group of sequence IDs begins with a comment line containing (1) the rank of the spacing, (2) the name of the file that would contain the sequence IDs if you had used the "Contributing Sequence IDs Download" function for a single spacing, and (3) the p-value of the spacing. (Note: See the help bubble for "Contributing Sequence IDs", below, for the format and meaning of the file names.)

The sequence identifiers will be as they appear in the input sequence file (Plain) or in UCSC Genome Browser format (BED), depending on which file you choose to download.

This lists the sequence identifiers of the subset of sequences that contain the significant motif spacing. You can choose either the original sequence ID format (Plain) or UCSC Genome Browser format (BED) using the menu below.

These identifiers can be cut-and-pasted into other programs for further analysis (e.g., Gene Ontology analysis or location analysis in the case of ChIP-seq peak regions).

You can also download the identifiers using the "Download" link below. They will be placed in a file with name:

seqs_<prim>_with_<scnd>_g<gap>_o<orient>

and the extension .txt if you choose "Plain Format" or the extension .bed if you choose BED format. The fields in angle brackets in the file name have the following meanings:

name meaning
<prim> the ID of the primary motif
<scnd> the ID of the secondary motif
<gap> the width of the spacing
<orient> an integer code denoting the orientation of the spacing

The orientation codes are:

orientation code enriched quadrant(s)
0 up+
1 dn+
2 up-
3 dn-
4 up+/up-
5 up+/dn-
6 up-/dn+
7 dn+/dn-
8 all
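
The file-naming scheme and the orientation codes above can be combined in a short Python sketch. The function name `contributing_seqs_filename` is hypothetical, and the sketch assumes that motif IDs are inserted into the name unchanged, which the text above does not state.

```python
# Orientation codes from the table above.
ORIENT_CODES = {
    "up+": 0, "dn+": 1, "up-": 2, "dn-": 3,
    "up+/up-": 4, "up+/dn-": 5, "up-/dn+": 6, "dn+/dn-": 7,
    "all": 8,
}

def contributing_seqs_filename(prim, scnd, gap, orient, bed=False):
    """Build seqs_<prim>_with_<scnd>_g<gap>_o<orient> plus the .txt or .bed extension."""
    extension = ".bed" if bed else ".txt"
    return f"seqs_{prim}_with_{scnd}_g{gap}_o{ORIENT_CODES[orient]}{extension}"

# Best spacing reported on this page: gap 10, orientation up+.
print(contributing_seqs_filename("SYTYWAATCCCAGCA", "6-CTGCTCTTCCT", 10, "up+"))
# seqs_SYTYWAATCCCAGCA_with_6-CTGCTCTTCCT_g10_o0.txt
```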

Click on a row in this table to select one of the significant secondary motifs for detailed analysis. The details of the significant spacings between the primary motif and the secondary motif you select here will be displayed in the table and plots above.

Specify which secondary motifs to display in the Secondaries table by checking one or more of the tick boxes below and then entering filter criteria. Then click "Update" to refresh the view of the Secondaries table.

Specify the order in which secondary motifs are displayed in the Secondaries table by selecting a sorting criterion in the menu below. Then click "Update" to refresh the view of the Secondaries table.

For further information on how to interpret these results please access https://meme-suite.org/meme/doc/spamo-output-format.html.
To get a copy of the MEME software please access https://meme-suite.org.

If you use SpaMo in your research, please cite the following paper:
Tom Whitington, Martin C. Frith, James Johnson and Timothy L. Bailey, "Inferring transcription factor complexes from ChIP-seq data", Nucleic Acids Res. 39(15):e98, 2011.


Primary Motifs

Database: meme.xml
Name: SYTYWAATCCCAGCA (MEME-2)
Significant Secondaries: 9

Alphabet

    Background source: the file 'GSM4160243-ETO_WO_meme-chip/background'

Name      Bg.             Bg.    Name
Adenine   0.251   A ~ T   0.251  Thymine
Cytosine  0.249   C ~ G   0.249  Guanine

Sequence Database

Name: GSM4160243-ETO_WO.mm10plusrDNA.summits_200
Last Modified: Fri May 26 19:15:36 2023
Contained: 2000
Too Short: 0
Too Masked: 0
No Primary: 1936
Too Similar: 4
Used: 60
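
The counts above appear to be related by simple subtraction: the sequences used are those contained in the database minus those excluded for each of the four reasons. The Python check below illustrates this for the values on this page; it is a sanity check under that assumption, not part of SpaMo.

```python
# Counts reported for the sequence database on this page.
counts = {
    "Contained": 2000,
    "Too Short": 0,
    "Too Masked": 0,
    "No Primary": 1936,
    "Too Similar": 4,
    "Used": 60,
}

# Assumption: each excluded sequence is counted under exactly one reason.
excluded = sum(counts[k] for k in ("Too Short", "Too Masked", "No Primary", "Too Similar"))
remaining = counts["Contained"] - excluded

print(remaining, remaining == counts["Used"])  # 60 True
```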

Secondary Motif Databases

Name        Last Modified             Number of Motifs  Motifs Significant  Motifs Redundant
meme.xml    Fri May 26 19:16:33 2023  3                 3                   0
streme.xml  Fri May 26 19:17:00 2023  6                 5                   1

Settings

Match Score Threshold 7 (bits)
Margin size 150
Width of histogram bins 1
Significance computed up to this distance 150
Secondary match handling Count only the best secondary match above the score threshold Count all secondary matches above the match score threshold
Maximum allowed sequence identity 0.5
Odds ratio for redundancy heuristic 20
Bin p-value cutoff 0.05
Secondary motif E-value cutoff 10
Overlapping bases for redundancy check 2
Fraction of sites for redundancy check 0.5
Pseudocount added to motifs 0.1
Bit threshold for trimming motif edges 0.25
Primary and secondary motif alphabets Converting secondary alphabet to primary alphabet Primary and secondary alphabets must match
Random number seed 1

Spacing Analysis for SYTYWAATCCCAGCA (MEME-2)

Secondary Motif: 6-CTGCTCTTCCT (STREME-6)
Cluster: 6-CTGCTCTTCCT (STREME-6)
E-value: 2.44e-13
Best Gap: 10
Best Orientation: upstream / same strand

[Figures: Primary Motif Logo; Secondary Motif Logo; Inferred Secondary Motif Logo]

Spacings

Gap  Orientation  p-value
10   up+          2.71e-14
10   up+/up-      3.37e-12
10   up+/dn-      3.37e-12
10   all          4.07e-10

[Figures: Overview Graph; Selected Orientation Graph]

Contributing Sequence IDs (7)



Secondaries

ID                 Name      Cluster            E-value   Best Gap  Best Orientation
6-CTGCTCTTCCT      STREME-6  6-CTGCTCTTCCT      2.44e-13  10        upstream / same strand
4-TGAGTTCAAAKCCM   STREME-4  4-TGAGTTCAAAKCCM   2.89e-10  27        downstream / same strand
CSWYCCTCCGKYYGY    MEME-1    CSWYCCTCCGKYYGY    2.55e-9   9         upstream / opposite strand
1-RRMCGGAGGDWBGNB  STREME-1  CSWYCCTCCGKYYGY    5.34e-9   9         upstream / same strand
AAAAAAVAAAVAAAA    MEME-3    AAAAAAVAAAVAAAA    4.80e-6   105       downstream / same strand
2-AWAGCAAAA        STREME-2  2-AWAGCAAAA        2.37e+0   133       upstream / same strand
SYTYWAATCCCAGCA    MEME-2    SYTYWAATCCCAGCA    2.61e+0   24        upstream / primary palindromic
3-GTGCTGGGA        STREME-3  3-GTGCTGGGA        9.00e+0   17        downstream / opposite strand
5-CACCATGTGGTTGCT  STREME-5  5-CACCATGTGGTTGCT  9.00e+0   9         downstream / same strand
SpaMo version
5.5.2 (Release date: Sun Jan 29 10:33:12 2023 -0800)
Command line


Result calculation took 0 seconds