Co-linearity analyses module

This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

You should have received a copy of the GNU General Public License along with this program. If not, see <http://www.gnu.org/licenses/>.

Contact: arzwa@psb.vib-ugent.be

The co-linearity analysis functions currently only support intragenomic analyses. They can be a bit finicky, but that’s mainly due to I-ADHoRe and people messing up the GFF format, not me! (definitely also me)

wgd.colinearity._write_gene_lists(genome, output_dir='gene_lists')

Write out the gene lists

Parameters:

genome – Genome object
output_dir – output directory
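As a rough illustration of what writing gene lists involves, here is a minimal sketch. It assumes the I-ADHoRe gene list convention of one gene per line with a trailing `+` or `-` for orientation, and it uses a plain dictionary as a hypothetical stand-in for the Genome object. The name `write_gene_lists_sketch` and the `.lst` extension are illustrative only, not the module's actual implementation.

```python
import os
import tempfile


def write_gene_lists_sketch(gene_lists, output_dir="gene_lists"):
    """Write one gene list file per chromosome and return the file paths.

    `gene_lists` is a hypothetical stand-in for the Genome object:
    {chromosome: [(gene_id, strand), ...]} with genes in chromosomal order.
    """
    os.makedirs(output_dir, exist_ok=True)
    paths = {}
    for chromosome, genes in gene_lists.items():
        path = os.path.join(output_dir, chromosome + ".lst")
        with open(path, "w") as handle:
            for gene_id, strand in genes:
                # Assumed I-ADHoRe convention: gene ID followed by orientation
                handle.write(gene_id + strand + "\n")
        paths[chromosome] = path
    return paths


# Toy input with made-up gene IDs
example = {"chr1": [("geneA", "+"), ("geneB", "-")], "chr2": [("geneC", "+")]}
out_dir = os.path.join(tempfile.mkdtemp(), "gene_lists")
written = write_gene_lists_sketch(example, out_dir)
```

Returning the written paths keeps the sketch easy to chain with a later step that lists the files in a configuration.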

wgd.colinearity.get_anchor_pairs(anchors, ks_distribution=None, out_file='anchors_ks.csv')

Get anchor pairs and their corresponding Ks values (if provided)

Parameters:

anchors – anchorpoints.txt output from I-ADHoRe 3.0
ks_distribution – Ks distribution data frame
out_file – output file name

Returns:

pandas data frame(s): the anchors and, if a Ks distribution was provided, the corresponding Ks data frame
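The idea of matching anchor pairs to Ks values can be sketched with pandas as follows. This is a minimal sketch under stated assumptions, not the module's code: the column names of `anchorpoints.txt` (`gene_x`, `gene_y`, and so on) and the pair identifier convention (sorted gene IDs joined by `__`, used as the index of the Ks data frame) are assumptions, and the gene IDs and Ks values are toy data.

```python
import io

import pandas as pd

# Toy stand-in for anchorpoints.txt; the column names are assumed
ANCHORPOINTS = (
    "id\tmultiplicon\tbasecluster\tgene_x\tgene_y\t"
    "coord_x\tcoord_y\tis_real_anchorpoint\n"
    "1\t1\t1\tgeneB\tgeneA\t10\t12\t-1\n"
    "2\t1\t1\tgeneC\tgeneD\t11\t13\t-1\n"
)


def anchor_pairs_sketch(anchors_handle, ks_distribution=None):
    """Return the anchor pairs and, if given, the Ks rows for those pairs."""
    anchors = pd.read_csv(anchors_handle, sep="\t", index_col=0)
    anchors = anchors[["gene_x", "gene_y"]].copy()
    # Order-independent pair ID (assumed convention: sorted IDs joined by "__")
    anchors["pair_id"] = [
        "__".join(sorted(pair))
        for pair in zip(anchors["gene_x"], anchors["gene_y"])
    ]
    if ks_distribution is None:
        return anchors, None
    # Keep only the Ks rows whose index matches an anchor pair
    is_anchor = ks_distribution.index.isin(anchors["pair_id"])
    return anchors, ks_distribution[is_anchor]


# Toy Ks distribution indexed by pair ID
ks = pd.DataFrame(
    {"Ks": [0.8, 1.5, 2.1]},
    index=["geneA__geneB", "geneC__geneD", "geneE__geneF"],
)
anchors, ks_anchors = anchor_pairs_sketch(io.StringIO(ANCHORPOINTS), ks)
```

Sorting the two gene IDs before joining them makes the lookup independent of which gene I-ADHoRe reports first.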

wgd.colinearity.run_adhore(config_file)

Run I-ADHoRe for a given config file

Parameters:	config_file – path to I-ADHoRe configuration file
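Running an external tool like this usually comes down to a subprocess call. The sketch below assumes the I-ADHoRe executable is named `i-adhore`, is available on the `PATH`, and takes the configuration file as its only argument. The function name `run_adhore_sketch` and the error handling are illustrative choices, not the module's implementation.

```python
import shutil
import subprocess


def run_adhore_sketch(config_file, executable="i-adhore"):
    """Run I-ADHoRe on a config file and return its standard output.

    The executable name and the single-argument call are assumptions.
    """
    # Fail early with a clear message if the binary is not installed
    if shutil.which(executable) is None:
        raise FileNotFoundError(executable + " not found on PATH")
    completed = subprocess.run(
        [executable, config_file], capture_output=True, text=True
    )
    if completed.returncode != 0:
        raise RuntimeError(
            "I-ADHoRe exited with code {}: {}".format(
                completed.returncode, completed.stderr.strip()
            )
        )
    return completed.stdout
```

Checking for the executable first gives a more readable error than the one raised by the subprocess call itself.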

wgd.colinearity.segments_to_chords_table(segments_file, genome, output_file='chords.tsv')

Create chords table for visualization in a chord diagram. Uses the segments.txt output of I-ADHoRe. Each chord is defined by a source chromosome and a target chromosome, with begin and end coordinates on each.

TODO: the length of each syntenic block should be included in the table as well with length defined as number of genes, not physical length.

Parameters:

segments_file – path to the I-ADHoRe segments.txt output file
genome – a `gff_parser.Genome` object
output_file – output file name
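To make the chord definition above concrete, here is a minimal sketch of turning segments into chords. Several things are assumptions: the `segments.txt` column names (`multiplicon`, `list`, `first`, `last`), the rule that consecutive segments within a multiplicon are paired, the output column names, and the `GENE_COORDS` dictionary, which is a hypothetical stand-in for the coordinates a Genome object would provide.

```python
import io

import pandas as pd

# Toy stand-in for segments.txt; the column names are assumed
SEGMENTS = (
    "id\tmultiplicon\tgenome\tlist\tfirst\tlast\torder\n"
    "1\t1\tath\tchr1\tgeneA\tgeneC\t0\n"
    "2\t1\tath\tchr2\tgeneX\tgeneZ\t1\n"
)

# Hypothetical gene coordinates: {gene_id: (start, end)}
GENE_COORDS = {
    "geneA": (100, 200),
    "geneC": (900, 1000),
    "geneX": (50, 150),
    "geneZ": (700, 800),
}


def chords_sketch(segments_handle, gene_coords):
    """Build a chords table by pairing consecutive segments per multiplicon."""
    segments = pd.read_csv(segments_handle, sep="\t", index_col=0)
    rows = []
    for _, group in segments.groupby("multiplicon"):
        records = group.to_dict("records")
        for source, target in zip(records, records[1:]):
            rows.append({
                "source_id": source["list"],
                # Start of the first gene and end of the last gene
                "source_1": gene_coords[source["first"]][0],
                "source_2": gene_coords[source["last"]][1],
                "target_id": target["list"],
                "target_1": gene_coords[target["first"]][0],
                "target_2": gene_coords[target["last"]][1],
            })
    return pd.DataFrame(rows)


chords = chords_sketch(io.StringIO(SEGMENTS), GENE_COORDS)
```

The resulting table could be written with `chords.to_csv('chords.tsv', sep='\t')` to mirror the default output file name.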

wgd.colinearity.write_config_adhore(gene_lists, families, config_file_name='i-adhore.conf', genome='genome', output_path='i-adhore_out', gap_size=30, cluster_gap=35, q_value=0.75, prob_cutoff=0.01, anchor_points=3, alignment_method='gg2', level_2_only='false', table_type='family', multiple_hypothesis_correction='FDR', visualize_ghm='false', visualize_alignment='true')

Write out the config file for I-ADHoRe. See I-ADHoRe manual for information on parameter settings.

Parameters:

gene_lists – directory with gene lists per chromosome
families – file with gene to family mapping
config_file_name – name for the config file
genome – genome name
output_path – output path name
gap_size – see I-ADHoRe 3.0 documentation
cluster_gap – see I-ADHoRe 3.0 documentation
q_value – see I-ADHoRe 3.0 documentation
prob_cutoff – see I-ADHoRe 3.0 documentation
anchor_points – see I-ADHoRe 3.0 documentation
alignment_method – see I-ADHoRe 3.0 documentation
level_2_only – see I-ADHoRe 3.0 documentation
table_type – see I-ADHoRe 3.0 documentation
multiple_hypothesis_correction – see I-ADHoRe 3.0 documentation
visualize_ghm – see I-ADHoRe 3.0 documentation
visualize_alignment – see I-ADHoRe 3.0 documentation

Returns:

configuration file (see I-ADHoRe 3.0 documentation)
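For orientation, the sketch below builds the text of such a configuration file from the defaults in the signature above. The key names (`blast_table`, `visualizeGHM`, `visualizeAlignment`, and so on) and the `key= value` layout are assumptions based on the I-ADHoRe manual, and `config_text_sketch` is a hypothetical helper, not the module's function.

```python
import os
import tempfile


def config_text_sketch(gene_lists, families, genome="genome",
                       output_path="i-adhore_out", **settings):
    """Return I-ADHoRe config text; key names are assumed from the manual."""
    lines = ["genome= " + genome]
    # One line per gene list file: chromosome name and path to the list
    for file_name in sorted(os.listdir(gene_lists)):
        chromosome = os.path.splitext(file_name)[0]
        lines.append(chromosome + " " + os.path.join(gene_lists, file_name))
    lines.append("")
    lines.append("blast_table= " + families)
    lines.append("output_path= " + output_path)
    # Defaults taken from the function signature documented above
    options = {
        "gap_size": 30,
        "cluster_gap": 35,
        "q_value": 0.75,
        "prob_cutoff": 0.01,
        "anchor_points": 3,
        "alignment_method": "gg2",
        "level_2_only": "false",
        "table_type": "family",
        "multiple_hypothesis_correction": "FDR",
        "visualizeGHM": "false",
        "visualizeAlignment": "true",
    }
    options.update(settings)
    for key, value in options.items():
        lines.append("{}= {}".format(key, value))
    return "\n".join(lines) + "\n"


# Toy gene list directory with a single, empty gene list file
lists_dir = tempfile.mkdtemp()
open(os.path.join(lists_dir, "chr1.lst"), "w").close()
config_text = config_text_sketch(lists_dir, "families.tsv", genome="ath",
                                 gap_size=25)
```

Building the text as a string first makes it easy to inspect the settings before writing them to `i-adhore.conf`.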

wgd.colinearity.write_families_file(families, all_genes, output_file='families.tsv')

Write out families file

Parameters:

families – gene families
all_genes – set object with all genes for the species of interest
output_file – output file name

Returns:

nothing
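A families file for I-ADHoRe with `table_type='family'` maps each gene to a family. The sketch below assumes a tab-separated `gene<TAB>family` layout and takes the families as a list of gene ID lists, which is a hypothetical input shape. The family labels and the function name `write_families_sketch` are illustrative only.

```python
import os
import tempfile


def write_families_sketch(families, all_genes, output_file="families.tsv"):
    """Write gene-to-family lines for genes of the species of interest.

    `families` is a hypothetical list of gene ID lists, one per family.
    """
    with open(output_file, "w") as handle:
        for number, family in enumerate(families, start=1):
            for gene in family:
                # Skip genes that do not belong to the species of interest
                if gene in all_genes:
                    handle.write("{}\tfamily_{}\n".format(gene, number))


# Toy input: geneQ is from another species and should be skipped
toy_families = [["geneA", "geneB", "geneQ"], ["geneC"]]
toy_genes = {"geneA", "geneB", "geneC"}
families_path = os.path.join(tempfile.mkdtemp(), "families.tsv")
write_families_sketch(toy_families, toy_genes, families_path)
```

Filtering against `all_genes` keeps the file restricted to one genome, in line with the intragenomic scope of the module.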
