Genome build liftover — liftover • MungeSumstats

Transfer genomic coordinates from one genome build to another.

liftover(
  sumstats_dt,
  convert_ref_genome,
  ref_genome,
  chain_source = "ensembl",
  imputation_ind = TRUE,
  chrom_col = "CHR",
  start_col = "BP",
  end_col = start_col,
  as_granges = FALSE,
  style = "NCBI",
  verbose = TRUE
)

Source

liftOver

UCSC chain files

Ensembl chain files

Arguments

sumstats_dt: data.table object containing the GWAS summary statistics.
convert_ref_genome: Name of the reference genome to convert to ("GRCh37" or "GRCh38"). Conversion only takes place if the current genome build differs from this value. If NULL, the genome build is not converted.
ref_genome: name of the reference genome used for the GWAS ("GRCh37" or "GRCh38"). Argument is case-insensitive. Default is NULL which infers the reference genome from the data.
chain_source: Source of the chain file used for the liftover ("ensembl" as default, or "ucsc").
imputation_ind: Logical. Should a column be added for each imputation step to show which SNPs have imputed values for differing fields? This includes a field denoting SNP allele flipping (flipped). The flipped value denotes whether the alleles were switched based on MungeSumstats' initial choice of A1 and A2 from the input column headers, and thus may not align with what the creator intended. Note that these columns will be included in the formatted summary statistics returned. Default is TRUE.
chrom_col: Name of the chromosome column in sumstats_dt (e.g. "CHR").
start_col: Name of the starting genomic position column in sumstats_dt (e.g. "POS","start").
end_col: Name of the ending genomic position column in sumstats_dt (e.g. "POS","end"). Can be the same as start_col when sumstats_dt only contains SNPs that span 1 base pair (bp) each.
as_granges: Return results as GRanges instead of a data.table (default: FALSE).
style: Style to return the GRanges object in (e.g. "NCBI" = "4"; "UCSC" = "chr4"). Default: "NCBI".
verbose: Print messages.

Value

Lifted summary statistics in data.table or GRanges format.

Examples

sumstats_dt <- MungeSumstats::formatted_example()
#> Standardising column headers.
#> First line of summary statistics file: 
#> MarkerName	CHR	POS	A1	A2	EAF	Beta	SE	Pval	
#> Sorting coordinates.

sumstats_dt_hg38 <- liftover(sumstats_dt = sumstats_dt,
                             ref_genome = "hg19",
                             convert_ref_genome = "hg38")
#> Performing data liftover from hg19 to hg38.
#> Converting summary statistics to Genomic Ranges.
#> Downloading chain file from Ensembl.
#> /tmp/RtmpmswcRa/GRCh37_to_GRCh38.chain.gz
#> Reordering so first three column headers are SNP, CHR and BP in this order.
#> Reordering so the fourth and fifth columns are A1 and A2.
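
The as_granges and style arguments described above can be combined to get the lifted coordinates back as a GRanges object with "chr"-prefixed chromosome names. The following is a minimal sketch rather than an official example: it assumes MungeSumstats and GenomeInfoDb are installed and that the chain file can be downloaded, and the object name gr_hg38 is only illustrative.

```r
library(MungeSumstats)

# Example summary statistics shipped with the package
sumstats_dt <- MungeSumstats::formatted_example()

# Lift from hg19 to hg38 and return a GRanges object instead of a data.table.
# style = "UCSC" requests "chr"-prefixed chromosome names (e.g. "chr4").
gr_hg38 <- MungeSumstats::liftover(
  sumstats_dt = sumstats_dt,
  ref_genome = "hg19",
  convert_ref_genome = "hg38",
  chain_source = "ensembl",
  as_granges = TRUE,
  style = "UCSC"
)

# Inspect the returned object's class and its chromosome names
class(gr_hg38)
GenomeInfoDb::seqlevels(gr_hg38)
```

Returning GRanges can be convenient when the lifted positions are passed straight to other Bioconductor tools; leave as_granges = FALSE to stay in data.table form.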