Documentation

FAQ

  • What are the rDiff options?
  • What does it mean if the test status is not "OK"?
  • How can I make rDiff.nonparametric faster?

    What are all the rDiff options good for?

    The different options for rDiff are explained in the table below. Most of them are not required for basic use but can be used to adapt rDiff to your experimental setting.

    Option Description
    -h Display the help
    -o Output directory where the results are written. rDiff also saves its other output files here.
    -d Directory where the BAM files are located. If they are in different directories, this can also be / and the path to each BAM file can be given as part of the BAM file name.
    -a Comma-separated list of BAM files that make up sample 1. There must be no spaces between the files. The input should be of the form: File1.bam,File2.bam,...
    -b Comma-separated list of BAM files that make up sample 2. There must be no spaces between the files. The input should be of the form: File1.bam,File2.bam,...
    -g Path to the GFF3 file with the gene structure.
    -L Read length used by rDiff.parametric to compute the alternative regions. The default is 75 bp. If the reads are longer or shorter, rDiff will try to find the best match to an alternative region.
    -m Method that should be used for testing. The default is rDiff.parametric. The possible values are:
    • param for rDiff.parametric
    • nonparam for rDiff.nonparametric
    • poisson for rDiff.poisson
    • mmd for rDiff.mmd
    -M Minimal read length required. The default is 30 bp. Reads that are shorter are not used for the analysis.
    -e Skip the gene expression estimation. Enter 0 to skip this step. The default is 1.
    -E Only estimate the gene expression and the variance function, and do not perform testing. Enter 0 to exit after the variance function estimation. The default is 1.
    -A Path to the variance function for sample 1. This option can be used, for example, if a previously computed variance function should be reused.
    -B Path to the variance function for sample 2. This option can be used, for example, if a previously computed variance function should be reused.
    -S Filename under which the variance function for sample 1 will be saved.
    -T Filename under which the variance function for sample 2 will be saved.
    -P Specify a parametric variance function for sample 1 of the form f(x)=a+b*x+c*x^2. The argument for this option is a,b,c.
    -Q Specify a parametric variance function for sample 2 of the form f(x)=a+b*x+c*x^2. The argument for this option is a,b,c.
    -y Use only the gene start and stop for the rDiff.nonparametric variance function estimation. Enter 1 to enable this and 0 otherwise.
    -s Downsample the reads to a fixed number per gene. This increases the speed for highly covered genes. The argument is the number of reads per gene to downsample to. The default is 10000.
    -C Number of bases to clip from each end of each read. This reduces false mappings of spliced read ends. The default is 3 bp.
    -p Number of permutations performed for rDiff.nonparametric. The default is 1000.
    -x Merge sample 1 and sample 2 for the variance function estimation. Enter 1 to merge the samples. The default is 0.
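
As an illustration of how the options in the table fit together, the following Python sketch assembles an argument list and evaluates a parametric variance function of the form f(x)=a+b*x+c*x^2. It is only a sketch under stated assumptions: the launcher name "rdiff", the helper names build_rdiff_args and quadratic_variance, and all file names are hypothetical and do not come from this FAQ. Only the option letters and the argument formats are taken from the table.

```python
import shlex
from typing import List, Optional, Sequence, Tuple

METHODS = ("param", "nonparam", "poisson", "mmd")  # values listed for -m


def build_rdiff_args(
    out_dir: str,
    bam_dir: str,
    sample1_bams: Sequence[str],
    sample2_bams: Sequence[str],
    gff3_path: str,
    method: str = "param",
    read_length: int = 75,
    variance1: Optional[Tuple[float, float, float]] = None,
    executable: str = "rdiff",  # ASSUMPTION: launcher name is not given in this FAQ
) -> List[str]:
    """Assemble an argument list from the options described in the table."""
    if method not in METHODS:
        raise ValueError(f"unknown method: {method!r}")
    for name in list(sample1_bams) + list(sample2_bams):
        # Conservative check: a comma or any whitespace inside a name would
        # break the "File1.bam,File2.bam,..." form expected by -a and -b.
        if "," in name or any(ch.isspace() for ch in name):
            raise ValueError(f"unsuitable BAM file name: {name!r}")
    args = [
        executable,
        "-o", out_dir,
        "-d", bam_dir,
        "-a", ",".join(sample1_bams),  # comma-separated, no spaces
        "-b", ",".join(sample2_bams),
        "-g", gff3_path,
        "-m", method,
        "-L", str(read_length),
    ]
    if variance1 is not None:
        a, b, c = variance1
        args += ["-P", f"{a},{b},{c}"]  # argument form "a,b,c"
    return args


def quadratic_variance(a: float, b: float, c: float, x: float) -> float:
    """Evaluate f(x) = a + b*x + c*x^2 for the coefficients given to -P or -Q."""
    return a + b * x + c * x ** 2


if __name__ == "__main__":
    # Hypothetical file names, printed as a shell-quoted command line.
    demo = build_rdiff_args(
        "results", "/data/bams",
        ["s1_rep1.bam", "s1_rep2.bam"], ["s2_rep1.bam"],
        "genes.gff3", method="nonparam",
    )
    print(shlex.join(demo))
```

The list returned by build_rdiff_args can be passed to subprocess.run once the real launcher name and paths are known. Whether -P and -Q can be combined with every testing method is not stated in the table, so the sketch only adds -P when coefficients are given explicitly.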

    What does it mean if the test status is not "OK"?

    This means that there was a problem during testing. This can happen, for example, when there are not enough reads for testing.

    How can I make rDiff.nonparametric faster?

    You can either reduce the number of reads sampled per gene using the option -s or reduce the number of permutations using the option -p.

    Alternatively, you can parallelize rDiff.parametric by first estimating the variance functions. You can then split up the gene structure and test each part using the estimated gene expression and variance functions.
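
Splitting the gene structure could be sketched as follows in Python. This is not part of rDiff: the function name split_gff3_by_gene is hypothetical, and the sketch assumes that the sub-features of each gene (mRNA, exon, ...) directly follow its "gene" line in the GFF3 file, which is common but not guaranteed by the format. Each chunk could then be written to its own file and passed to a separate run with -g, reusing variance functions saved with -S and -T through -A and -B. How the per-chunk results are combined is not described in this FAQ.

```python
from typing import Iterable, List


def split_gff3_by_gene(lines: Iterable[str], n_chunks: int) -> List[List[str]]:
    """Distribute gene records round-robin over at most n_chunks line lists.

    ASSUMPTION: every feature line belongs to the closest preceding line
    whose type column (column 3) is "gene".
    """
    if n_chunks < 1:
        raise ValueError("n_chunks must be at least 1")

    header: List[str] = []         # leading directives such as "##gff-version 3"
    records: List[List[str]] = []  # one list of lines per gene

    for raw in lines:
        line = raw.rstrip("\n")
        if not line:
            continue
        if line.startswith("#"):
            # Keep only the directives before the first gene; later
            # comment lines are dropped in this sketch.
            if not records:
                header.append(line)
            continue
        cols = line.split("\t")
        if len(cols) >= 3 and cols[2] == "gene":
            records.append([line])
        elif records:
            records[-1].append(line)
        else:
            raise ValueError("feature line found before the first gene line")

    # Never create more chunks than there are genes.
    n = min(n_chunks, len(records))
    chunks: List[List[str]] = [list(header) for _ in range(n)]
    for i, record in enumerate(records):
        chunks[i % n].extend(record)
    return chunks
```

Round-robin assignment keeps the chunks similar in gene count. It does not balance read coverage, so the chunks may still differ in run time.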