Quick Start
The demo dataset is provided with the repository under /path/to/m6anet/demo/eventalign.txt.
First, we preprocess the segmented raw signal file, in the form of a Nanopolish eventalign file, using m6anet-dataprep:
m6anet-dataprep --eventalign m6anet/demo/eventalign.txt \
--out_dir /path/to/output --n_processes 4
The output files are stored in /path/to/output:
data.index: Index of data.json that allows faster access to the file
data.json: JSON file containing the features to feed into the m6Anet model for prediction
data.log: Log file listing all the transcripts that have been successfully preprocessed
data.readcount: File containing the number of reads for each DRACH position in eventalign.txt
eventalign.index: Index file created during dataprep to allow faster access to the Nanopolish eventalign.txt during dataprep
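Before moving on to inference, it can help to confirm that dataprep wrote everything it should have. The following is a minimal sketch, not part of m6anet itself: the helper name missing_dataprep_outputs is hypothetical, and the file names are simply those listed above, so they may differ in other m6anet versions.

```python
from pathlib import Path

# File names taken from the list above; assumed, other versions may differ.
EXPECTED_FILES = [
    "data.index",
    "data.json",
    "data.log",
    "data.readcount",
    "eventalign.index",
]


def missing_dataprep_outputs(out_dir):
    """Return the expected file names that are absent from out_dir.

    Hypothetical helper for illustration; an empty list means every
    file named in EXPECTED_FILES exists in the directory.
    """
    out_path = Path(out_dir)
    return [name for name in EXPECTED_FILES if not (out_path / name).is_file()]
```

Calling missing_dataprep_outputs("/path/to/output") returns an empty list when all five files are present, and otherwise the names of the files that are missing.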
Now we can run m6anet over our data using m6anet-run_inference, where --input_dir is the directory containing the dataprep output (demo_data in this example):
m6anet-run_inference --input_dir demo_data --out_dir demo_data --n_processes 4
The output file demo_data/data.result.csv.gz contains the probability of modification at each individual position for each transcript. It has 4 columns:
transcript_id: The transcript ID of the predicted position
transcript_position: The transcript position of the predicted position
n_reads: The number of reads for that particular position
probability_modified: The probability that a given site is modified
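The result file can be read with the Python standard library alone. The sketch below assumes that data.result.csv.gz is a gzip-compressed, comma-separated file whose header row carries the four column names listed above. The function name load_modified_sites and the 0.9 cutoff are illustrative choices, not values recommended by m6anet.

```python
import csv
import gzip


def load_modified_sites(result_path, min_probability=0.9):
    """Return sites whose probability_modified is at least min_probability.

    Assumes a gzip-compressed CSV with a header row containing the columns
    transcript_id, transcript_position, n_reads and probability_modified.
    The default cutoff of 0.9 is an arbitrary example value.
    """
    sites = []
    with gzip.open(result_path, "rt", newline="") as handle:
        for row in csv.DictReader(handle):
            probability = float(row["probability_modified"])
            if probability >= min_probability:
                sites.append(
                    (
                        row["transcript_id"],
                        int(row["transcript_position"]),
                        int(row["n_reads"]),
                        probability,
                    )
                )
    return sites
```

For the demo run, load_modified_sites("demo_data/data.result.csv.gz") would return a list of (transcript_id, transcript_position, n_reads, probability_modified) tuples for the sites that pass the cutoff.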
The total run time should not exceed 10 minutes on a normal laptop.