TB-AMR-CNN
The neural network was trained on data from MTB-CNN and uses a similar architecture, with the distinction that no multiple sequence alignment is needed (for more details see here). Instead, it requires only the one-hot-encoded sequences of the relevant loci from a particular sample.
The start and end coordinates of the loci can be found in data_files/target_loci.csv or queried from the container by passing the --get-target-loci argument (see below). If you have a SAM/BAM/CRAM file with reads aligned against H37Rv, you can use this Docker container to extract the one-hot-encoded sequences.
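For readers unfamiliar with the term, the following is a minimal sketch of what one-hot encoding a nucleotide sequence means. It is an illustration only, not the code used by the container: the function name `one_hot_encode`, the channel order A, C, G, T, and the choice to encode unknown symbols (such as N or gaps) as all-zero rows are assumptions made for this example. The layout of the CSV file expected by the container is not described here and may differ.

```python
# Illustrative sketch only; channel order and handling of unknown symbols
# are assumptions, not the container's documented convention.
BASES = "ACGT"


def one_hot_encode(seq):
    """Return one row per position, with a 1 in the column of the observed base.

    Symbols outside A/C/G/T (for example N or '-') yield an all-zero row.
    """
    index = {base: i for i, base in enumerate(BASES)}
    rows = []
    for base in seq.upper():
        row = [0] * len(BASES)
        col = index.get(base)
        if col is not None:
            row[col] = 1
        rows.append(row)
    return rows


if __name__ == "__main__":
    # Print the encoding of a short hypothetical sequence, one position per line.
    for base, row in zip("ACGTN", one_hot_encode("ACGTN")):
        print(base, row)
```

Each locus of length n becomes an n x 4 matrix in this scheme, which is why the network needs only the sequences of the target loci rather than an alignment across samples.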
Example usage
Get coordinates of target loci
docker run -v $PWD:/data \
julibeg/tb-ml-neural-net-from-one-hot-encoded-seqs-13-drugs:v0.7.0 \
--get-target-loci \
-o nn_target_loci.csv
Predict resistance against 13 drugs from one-hot-encoded sequences (passed in a CSV file)
docker run -v $PWD:/data \
julibeg/tb-ml-neural-net-from-one-hot-encoded-seqs-13-drugs:v0.7.0 \
input_seqs.csv