TB-AMR-CNN

The neural network was trained on the data from MTB-CNN and uses a similar architecture, except that no multiple sequence alignment is needed (for more details see here). Instead, it requires only the one-hot-encoded sequences of the relevant loci from a given sample.
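As an illustration of what "one-hot-encoded" means here, the following minimal sketch turns a nucleotide string into one row of four indicator values per position. The function name `one_hot_encode`, the channel order `ACGT`, and the all-zero treatment of other characters are assumptions made for this example. The encoding and CSV layout that the container actually expects are not described in this section and may differ.

```python
# Assumed channel order for illustration; the model's real order may differ.
NUCLEOTIDES = "ACGT"


def one_hot_encode(seq):
    """Return one row of four 0/1 values per position in `seq`."""
    index = {nuc: i for i, nuc in enumerate(NUCLEOTIDES)}
    rows = []
    for nuc in seq.upper():
        row = [0] * len(NUCLEOTIDES)
        col = index.get(nuc)
        if col is not None:
            # Known nucleotide: set its channel to 1.
            row[col] = 1
        # Anything else (e.g. N or a gap) stays all-zero (an assumption).
        rows.append(row)
    return rows


if __name__ == "__main__":
    for base, row in zip("ACGTN", one_hot_encode("ACGTN")):
        print(base, row)
```

In practice the sequences would be extracted at the target loci of a sample rather than typed in by hand; the sketch only shows the shape of the encoding.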

The start and end coordinates of the loci are listed in data_files/target_loci.csv and can also be queried from the container by passing the --get-target-loci argument (see below). If you have a SAM/BAM/CRAM file with reads aligned to the H37Rv reference genome, you can use this Docker container to extract the one-hot-encoded sequences.

Example usage

Get coordinates of target loci

docker run -v "$PWD":/data \
    julibeg/tb-ml-neural-net-from-one-hot-encoded-seqs-13-drugs:v0.7.0 \
    --get-target-loci \
    -o nn_target_loci.csv

Predict resistance against 13 drugs from one-hot-encoded sequences (passed in a CSV file)

docker run -v "$PWD":/data \
    julibeg/tb-ml-neural-net-from-one-hot-encoded-seqs-13-drugs:v0.7.0 \
    input_seqs.csv