Efficient Distributed Source Coding of Fragmented Genomic Sequencing Data
Yotam Gershon, Yuval Cassuto, Technion – Israel Institute of Technology, Israel
D7-S6-T2: Applications of Source Coding
Tuesday, 20 July, 23:40 - 00:00
Wednesday, 21 July, 00:00 - 00:20
In this paper we present a new compression scheme for genomic read data produced by modern sequencing technologies. In this setting, a reference genome similar to the one being sequenced is available only at the decoder, and the offset of each read within this reference is unknown. The proposed scheme significantly reduces the encoding complexity relative to known reference-based compression schemes. The results include a code construction based on generalized concatenation coset codes, an analysis of the decoding failure probability, and an optimization of the scheme parameters for minimal compression rate.