Medical intelligence and language engineering lab
The Medical Intelligence and Language Engineering Laboratory, also known as the MILE lab, is a research laboratory in the Department of Electrical Engineering at the Indian Institute of Science, Bangalore. The lab is known for its work on image processing, online handwriting recognition, text-to-speech (TTS) and optical character recognition (OCR)[1] systems, all of which focus mainly on documents and speech in Indian languages.[2] The lab is headed by A. G. Ramakrishnan.[3]

Research focus
One of the commitments of the MILE lab is the development of technology that enables people with visual impairment to access knowledge from any available printed material in Indian languages.[4] Its work towards this goal has included: document mosaicing of coloured, camera-captured images; text extraction from complex colour images, including camera-captured images; document layout analysis; detection of broken and merged characters; OCR technology for Tamil and Kannada;[5] text-to-speech conversion in Tamil and Kannada;[6] pitch modification using the discrete cosine transform in the source domain;[7] automated part-of-speech tagging; and phrase prediction and prosody modeling.

Mozhi Vallan, the Tamil OCR[8] product developed by the MILE lab, is being used by Worth Trust and Karna Vidya Technology Centre, Chennai[9] to convert printed school and college books to Braille format. Sri Ramakrishna Math, Chennai[10] is using it to convert its printed philosophical books in Tamil to computer-readable text. Lipi Gnani, the Kannada OCR developed by the MILE lab, is being used by the Braille transcription centres of Mitrajyothi[11] and Canara Bank Relief & Welfare Society,[12] Bangalore for similar purposes. Thirukkural,[13] the Tamil TTS system[14] developed by the MILE lab, is being used by some school teachers in Singapore for assignments.
Madhura, the Kannada TTS[15] developed by the lab, has been integrated with a screen reader and is being used by two blind students to read aloud text recognized by Lipi Gnani from Kannada books.

The lab is currently researching machine listening.[16] It has proposed a novel temporal feature, the plosion index, which has been shown to be highly effective in detecting closure-burst transitions of stop consonants and affricates in continuous speech, even in noise.[17] Another proposed feature is DCTILPR,[18] a voice-source-based feature vector that improves the recognition performance of a speaker identification system.

In its early years, the lab carried out significant work in medical signal and image processing. An algorithm was proposed for ECG compression that treats each cardiac cycle as a vector and applies linear prediction to the discrete wavelet transform of this vector, after normalizing its period using multirate-processing-based interpolation.[19] Fetal lung maturity was predicted using image texture features extracted from the liver and lung regions of ultrasound images of pregnant women.[20] An effective technique was also proposed for lossless compression of 3D magnetic resonance images of the brain: each MRI slice was represented by a uniform or adaptive mesh, an affine transformation was applied between the corresponding mesh elements of adjacent slices, and context-based entropy coding was applied to the residues.[21]