Make Festival (the Linux text-to-speech system) read PDF and DOC files aloud (PDF and DOC speech scripts for Festival)

Sunday, September 20th, 2009

Today I wondered whether festival can read PDF files on Linux. After some quick research, the answer turned out to be no.

Although I couldn’t find an official program to speak PDFs for me, that’s not a big deal, since it’s easy to convert PDF files into plain text on Linux with the pdftotext command.

pdftotext is part of poppler-utils, a nice package which also contains:
pdfimages – extracts images from PDFs,
pdftohtml – a PDF to HTML converter, and
pdffonts – a PDF font analyzer.

The usual way to read a PDF file with festival is to first use pdftotext to convert the PDF to a text file:

$ pdftotext filename.pdf outputfile.txt

and then have the computer speak it through festival’s default synthesizer:

$ cat outputfile.txt | festival --tts

For convenience I’ve created a small shell script, which I called festival-read-pdf.sh, that does this in one step.
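The script itself is not reproduced in the post, so here is a minimal sketch of what such a wrapper might look like. The function name read_pdf_aloud is hypothetical and the body is an assumption, not necessarily what festival-read-pdf.sh contains. It skips the intermediate text file by asking pdftotext to write to standard output.

```shell
#!/bin/sh
# Hypothetical sketch of a PDF reader wrapper; not the author's original script.

# read_pdf_aloud FILE.pdf
# Converts the PDF to plain text on stdout and pipes it to festival.
read_pdf_aloud() {
    if [ "$#" -ne 1 ]; then
        echo "Usage: read_pdf_aloud file.pdf" >&2
        return 2
    fi
    if [ ! -r "$1" ]; then
        echo "Cannot read file: $1" >&2
        return 1
    fi
    # "-" as the output name tells pdftotext to write to standard output.
    pdftotext "$1" - | festival --tts
}
```

After sourcing the file, it could be called as read_pdf_aloud filename.pdf.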

Please download the festival-read-pdf.sh shell script here.

Furthermore, I wondered how to get Microsoft Office .doc files played through festival. For that, something was again needed to convert the .doc file to plain text. I came across antiword, which I blogged about in my previous post. To pipe its output into festival you need to run:

$ antiword filename.doc | festival --tts

I’ve quickly scripted it for convenience. Download the festival-doc-read.sh script here.

I’ve also created a third bash script, which lets you choose whether to play a DOC or a PDF file in Festival.
Here is a link to the festival-read-doc-en-pdf.sh PDF and DOC speaker script.
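For illustration, a combined reader could be sketched roughly as follows. This is an assumption about how such a script might work, not the contents of festival-read-doc-en-pdf.sh: the function names to_text and speak_file are hypothetical, and here the choice between DOC and PDF is made from the file extension instead of asking the user.

```shell
#!/bin/sh
# Hypothetical sketch of a combined PDF/DOC reader; not the author's original script.

# to_text FILE
# Prints the plain-text content of a .pdf or .doc file on stdout.
to_text() {
    case "$1" in
        *.pdf|*.PDF) pdftotext "$1" - ;;   # "-" sends the text to stdout
        *.doc|*.DOC) antiword "$1" ;;      # antiword prints to stdout
        *)
            echo "Unsupported file type: $1" >&2
            return 1
            ;;
    esac
}

# speak_file FILE
# Converts the file to text and reads it aloud with festival.
speak_file() {
    # Convert first so that a conversion failure is not hidden by the pipe.
    text=$(to_text "$1") || return 1
    printf '%s\n' "$text" | festival --tts
}
```

After sourcing the file, speak_file filename.pdf or speak_file filename.doc would pick the matching converter. Capturing the text before piping it means an unsupported or unreadable file stops the function before festival is started.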
Speaking of festival, it is worth mentioning fala – a simple text reader. If you’re a Debian user, you’ll be glad to know there is already a package containing fala.

I hope you’ll find the PDF and DOC festival speech scripts useful. Enjoy!
