The data (e)speaks
From Algolit
Revision as of 17:17, 25 October 2017
Type: | Algoliterary exploration |
Datasets: | |
Technique: | espeak |
Developed by: | & Algolit |
In the process of making the Algolit datasets, careful consideration was given to the selection of the source texts. We aimed for a variety of tones of voice, so that the combined texts highlight their heterogeneity.
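The texts were gathered from aaaaarg.fail, gen.lib.rus.ec, archive.org and gutenberg.org, converted to .txt files with terminal commands such as pdftotext (https://en.wikipedia.org/wiki/Pdftotext), and stripped of punctuation marks with the help of a Python code snippet (https://gitlab.constantvzw.org/algolit/algolit/blob/master/algoliterary_encounter/algoliterary-toolkit/text-punctuation-clean-up.py).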
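The two preparation steps described here (PDF to plain text, then punctuation removal) could look roughly like the sketch below. It is not the Algolit script itself: the function names `pdf_to_text` and `strip_punctuation` are hypothetical, the file paths are placeholders, and the choice to delete only ASCII punctuation is an assumption made for illustration.

```python
import string
import subprocess

# Translation table that deletes every ASCII punctuation character.
# Assumption: non-ASCII marks such as curly quotes are left untouched.
_PUNCTUATION_TABLE = str.maketrans("", "", string.punctuation)


def pdf_to_text(pdf_path, txt_path):
    """Convert a PDF to a .txt file by calling the pdftotext command.

    Requires pdftotext to be installed and available on the PATH.
    """
    subprocess.run(["pdftotext", pdf_path, txt_path], check=True)


def strip_punctuation(text):
    """Return the text with ASCII punctuation removed; whitespace is kept."""
    return text.translate(_PUNCTUATION_TABLE)


if __name__ == "__main__":
    print(strip_punctuation("The data (e)speaks."))
```

To process a whole file, one would call `pdf_to_text("source.pdf", "source.txt")` (placeholder names), read the resulting .txt file, and pass its contents through `strip_punctuation` before writing the result back out.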
The ensuing datasets were: