The data (e)speaks: Difference between revisions
From Algolit
Revision as of 18:37, 31 October 2017
Type: Algoliterary exploration
Datasets:
Technique: espeak
Developed by: & Algolit
In making the Algolit datasets, we gave careful consideration to the selection of the source texts. We aimed for a variety of tones of voice, so that the combined texts highlight their heterogeneity.
The texts were gathered from aaaaarg.fail, gen.lib.rus.ec, archive.org and gutenberg.org, run through terminal commands such as pdftotext in order to generate .txt files and stripped of punctuation marks with the help of a Python code snippet.
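The conversion and clean-up steps described above could look roughly like the following. This is a minimal sketch, not the linked Algolit script: the function names `pdf_to_text` and `strip_punctuation` are hypothetical, it assumes the `pdftotext` command is installed and on the PATH, and it removes only ASCII punctuation, which may differ from what the original snippet does.

```python
import string
import subprocess
from pathlib import Path


def pdf_to_text(pdf_path: Path, txt_path: Path) -> None:
    """Convert a PDF to a .txt file by calling the pdftotext command-line tool.

    Assumes pdftotext is installed; raises CalledProcessError if it fails.
    """
    subprocess.run(["pdftotext", str(pdf_path), str(txt_path)], check=True)


# Translation table that deletes every ASCII punctuation character.
# Assumption: the original snippet may treat other characters differently.
_PUNCTUATION_TABLE = str.maketrans("", "", string.punctuation)


def strip_punctuation(text: str) -> str:
    """Remove ASCII punctuation marks; letters, digits and whitespace are kept."""
    return text.translate(_PUNCTUATION_TABLE)
```

For example, `strip_punctuation("Hello, world! It's alive.")` returns `"Hello world Its alive"`. Note that apostrophes are removed too, so contractions collapse into a single word.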
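The ensuing datasets are:
* Common Crawl
* WikiHarass
* Frankenstein
* Learning from Deep Learning
* nearbySaussure
* astroBlackness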
The data (e)speaks is an audio installation that gives a voice to the datasets by selecting specific sentences from the body text.
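As an illustration of how selected sentences might be handed to espeak, here is a hedged sketch. The page does not say how sentences are selected, so the keyword filter below is purely an assumption, and the names `select_sentences`, `espeak_command` and `speak` are hypothetical. The sketch assumes the text still contains full stops to split on and that the `espeak` command-line tool is installed; `-v` sets the voice and `-s` the speed in words per minute.

```python
import subprocess
from typing import List


def select_sentences(text: str, keyword: str) -> List[str]:
    """Return sentences containing `keyword` (case-insensitive).

    Assumption: sentences are naively split on full stops.
    """
    sentences = [s.strip() for s in text.split(".")]
    return [s for s in sentences if s and keyword.lower() in s.lower()]


def espeak_command(sentence: str, voice: str = "en", speed: int = 150) -> List[str]:
    """Build the argument list for an espeak call that reads one sentence aloud."""
    return ["espeak", "-v", voice, "-s", str(speed), sentence]


def speak(sentence: str) -> None:
    """Read a sentence aloud; requires espeak to be installed."""
    subprocess.run(espeak_command(sentence), check=True)
```

Keeping the command construction separate from the call makes it possible to inspect what would be spoken without producing any audio.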