Cleaning for Poems
From Algolit
Revision as of 20:30, 4 March 2019
by Algolit
For this exhibition we're working with 3% of the Mundaneum's archive. These documents were first scanned or photographed. To make them searchable, they are transformed into text using Optical Character Recognition (OCR) software. OCR systems are algorithmic models trained on other texts. They have learned to identify characters, words, sentences and paragraphs.

The software often makes 'mistakes'. It might recognize a wrong character, or it might get confused by a stain, an unusual font or the other side of the page shining through.

While these mistakes are often considered noise that confuses the training, they can also be seen as poetic interpretations by the algorithm. They show us the limits of the machine. The mistakes show us how the algorithm might work, what material it has seen in training and what is new to it. They reveal the standards of its makers. In this installation you can choose how you treat the algorithm's misreadings, pick your degree of poetic cleanness, print your poem and take it home.
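
To give an idea of what a 'degree of poetic cleanness' could mean in code, here is a minimal sketch in Python. It is not the code of the installation: the function `clean_for_poem`, the three degrees and the tiny word list `KNOWN_WORDS` are assumptions made up for this illustration.

```python
import re

# Hypothetical vocabulary, for illustration only.
# A real word list would be far larger.
KNOWN_WORDS = {"the", "world", "is", "a", "book"}


def clean_for_poem(ocr_text, degree, vocabulary=KNOWN_WORDS):
    """Clean OCR output to an assumed degree: 0 (raw), 1 (light), 2 (strict)."""
    if degree <= 0:
        # Degree 0: keep every misreading as the machine produced it.
        return ocr_text

    tokens = ocr_text.split()

    if degree == 1:
        # Degree 1: drop tokens without any letter (stains, stray marks),
        # but keep misread words such as 'w0rld'.
        kept = [t for t in tokens if re.search(r"[^\W\d_]", t)]
    else:
        # Degree 2: keep only tokens whose letters form a known word.
        kept = [
            t for t in tokens
            if re.sub(r"[\W\d_]", "", t).lower() in vocabulary
        ]

    return " ".join(kept)


sample = "The w0rld ~~ is a b00k ."
for degree in (0, 1, 2):
    print(degree, clean_for_poem(sample, degree))
```

The lower the degree, the more of the algorithm's misreadings stay in the poem. The higher the degree, the more the text is pushed back towards a fixed vocabulary, and with it towards the standards of whoever compiled that vocabulary.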
Concept, code, interface: Gijs de Heij