Algoliterary Encounters: Difference between revisions
From Algolit
(38 intermediate revisions by 4 users not shown) | |||
Line 1: | Line 1: | ||
__NOTOC__ | __NOTOC__ | ||
− | + | == About == | |
− | + | * [[An Algoliterary Journey]] | |
− | |||
− | == | ||
− | |||
− | * [[ | ||
* [[Program]] | * [[Program]] | ||
==Algoliterary works== | ==Algoliterary works== | ||
− | + | A selection of works by members of Algolit presented in other contexts before. | |
* [[i-could-have-written-that]] | * [[i-could-have-written-that]] | ||
− | * | + | * [[The Weekly Address, A model for a politician]] |
* [[In the company of CluebotNG]] | * [[In the company of CluebotNG]] | ||
+ | * [[Oulipo recipes]] | ||
==Algoliterary explorations== | ==Algoliterary explorations== | ||
+ | This chapter presents part of the research of Algolit over the past year. | ||
+ | |||
=== What the Machine Writes: a closer look at the output === | === What the Machine Writes: a closer look at the output === | ||
+ | Two neural networks are presented more closely, what content do they produce? | ||
* [[CHARNN text generator]] | * [[CHARNN text generator]] | ||
* [[You shall know a word by the company it keeps]] | * [[You shall know a word by the company it keeps]] | ||
=== How the Machine Reads: Dissecting Neural Networks === | === How the Machine Reads: Dissecting Neural Networks === | ||
− | |||
==== Datasets ==== | ==== Datasets ==== | ||
− | * [[Many many words]] | + | Working with Neural Networks includes collecting big amounts of textual data. |
− | + | We compared a 'regular' size with the collection of words of the Library of St-Gilles. | |
+ | * [[Many many words]] | ||
− | ===== | + | =====Public datasets===== |
+ | Most commonly used public datasets are gathered at [https://aws.amazon.com/public-datasets/ Amazon]. | ||
+ | We looked closely at the following two: | ||
* [[Common Crawl]] | * [[Common Crawl]] | ||
* [[WikiHarass]] | * [[WikiHarass]] | ||
=====Algoliterary datasets===== | =====Algoliterary datasets===== | ||
+ | Working with literary texts allows for poetic beauty in the reading/writing of the algorithms. | ||
+ | This is a small collection used for experiments. | ||
+ | * [[The data (e)speaks]] | ||
* [[Frankenstein]] | * [[Frankenstein]] | ||
− | * [[Learning from Deep Learning]] | + | * [[Learning from Deep Learning]] |
− | * [[ | + | * [[nearbySaussure]] |
− | * [[ | + | * [[astroBlackness]] |
==== From words to numbers ==== | ==== From words to numbers ==== | ||
+ | As machine learning is based on statistics and math, in order to process text, words need to be transformed to numbers. In the following section we present three technologies to do so. | ||
* [[A Bag of Words]] | * [[A Bag of Words]] | ||
* [[A One Hot Vector]] | * [[A One Hot Vector]] | ||
+ | * [[About Word embeddings|Exploring Multidimensional Landscapes: Word Embeddings]] | ||
+ | * [[Crowd Embeddings|Word Embeddings Casestudy: Crowd embeddings]] | ||
− | + | ===== Different vizualisations of word embeddings ===== | |
− | |||
− | |||
− | |||
− | ===== Different | ||
* [[Word embedding Projector]] | * [[Word embedding Projector]] | ||
− | |||
* [[The GloVe Reader]] | * [[The GloVe Reader]] | ||
− | ===== Inspecting the technique ===== | + | ===== Inspecting the technique behind word embeddings ===== |
− | * [[word2vec_basic.py | + | * [[word2vec_basic.py]] |
− | |||
* [[Reverse Algebra]] | * [[Reverse Algebra]] | ||
=== How a Machine Might Speak === | === How a Machine Might Speak === | ||
+ | If a computer model for language comprehension could speak, what would it say? | ||
* [[We Are A Sentiment Thermometer]] | * [[We Are A Sentiment Thermometer]] | ||
== Sources == | == Sources == | ||
− | * [ | + | The scripts we used and a selection of texts that kept us company. |
− | * [[Algoliterary Bibliography]] | + | * [[Algoliterary Toolkit]] |
+ | * [[Algoliterary Bibliography]] | ||
[[Category:Algoliterary-Encounters]] | [[Category:Algoliterary-Encounters]] |
Latest revision as of 13:50, 2 November 2017
About
Algoliterary works
A selection of works by members of Algolit that were previously presented in other contexts.
- i-could-have-written-that
- The Weekly Address, A model for a politician
- In the company of CluebotNG
- Oulipo recipes
Algoliterary explorations
This chapter presents part of Algolit's research over the past year.
What the Machine Writes: a closer look at the output
Two neural networks are examined more closely: what content do they produce?
How the Machine Reads: Dissecting Neural Networks
Datasets
Working with neural networks involves collecting large amounts of textual data. We compared a dataset of 'regular' size with the collection of words of the Library of St-Gilles.
Public datasets
The most commonly used public datasets are gathered at Amazon. We looked closely at the following two:
- Common Crawl
- WikiHarass
Algoliterary datasets
Working with literary texts allows for poetic beauty in the reading/writing of the algorithms. This is a small collection used for experiments.
From words to numbers
Because machine learning is based on statistics and math, words need to be transformed into numbers before text can be processed. The following section presents three techniques for doing so.
- A Bag of Words
- A One Hot Vector
- Exploring Multidimensional Landscapes: Word Embeddings
- Word Embeddings Case Study: Crowd embeddings
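
The first two techniques in the list above can be sketched in a few lines of Python. This is a minimal illustration, not the scripts used by Algolit: the two example sentences and the function names `build_vocabulary`, `one_hot` and `bag_of_words` are assumptions made up for this sketch.

```python
from collections import Counter


def build_vocabulary(sentences):
    """Give every distinct word an index, in order of first appearance."""
    vocab = {}
    for sentence in sentences:
        for word in sentence.lower().split():
            if word not in vocab:
                vocab[word] = len(vocab)
    return vocab


def one_hot(word, vocab):
    """Return a vector of zeros with a single 1 at the word's index."""
    vector = [0] * len(vocab)
    vector[vocab[word]] = 1
    return vector


def bag_of_words(sentence, vocab):
    """Count how often each vocabulary word occurs; word order is lost."""
    vector = [0] * len(vocab)
    for word, count in Counter(sentence.lower().split()).items():
        if word in vocab:  # words outside the vocabulary are ignored
            vector[vocab[word]] = count
    return vector


# Hypothetical example sentences, chosen only for illustration.
sentences = ["the machine reads the words", "the machine writes"]
vocab = build_vocabulary(sentences)

if __name__ == "__main__":
    print(vocab)
    print(one_hot("machine", vocab))
    print(bag_of_words(sentences[0], vocab))
```

Both representations are as long as the vocabulary and consist mostly of zeros. Word embeddings, the third technique, instead give each word a shorter, dense vector of learned values.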
Different visualizations of word embeddings
Inspecting the technique behind word embeddings
How a Machine Might Speak
If a computer model for language comprehension could speak, what would it say?
Sources
The scripts we used and a selection of texts that kept us company.