You shall know a word by the company it keeps

From Algolit

Revision as of 14:42, 25 October 2017 by Manetta


Type: Algoliterary exploration
Datasets: Frankenstein, AnarchFem, WikiHarass, Learning from Deep Learning, Tristes Tropiques
Technique: word-embeddings
Developed by: Google TensorFlow's word2vec, Algolit

You shall know a word by the company it keeps is a series of five landscapes, each based on a different dataset. Each landscape shows the words 'collective', 'being' and 'social' in the company of different semantic clusters. The belief that distances in the graph correspond to the semantic similarity of words was one of the basic ideas behind word2vec.
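The "company" a word keeps can be made concrete as the words that appear within a small window around it. The following is a minimal sketch of that idea, not the code used for the landscapes: the sentence, the window size and the function name context_pairs are illustrative assumptions.

```python
# Sketch: collect (target, context) pairs from a list of tokens.
# The sentence and the window size below are illustrative assumptions.

def context_pairs(tokens, window=2):
    """Pair every token with its neighbours up to `window`
    positions to the left and to the right."""
    pairs = []
    for i, target in enumerate(tokens):
        start = max(0, i - window)
        end = min(len(tokens), i + window + 1)
        for j in range(start, end):
            if j != i:  # a word is not its own context
                pairs.append((target, tokens[j]))
    return pairs


tokens = "the collective being of a social body".split()
pairs = context_pairs(tokens, window=2)

# The company kept by the word 'social' in this toy sentence
company = [context for target, context in pairs if target == "social"]
print(company)
```

A word-embedding model is trained on many such pairs, so that words sharing similar company tend to end up close to each other.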

The graphs are the result of a code study based on an existing word-embedding tutorial script, word2vec_basic.py. In machine learning practice, graphs like these function as one of the validation tools for checking whether a model is starting to make sense. It is interesting how this validation process is fuelled by an individual's semantic understanding of the clusters and the words.
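One common way to carry out this kind of validation by hand is to list the nearest neighbours of a word and judge whether they make sense. The sketch below is not taken from word2vec_basic.py: the three-dimensional vectors are invented for illustration (trained embeddings usually have many more dimensions), and the helper names cosine_similarity and nearest are hypothetical.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: close to 1.0 means
    the vectors point in nearly the same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def nearest(word, embeddings, k=2):
    """Return the k words whose vectors are most similar to `word`."""
    query = embeddings[word]
    scored = [
        (cosine_similarity(query, vector), other)
        for other, vector in embeddings.items()
        if other != word
    ]
    scored.sort(reverse=True)  # highest similarity first
    return [other for _, other in scored[:k]]

# Hypothetical toy vectors, invented for this example only
embeddings = {
    "collective": [0.9, 0.1, 0.0],
    "social": [0.8, 0.2, 0.1],
    "being": [0.1, 0.9, 0.2],
    "creature": [0.2, 0.8, 0.3],
    "network": [0.0, 0.1, 0.9],
}

for word in ("collective", "being", "social"):
    print(word, "->", nearest(word, embeddings))
```

Whether a list of neighbours "makes sense" remains a judgement made by the reader, which is exactly the individual semantic understanding mentioned above.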

How can we use these semantic landscapes as reading tools?

graph 1: Frankenstein dataset

Includes the book Frankenstein; or, The Modern Prometheus by Mary Shelley.

graph 2: Anarch Feminist dataset

Includes 3 books (...)

graph 3: Claude Lévi-Strauss dataset

Includes the book Tristes Tropiques by Claude Lévi-Strauss.

graph 4: Deep Learning textbooks dataset

Includes the books (...).

graph 5: Harassing comments dataset

Includes examples of harassment on Talk page comments from Wikipedia.
