You shall know a word by the company it keeps

From Algolit

Revision as of 13:48, 2 November 2017 by An
Type: Algoliterary exploration
Datasets: Frankenstein, AstroBlackness, WikiHarass, Learning from Deep Learning, nearbySaussure
Technique: word embeddings
Developed by: Google TensorFlow's word2vec, Algolit

You shall know a word by the company it keeps is a series of five landscapes, each based on a different dataset. Each landscape includes the words 'human', 'learning' and 'system' in the company of different semantic clusters. One of the basic ideas behind word2vec is the belief that distances in the graph correspond to the semantic similarity of words.
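The link between distance and similarity can be sketched in a few lines of Python. This is a minimal illustration, not the code used for the landscapes: it assumes cosine similarity as the measure of closeness (a common choice for word embeddings), and the two-dimensional vectors in toy_embeddings are hypothetical values invented for the example, not trained on any of the datasets.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def nearest(word, embeddings, top_k=3):
    """Return the top_k words whose vectors are most similar to `word`."""
    target = embeddings[word]
    scored = [
        (cosine_similarity(target, vector), other)
        for other, vector in embeddings.items()
        if other != word
    ]
    scored.sort(reverse=True)  # highest similarity first
    return [other for _, other in scored[:top_k]]


# Hypothetical toy vectors; a trained model would use many more dimensions.
toy_embeddings = {
    "human": [1.0, 0.1],
    "fellow": [0.9, 0.2],
    "system": [0.1, 1.0],
    "philosophy": [0.2, 0.9],
}

print("Nearest to human:", ", ".join(nearest("human", toy_embeddings, top_k=2)))
```

In this toy space 'fellow' points in almost the same direction as 'human', so it is listed first. The 'Nearest to' lists in the settings further down this page follow the same idea, applied to vectors learned from each dataset.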

The graphs are the result of a code study based on an existing word-embedding tutorial script, word2vec_basic.py. In machine learning practice, graphs like these function as one of the validation tools for checking whether a model starts to make sense. It is interesting how this validation process is fuelled by an individual's semantic understanding of the clusters and the words.

How can we use these semantic landscapes as reading tools?

settings

graph 1: Frankenstein

Includes the book Frankenstein; or, The Modern Prometheus by Mary Shelley.

loss value: 4.45983128536
Nearest to human: fair, active, crevice, sympathizing, pretence, fellow, nightingale, productions, deaths, medicine,
Nearest to learning: steeple, clump, electricity, security, foretaste, fluctuating, finding, gazes, pour, decides,
Nearest to system: philosophy, coincidences, threatening, selfcontrol, distinctly, babe, stream, chimney, recess, accounts,

graph 2: AstroBlackness

A selection of texts from an afrofuturist perspective.

loss value: 5.8195698024
Nearest to human: black, difference, white, gender, otherwise, 3, 7, ignorance, contemporary, greater,
Nearest to learning: superior, truth, function, lens, start, dying, existence, changing, symbol, place,
Nearest to system: attempts, adapt, programmed, varieties, limit, realization, color, promise, population, voice,

graph 3: nearbySaussure

Includes three secondary books about Saussure's work in structuralist linguistics.

loss value: 5.78265964687
Nearest to human: cultural, 181, psychic, Human, rational, physical, story, chance, domain, furthermore,
Nearest to system: structure, content, community, System, term, center, study, plurality, form, value,

The word 'learning' did not appear in the list of the 5000 most common words in this dataset, so it has no nearest neighbours here.
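Why a word can drop out of a landscape can be sketched as follows. This is a simplified illustration, assuming the vocabulary is built by keeping only the most frequent words and mapping every other word to a placeholder token UNK (a token that also shows up in some of the results on this page). The function names and the tiny token list are hypothetical, and the cut-off is set to 3 instead of 5000 to keep the example readable.

```python
from collections import Counter


def build_vocabulary(words, vocabulary_size):
    """Map the most frequent words to integer ids; id 0 is reserved for UNK."""
    most_common = Counter(words).most_common(vocabulary_size - 1)
    vocabulary = {"UNK": 0}
    for word, _count in most_common:
        vocabulary[word] = len(vocabulary)
    return vocabulary


def encode(words, vocabulary):
    """Replace each word by its id; words outside the vocabulary become UNK."""
    return [vocabulary.get(word, vocabulary["UNK"]) for word in words]


# Hypothetical miniature corpus: 'learning' is too rare to make the cut.
tokens = "the sign the system the sign learning".split()
vocab = build_vocabulary(tokens, vocabulary_size=3)

print(vocab)
print(encode(["the", "learning"], vocab))
```

A word that falls below the cut-off never receives a vector of its own, so no neighbours can be computed for it.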

graph 4: Learning from Deep Learning

Includes seven textbooks on the topic of deep learning.

loss value: 6.65393904257
Nearest to human: healthy, given, modeling, poorly, inspired, criterion, specifically, Accuracy, surface, predicting,
Nearest to learning: Learning, pretrained, sparse, neat, 21, inference, tuning, adagrad, tested, Use,
Nearest to system: UNK, roi, dataframe, code, win, page, approach, diagonal, cae, letter,

graph 5: WikiHarass

Includes examples of harassment in Talk page comments on Wikipedia.

loss value: 3.93717244664
Nearest to human: jacob, Persianyes, phrase, track, star, attack, puts, jews, helps, plastic,
Nearest to learning: sound, people, getting, writing, thinking, talking, thoughts, modify, less, prince,
Nearest to system: armenian, UNK, georgia, george, n, developed, its, each, daniele, claim,