Gorwa et al. (2020) – Algorithmic content moderation: technical and political challenges in the automation of platform governance Flashcards

1
Q

Abstract

A

As government pressure on major technology companies builds, both firms and legislators are
searching for technical solutions to difficult platform governance puzzles such as hate speech and misinformation. Automated hash-matching and predictive machine learning tools – what we
define here as algorithmic moderation systems – are increasingly being deployed to conduct
content moderation at scale by major platforms for user-generated content such as Facebook,
YouTube and Twitter. This article provides an accessible technical primer on how algorithmic
moderation works; examines some of the existing automated tools used by major platforms to
handle copyright infringement, terrorism and toxic speech; and identifies key political and ethical
issues for these systems as the reliance on them grows. Recent events suggest that algorithmic
moderation has become necessary to manage growing public expectations for increased platform
responsibility, safety and security on the global stage; however, as we demonstrate, these
systems remain opaque, unaccountable and poorly understood. Despite the potential promise of
algorithms or ‘AI’, we show that even ‘well optimized’ moderation systems could exacerbate,
rather than relieve, many existing problems with content policy as enacted by platforms for three
main reasons: automated moderation threatens to (a) further increase opacity, making a
famously non-transparent set of practices even more difficult to understand or audit, (b) further
complicate outstanding issues of fairness and justice in large-scale sociotechnical systems and
(c) re-obscure the fundamentally political nature of speech decisions being executed at scale.

2
Q

Turning to AI for moderation at scale

A

Automated moderation systems have become necessary to manage growing public
expectations for increased platform responsibility, safety and security
- But these systems remain opaque, unaccountable and poorly understood
- The goal of this article is to provide an accessible primer on how automated
  moderation works

3
Q

What is algorithmic moderation?

A
  • Content moderation → governance mechanisms that structure participation in a
    community to facilitate cooperation and prevent abuse
  • In this understanding, moderation includes not only the administrators or moderators
    with the power to remove content or exclude users, but also the design decisions that
    organize how the members of a community engage with one another
  • Algorithmic commercial content moderation (algorithmic moderation) → systems that
    classify user-generated content based on either matching or prediction, leading to a
    decision and governance outcome (e.g., removal, geo-blocking, account takedown)
  • Hard moderation systems → systems that make decisions about content and accounts
  • The focus of this paper lies on hard moderation systems
  • Soft moderation systems → recommender systems, norms, design decisions,
    architectures, etc.
4
Q

A primer on the main technologies involved in algorithmic moderation

A
  • Algorithmic content moderation involves a range of techniques from statistics and
    computer science, which vary in complexity and effectiveness
  • They all aim to identify, match, predict, or classify some piece of content (text, audio,
    image, video, etc.) on the basis of its exact properties or general features
  • There are some major differences in the techniques used depending on the kind of
    matching or classification required, and the types of data considered:
  • A distinction is drawn between systems that aim to match content ('is this file
    depicting the same image as that file?') and systems that aim to classify or predict
    content as belonging to one of several categories ('is this file spam? Is this text
    hate speech?')
5
Q

Hashing:

A

The process of transforming a known example of a piece of content into a ‘hash’ – a string of data meant to uniquely identify the underlying content
- Hashes are useful because they are easy to compute and much smaller than the underlying content, so any given hash can quickly be compared against a large table of existing hashes to see if it matches any of them
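A minimal sketch of this lookup idea in Python, using SHA-256 purely as a stand-in (real moderation systems typically rely on perceptual hashes, covered in a later card); the hash table contents are invented for illustration:

```python
import hashlib

# Hypothetical table of hashes for content already judged to violate policy.
# The single entry is the SHA-256 digest of the empty byte string.
known_hashes = {
    "e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855",
}

def hash_bytes(content: bytes) -> str:
    """Return a fixed-length hex digest identifying this exact byte string."""
    return hashlib.sha256(content).hexdigest()

def is_known_violation(upload: bytes) -> bool:
    """Compare the upload's hash against the table; a set lookup is O(1)."""
    return hash_bytes(upload) in known_hashes

print(is_known_violation(b""))          # True: matches the hash in the table
print(is_known_violation(b"new post"))  # False: unseen content does not match
```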

6
Q
  • Secure cryptographic hash functions
A

→ aim to create hashes that appear to be random, giving away no clues about the content from which they are derived
- They are useful for checking the integrity of a piece of data or code to make sure that no unauthorized modifications have been made
- For example, if a software vendor publishes a hash of the software’s installation file, and the user downloads the software from somewhere where it may have been modified, the user can check the integrity by computing the hash locally and comparing it to the vendor’s
- Cryptographic hash functions are not useful for content moderation, because they are sensitive to any changes in the underlying content, such that a minor modification (changing the color of one pixel in an image) will result in a completely different hash value
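A short sketch of both points above: the vendor-style integrity check, and the brittleness (avalanche effect) that makes cryptographic hashes unsuitable for matching modified content. The byte strings are invented for illustration:

```python
import hashlib

installer = b"example installer bytes"
vendor_published_hash = hashlib.sha256(installer).hexdigest()

downloaded = b"example installer bytes"   # unmodified copy
tampered   = b"example installer bytez"   # a single byte changed

# Integrity check: the unmodified download matches the vendor's published hash.
print(hashlib.sha256(downloaded).hexdigest() == vendor_published_hash)  # True

# A one-byte change yields a completely different, unrelated digest.
print(hashlib.sha256(tampered).hexdigest() == vendor_published_hash)    # False
print(hashlib.sha256(tampered).hexdigest()[:16], vendor_published_hash[:16])
```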

7
Q

Perceptual hashing

A

Involves fingerprinting certain perceptually salient features of content, such as corners in images or hertz-frequency over time in audio

  • This type of hashing can be more robust to changes that are irrelevant to how
    humans perceive the content
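A sketch of perceptual matching using the third-party Pillow and ImageHash packages (an assumption; the paper does not prescribe a library). Unlike a cryptographic hash, the perceptual hash of a resized or re-encoded copy stays close to the original, so a small Hamming distance counts as a match; the threshold below is a hypothetical tuning choice:

```python
from PIL import Image
import imagehash

HAMMING_THRESHOLD = 8  # hypothetical tolerance; platforms tune this trade-off

def matches_known_image(upload_path: str, known_hashes: list) -> bool:
    """Return True if the upload is perceptually close to any known fingerprint."""
    upload_hash = imagehash.phash(Image.open(upload_path))
    # Subtracting two ImageHash objects gives their Hamming distance.
    return any(upload_hash - known < HAMMING_THRESHOLD for known in known_hashes)

# Usage sketch (file names are hypothetical):
# known = [imagehash.phash(Image.open("flagged_original.jpg"))]
# matches_known_image("re-encoded_copy.jpg", known)  # likely True despite re-encoding
```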
8
Q

Classification

A

assesses newly uploaded content that has no corresponding previous version in a database

  • The aim is to put new content into one of a number of categories
9
Q
  • Modern classification tools
A

Machine learning (the automatic induction of statistical patterns from data)

  • One of the main branches of machine learning is supervised learning: models are
    trained to predict outcomes based on labelled instances (offensive/not offensive)
  • Content classification → based on manually coded features

It is hard to capture the context of a text or word when using this type of classification
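A toy supervised-learning sketch, not the paper's model: a classifier is trained on manually labelled instances and then predicts a label for new text. The tiny dataset and the offensive/not-offensive labels are invented for illustration, using scikit-learn:

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Manually labelled training instances (hypothetical examples).
texts  = ["you are wonderful", "have a nice day", "you are an idiot", "go away idiot"]
labels = ["not_offensive", "not_offensive", "offensive", "offensive"]

# Induce statistical patterns from the labelled data, then classify new content.
model = make_pipeline(CountVectorizer(), LogisticRegression())
model.fit(texts, labels)

print(model.predict(["what an idiot"]))       # likely ['offensive']
print(model.predict(["what a nice person"]))  # likely ['not_offensive']
```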

10
Q

Bag of words

A

Treats all of the words in a sentence as features, ignoring order and grammar
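A minimal sketch of a bag-of-words representation with scikit-learn: every word becomes a feature count, and word order and grammar are discarded, which is one reason context is lost:

```python
from sklearn.feature_extraction.text import CountVectorizer

vectorizer = CountVectorizer()
vectors = vectorizer.fit_transform(["the dog bit the man", "the man bit the dog"])

print(vectorizer.get_feature_names_out())  # ['bit' 'dog' 'man' 'the']
print(vectors.toarray())                   # identical rows: word order is ignored
# [[1 1 1 2]
#  [1 1 1 2]]
```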

11
Q

Word embeddings

A

Represent the position of a word in relation to all the other words that usually appear around it

  • Semantically similar words therefore have similar positions
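A sketch of the intuition only: each word is represented as a dense vector positioned by the contexts it appears in, and similar words end up close together. The three-dimensional vectors below are invented for illustration; real embeddings (e.g. word2vec or GloVe) have hundreds of dimensions learned from large corpora:

```python
import numpy as np

embeddings = {
    "idiot":  np.array([0.9, 0.1, 0.0]),
    "moron":  np.array([0.8, 0.2, 0.1]),   # near "idiot": appears in similar contexts
    "banana": np.array([0.0, 0.9, 0.4]),   # far away: unrelated contexts
}

def cosine_similarity(a, b):
    """Standard measure of closeness between two embedding vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

print(cosine_similarity(embeddings["idiot"], embeddings["moron"]))   # high
print(cosine_similarity(embeddings["idiot"], embeddings["banana"]))  # low
```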
12
Q

Matching and classification have some important differences:

A
  • Matching requires a manual process of collating and curating individual examples of
    the content to be matched (particular terrorist images)
  • Classification involves inducing generalizations about features of many examples from a given category into which unknown examples may be classified (terrorist images in general)
13
Q

An algorithmic moderation typology

A
  • The specific fashion in which these matching or predictive systems are deployed depends
    greatly on a variety of factors, including:
      • The type of community
      • The type of content it must deal with
      • The expectations placed upon the platform by various governance stakeholders
  • Automated tools are used by platforms to police content across a host of issue areas at
    scale, including terrorism, graphic violence, toxic speech (hate speech, harassment and
    bullying), sexual content, child abuse, and spam or fake account detection
14
Q

Once content has been identified as a match, or is predicted to fall into a category of
content that violates a platform’s rules, there are several possible outcomes:

A
  1. Flagging: content is placed either in a regular queue, indistinguishable from user-flagged
     content, or in a priority queue where it will be seen faster, or by specific ‘expert’
     moderators
  2. Deletion: content is removed outright or prevented from being uploaded in the first
     place
  • Fully automated decision-making systems that do not include a human in the loop are
    dangerous
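A schematic sketch, not the paper's architecture, of how a platform might route matches and predictions into the outcomes listed above. The thresholds, queue names and review step are assumptions for illustration:

```python
from dataclasses import dataclass

DELETE_THRESHOLD = 0.98    # hypothetical: delete or block only on near-certainty
PRIORITY_THRESHOLD = 0.80  # hypothetical: send high scores to 'expert' reviewers

@dataclass
class Decision:
    action: str        # "delete", "priority_queue", "regular_queue", or "allow"
    needs_human: bool  # whether a human moderator remains in the loop

def route(hash_match: bool, violation_score: float) -> Decision:
    """Map a match or prediction onto a governance outcome."""
    if hash_match or violation_score >= DELETE_THRESHOLD:
        return Decision("delete", needs_human=False)   # removed or blocked at upload
    if violation_score >= PRIORITY_THRESHOLD:
        return Decision("priority_queue", needs_human=True)
    if violation_score > 0.5:
        return Decision("regular_queue", needs_human=True)
    return Decision("allow", needs_human=False)

print(route(hash_match=True, violation_score=0.0))    # delete, no human in the loop
print(route(hash_match=False, violation_score=0.85))  # priority queue for review
```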
15
Q

Copyright

A

Content ID (YouTube) is unique in that it allows copyright holders to upload material that will be (a) searched against existing content on YouTube and (b) added to a hash database used to detect new uploads of that content

  • In the copyright context, the goal of deploying automatic systems is not only to find identical files but also to identify different instances and performances of cultural
    works that may be protected by copyright
  • A key concern in the deployment of automated moderation technologies in the context of
    copyright is systematic over-blocking
16
Q

Terrorism

A

• EU Code of Conduct on Countering Illegal Hate Speech Online: commits the firms to
  a wide-ranging set of principles, including:
      • Takedown of hateful speech within 24 hours under platform terms of service
      • Intensification of cooperation between themselves and other platforms and social
        media companies to enhance best-practice sharing
• Each firm applies its own policies and definitions of terrorist content when deciding
  whether to remove content when a match to a shared hash is found
17
Q

Toxic speech

A
  • Toxicity of comments / conversational health: umbrella terms for various concepts,
    including hate speech, offence, profanity, personal attacks, slights, defamatory claims,
    bullying and harassment
  • By training machine learning algorithms on large corpora of texts manually labelled for
    toxicity, developers aim to create automatic classification systems to flag ‘toxic’ comments
  • Perspective → an application programming interface (API) with the stated aim of making
    it easier to host better conversations
  • A platform could use Perspective to receive a score which predicts the impact a comment
    might have on a conversation
  • The clearest problem with moderation of toxic speech is that language is incredibly
    complicated, personal and context-dependent: words that are widely accepted to be
    slurs may even be used by members of a group to reclaim those terms
  • Insufficient context awareness can lead crude classifiers to flag content for
    adjudication by moderators who usually do not have the context required to tell
    whether the speaker is a member of the group that the ‘hate speech’ is being directed
    against
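A sketch of how a platform might query Perspective for a toxicity score, based on its publicly documented REST interface (the endpoint version and response shape may change, so treat the details as assumptions and consult the current documentation); the API key and the 0.8 flagging threshold are also assumptions for illustration:

```python
import requests

PERSPECTIVE_URL = "https://commentanalyzer.googleapis.com/v1alpha1/comments:analyze"

def toxicity_score(text: str, api_key: str) -> float:
    """Request a TOXICITY score (a probability-like value between 0 and 1)."""
    payload = {
        "comment": {"text": text},
        "requestedAttributes": {"TOXICITY": {}},
    }
    resp = requests.post(PERSPECTIVE_URL, params={"key": api_key}, json=payload)
    resp.raise_for_status()
    return resp.json()["attributeScores"]["TOXICITY"]["summaryScore"]["value"]

# Usage sketch: flag the comment for human review if the predicted toxicity is high.
# if toxicity_score(comment_text, API_KEY) > 0.8:
#     send_to_moderation_queue(comment_text)
```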
18
Q

Three political issues: transparency, fairness and depoliticization

A
  • There is outsized concern about over-blocking → it is very difficult for predictive classifiers
    to make difficult, contextual decisions on slippery concepts like hate speech, and
    automated systems at scale are likely to make lots of incorrect decisions on a daily basis
  • The use of automated techniques can potentially help firms remove illegal content more
    quickly and effectively
19
Q

Decisional transparency

A

A common critique of automated decision making is the potential lack of transparency,
especially when claims of commercial intellectual property are used to deflect
responsibility

  • In content moderation it will become more difficult to decipher the dynamics of takedowns (and potential human rights harms) around some policy issues when the initial flagging decisions were made by automated systems
  • From a user perspective, there is little transparency around whether (or to what extent) an
    automated decision factored into a takedown
20
Q

Justice

A
  • Content classifiers in general, whether used for recommendation, ranking, or blocking,
    may be more or less favorable to content associated with gender, race, and other
    protected categories
  • Hate speech classifiers designed to detect violations of a platform’s guidelines could be
    disproportionally flagging language used by a certain social group, thus making that
    group’s expression more likely to be removed.
  • Fairness critiques often miss broader structural issues, and risk being blind to wider
    patterns of systemic harm
21
Q

De-politicization

A

Algorithmic moderation has already introduced a level of obscurity and complexity into the
inner workings of content decisions made around issues of economic or political
importance, such as copyright and terrorism
• This elides the political question of who exactly is considered a terrorist group
• As algorithmic moderation becomes more seamlessly integrated into users’ day-to-day
  online experience, human rights advocates and researchers must continue to challenge
  both the discourse and reality of the use of automated decision-making in moderation