Facebook:
Facebook officially employs a multi-pronged strategy to combat hateful content, including AI-based machine learning, user reporting, internal review, and third-party fact-checking.
“The company’s algorithms are designed to identify and remove harmful content, while its automated system reviews user-reported posts. When the system fails to draw a conclusion, the post is sent to human content moderators, many of whom face substantial language barriers and lack the bandwidth to review a high volume of content”[1].
Through this process, Facebook takes action only on posts it judges to have violated its Community Standards. Reports are usually kept confidential, and the reported account cannot see who reported it. Reporting alone does not guarantee that the reported post, comment, or profile will be removed or otherwise acted upon, but it does open the possibility of appealing the decision. An appeal can be registered[2] with the Oversight Board through a review process in two cases: a decision about the user's own content, or a decision about another person's content that the user has reported.
As mentioned earlier, Facebook now uses an artificial-intelligence (AI) based system to identify hateful content. The most important components of that system are classifiers: screening algorithms built from a large volume of posts that human reviewers have manually labeled against a set of rules. Engineers then use these labeled examples to train models that estimate the probability that a post violates a given rule. Whenever a country appears on Facebook's list of at-risk countries, the company prepares a set of classifiers specific to that country. When deciding where to build classifiers, Facebook's policy is to focus on ongoing violence rather than temporary flare-ups.
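The internals of Facebook's classifiers are not public, but the workflow described above, where human-labeled examples are used to train a model that outputs a violation probability, can be illustrated with a minimal sketch. The labeled posts, the probability thresholds, and the library choices below are assumptions for illustration only, not Facebook's actual pipeline.

```python
# Minimal sketch of a rule-violation classifier trained on human-labeled posts.
# The examples, thresholds, and model choice are illustrative assumptions.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Posts labeled by human reviewers: 1 = violates the rule, 0 = does not.
posts = [
    "example post containing targeted slurs",      # hypothetical labeled examples
    "photo caption about a family holiday",
    "call to attack members of a community",
    "announcement of a local charity event",
]
labels = [1, 0, 1, 0]

# Train a simple text model that outputs a violation probability.
classifier = make_pipeline(TfidfVectorizer(), LogisticRegression())
classifier.fit(posts, labels)

new_post = "another post reported by users"
violation_probability = classifier.predict_proba([new_post])[0][1]

# Posts above a review threshold are queued for action or human review;
# the threshold values here are purely illustrative.
if violation_probability > 0.8:
    print("flag for removal / priority review")
elif violation_probability > 0.5:
    print("send to human content moderators")
else:
    print("no action")
```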
This AI-based moderation strategy has its limitations, however. One of the Facebook Papers revealed[3] that Facebook's algorithms reportedly catch less than 5 percent of online hate speech. One of the videos that sparked Bangladesh's recent violence stayed online for six days and was shared over 56,000 times before Facebook took it down. Facebook is also trying to adjust on an ongoing basis, building up its expertise in conflict dynamics in different countries and arranging its mitigations accordingly. In Bangladesh[4], the platform has removed accounts associated with hacking, spam, and election-related misinformation; more recently, it appointed the country's first public policy manager and expanded its Bengali-speaking staff. Another report[5] published by Facebook shows that it has developed classifiers in Bengali to identify hateful content. This is particularly important given that a lack of local knowledge and local-language experts continues to create problems for content moderation at Facebook.
Sarah Myers West (2018) conducted an interesting study[6] of Facebook showing how much, and what types of, content is flagged, removed, or otherwise acted upon on the basis of user reporting. A total of 311 (n=311) comments, posts, or images featuring what one of the two coders considered hate speech under Facebook's definition were reported to the company as violations of its community standards. Of those, 149 were removed and 162 were not. The majority of comments or posts reported as hate speech fell under the sub-category of gender/sexual orientation, followed closely by race/ethnicity and religion. Race/ethnicity and religion accounted for the highest percentage of removed content, while reported hate speech targeting people based on gender/sexual orientation had the highest percentage of posts that went unremoved.
YouTube:
On YouTube, once content is reported it goes for internal review. Reported content is reviewed along these guidelines:
- Content that violates YouTube's Community Guidelines is removed from the platform.
- Content that may not be appropriate for younger audiences may be age-restricted.
YouTube has a machine-learning system that helps it identify and remove spam, as well as detect re-uploads of content that has already been reviewed and flagged as violating the Community Guidelines. This is YouTube's first line of defense. Content other than spam that is flagged by the system is reviewed by human reviewers before any decision is made. Reviewer teams assess whether the content in question violates YouTube's policies[7] while protecting content that has an educational, documentary, scientific, or artistic purpose; they remove violating content and age-restrict content that may not be appropriate for all audiences. These human decisions are then used to train and improve the accuracy of the automated system at a much larger scale.
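YouTube does not publish the details of its re-upload detection, which relies on robust content fingerprinting. The sketch below illustrates only the general idea, catching an exact re-upload by remembering fingerprints of removed videos; the byte-level hashing and function names are simplifying assumptions, not YouTube's actual method.

```python
# Sketch of re-upload detection against previously removed content.
# Real systems use perceptual fingerprints that survive re-encoding;
# exact byte hashing here is an illustrative simplification.
import hashlib

# Fingerprints of videos already reviewed and removed for guideline violations.
removed_fingerprints = set()

def fingerprint(video_bytes: bytes) -> str:
    """Return a fingerprint for a video file (here, a simple SHA-256 digest)."""
    return hashlib.sha256(video_bytes).hexdigest()

def record_removal(video_bytes: bytes) -> None:
    """Remember removed content so future re-uploads are caught automatically."""
    removed_fingerprints.add(fingerprint(video_bytes))

def handle_upload(video_bytes: bytes) -> str:
    """First line of defense: block known re-uploads, else pass on for review."""
    if fingerprint(video_bytes) in removed_fingerprints:
        return "blocked: matches previously removed content"
    return "accepted: eligible for flagging and human review"

# Example: a removed video is re-uploaded and caught on the second attempt.
clip = b"...video bytes..."
print(handle_upload(clip))   # accepted
record_removal(clip)
print(handle_upload(clip))   # blocked
```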
YouTube also operates in more than 100 countries around the world, so it has processes[8] in place to review content and act appropriately on valid requests under applicable local laws. Google says[9] that it receives content-removal requests from various levels of government, including court orders, written requests from national and local government agencies, and requests from law enforcement agencies. Google assesses the legitimacy and completeness of each request, expecting it to be provided in writing, to be as specific as possible, and to explain clearly how the content is illegal.
Normally, if reviewers conclude that a particular piece of content violates the Community Guidelines, YouTube removes the content and sends a notice to the creator. If it is the creator's first violation, in most cases they receive a warning with no penalty to the channel. If violations continue after that warning, YouTube issues a Community Guidelines strike to the channel, imposing a one-week restriction on uploading videos, live streaming, and similar activities. Channels that receive three strikes within a 90-day period are terminated. In addition to removing content that violates its policies, YouTube also works to reduce recommendations of what it calls borderline content, and its advertiser-friendly guidelines prohibit ads from running on videos that include hateful content.
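The warning-then-strikes escalation described above can be summarized as a small piece of state-machine logic. The sketch below assumes hypothetical class and field names and encodes only the publicly stated rules (one warning, a one-week restriction per strike, termination after three strikes within 90 days); it is not YouTube's implementation.

```python
# Sketch of the warning/strike escalation: one warning, then strikes,
# with termination after three strikes inside a 90-day window.
from datetime import datetime, timedelta
from typing import List

class Channel:
    def __init__(self, name: str):
        self.name = name
        self.warned = False
        self.strikes: List[datetime] = []   # timestamps of strikes still in the window
        self.terminated = False

    def register_violation(self, when: datetime) -> str:
        if self.terminated:
            return "channel already terminated"
        if not self.warned:
            self.warned = True
            return "warning issued, no penalty to the channel"
        # Keep only strikes issued within the last 90 days, then add the new one.
        window_start = when - timedelta(days=90)
        self.strikes = [s for s in self.strikes if s >= window_start]
        self.strikes.append(when)
        if len(self.strikes) >= 3:
            self.terminated = True
            return "third strike within 90 days: channel terminated"
        return "strike issued: uploads and live streams restricted for one week"

# Example escalation over repeated violations.
channel = Channel("example-channel")
start = datetime(2022, 1, 1)
for offset in [0, 10, 30, 50]:
    print(channel.register_violation(start + timedelta(days=offset)))
```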
As mentioned earlier, YouTube's notion of borderline content gives it flexibility in responding to material that falls into gray areas, where opinions both for and against banning it from the platform are strong. YouTube penalizes borderline content through demonetization. However, a growing body of evidence suggests that this approach has not prevented the continued growth of borderline content; rather, confusion over what qualifies for demonetization has only increased since the policy update. To address this, YouTube has employed two strategies: expanding its review teams' linguistic and subject-matter expertise, and strengthening the appeal process so that reviewers can overturn content-removal decisions made by the automated AI-based system.
Instagram:
Once a post, comment, or profile is reported, it goes for review. Instagram can take action to hide or remove the content if it finds that the content goes against its Community Guidelines, which define what is and isn't allowed on Instagram. Anyone who has reported something on Instagram can check the status of the report from their profile (at the bottom right or bottom left of the screen). From there, you can:
- Tap on available reports to learn more about Instagram's policies.
- See when Instagram takes action on your report and the decision it made.
- Request a review of the decision (for some types of content, a review can't be requested).
Requesting a review of a decision, or appealing, is possible only for certain types of content. If Instagram finds that the content doesn't go against the Community Guidelines, it lets the user who reported it know. If the user disagrees, Instagram offers another chance at review: when a review is requested, Instagram revisits the content and reviews it once again.
Even if a reported piece of content or profile is not removed, Instagram uses the report data to order content in the feed and to shape recommendations. Posts similar to those you have reported are pushed toward the bottom of your feed so that you see them less often, and recommendation surfaces such as Suggested Posts and Explore follow the same logic, leaving such posts out of their lists.
While Instagram is not criticized as much as YouTube or Facebook, the platform certainly has its own share of problems. ‘Hateful content and misinformation thrive on the platform as much as any other social network, and certain mechanisms in the app (like its suggested follows feature) have been shown to push users toward extreme viewpoints for topics like anti-vaccination’[10]. To address this issue, Instagram looks[11] at “seed accounts”, which are accounts a user has interacted with in the past by liking or saving their content. It identifies accounts similar to these and, from them, selects 500 candidate pieces of content. These candidates are filtered to remove spam, misinformation, and “likely policy-violating content,” and the remaining posts are ranked by how likely the user is to interact with each one. Finally, the top 25 posts are sent to the first page of the user’s Explore tab.
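The Explore pipeline just described (seed accounts, similar accounts, 500 candidates, a policy filter, ranking, top 25) can be sketched as a short function. The similarity lookup, the engagement score, and every field name below are hypothetical stand-ins for models Instagram has not published, so this is only a shape of the pipeline under stated assumptions.

```python
# Sketch of the Explore candidate pipeline: seed accounts -> similar accounts
# -> 500 candidates -> filter -> rank -> top 25. All names and scores are
# illustrative assumptions, not Instagram's implementation.
from typing import Dict, List, Set

def find_similar_accounts(seeds: Set[str], account_topics: Dict[str, str]) -> Set[str]:
    """Toy similarity: accounts sharing a topic with any seed account."""
    seed_topics = {account_topics[a] for a in seeds if a in account_topics}
    return {a for a, topic in account_topics.items() if topic in seed_topics}

def predict_interaction(user: Dict, post: Dict) -> float:
    """Toy engagement score: prior likes given to the author plus post popularity."""
    return user["likes_by_author"].get(post["author"], 0) + 0.001 * post["like_count"]

def build_explore_page(user: Dict, all_posts: List[Dict],
                       account_topics: Dict[str, str]) -> List[Dict]:
    # 1. Seed accounts: accounts the user has liked or saved content from.
    seeds = set(user["liked_accounts"]) | set(user["saved_accounts"])
    # 2. Accounts similar to the seeds.
    similar = find_similar_accounts(seeds, account_topics)
    # 3. Up to 500 candidate posts drawn from those accounts.
    candidates = [p for p in all_posts if p["author"] in similar][:500]
    # 4. Filter out spam, misinformation, and likely policy-violating content.
    eligible = [p for p in candidates
                if not (p["is_spam"] or p["is_misinfo"] or p["likely_violating"])]
    # 5. Rank by predicted interaction and keep the top 25 for the Explore tab.
    eligible.sort(key=lambda p: predict_interaction(user, p), reverse=True)
    return eligible[:25]
```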
On the hate-speech front, Instagram works to show potential hate speech lower down the feed. In effect, this is a balance between removing a piece of content and keeping it with lower priority and visibility. In many instances, it was found[12] that Instagram does not automatically remove posts that independent fact-checkers have determined to be misinformation, preferring to de-prioritise them instead. Instagram also takes your reporting history into account when deciding which posts to show you: ‘While Instagram’s algorithm prioritizes posts based on how likely it thinks you’ll interact with them, it will now also de-prioritise posts it thinks you’re likely to report, as determined by your reporting history’[13].
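This trade-off, demoting rather than deleting, amounts to adjusting a ranking score with a penalty for content the user is predicted to report. The sketch below makes that idea concrete; the scoring formula, the penalty weight, and the report-likelihood estimate are all illustrative assumptions rather than Instagram's published ranking model.

```python
# Sketch of feed de-prioritisation based on a user's reporting history.
# The formula and weights are illustrative assumptions.
from typing import Dict, List

def feed_score(user: Dict, post: Dict) -> float:
    """Combine predicted interaction with a penalty for predicted reporting."""
    p_interact = post["predicted_interaction"]          # e.g. from an engagement model
    # Toy report-likelihood estimate: share of this post's topic among the
    # user's past reports.
    reports = user["reported_topic_counts"]
    total_reports = sum(reports.values()) or 1
    p_report = reports.get(post["topic"], 0) / total_reports
    return p_interact - 2.0 * p_report                   # penalty weight is arbitrary

def rank_feed(user: Dict, posts: List[Dict]) -> List[Dict]:
    """Posts the user is likely to report sink toward the bottom of the feed."""
    return sorted(posts, key=lambda p: feed_score(user, p), reverse=True)
```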
In a report[14], Instagram announced that between July and September of the preceding year it took action on 6.5 million pieces of hate speech on Instagram, including in DMs, 95% of which it found before anyone reported it.
[1] https://foreignpolicy.com/2022/02/04/facebook-tech-moderation-violence-bangladesh-religion/
[2] https://www.facebook.com/help/346366453115924/?helpref=related_articles
[3] https://foreignpolicy.com/2022/02/04/facebook-tech-moderation-violence-bangladesh-religion/
[4] https://foreignpolicy.com/2022/02/04/facebook-tech-moderation-violence-bangladesh-religion/
[5] https://twitter.com/thecaravanindia/status/1535955079899009026?lang=en
[6] https://firstmonday.org/article/view/10288/8327
[7] https://www.youtube.com/intl/ALL_ca/howyoutubeworks/policies/overview/
[8] https://www.youtube.com/howyoutubeworks/policies/legal-removals/
[9] https://transparencyreport.google.com/government-removals/overview?hl=en
[10] https://www.theverge.com/2019/11/25/20977734/instagram-ai-algorithm-explore-tab-machine-learning-method
[11] https://mashable.com/article/instagram-hate-speech-bullying-feed
[12] https://mashable.com/article/instagram-hate-speech-bullying-feed
[13] https://mashable.com/article/instagram-hate-speech-bullying-feed
[14] https://about.instagram.com/blog/announcements/an-update-on-our-work-to-tackle-abuse-on-instagram