What do you do with the massive amounts of data that Internet sites are gathering?
Moderating is David Weinberger, from the Berkman Center for Internet and Society. The panelists include:
- Brian Simpson, programmer and admin at Reddit
- Kevin Allocca, trends manager at YouTube
- Alan Schaaf, founder and CEO of Imgur
How big is big? May is YouTube’s 7th birthday. They get 4 billion views per day. YouTube had 1 trillion views in 2011. Over one hour of video is uploaded on YouTube every second. YouTube is localized in 43 countries and in 60 languages.
On the 30th of April, Reddit had around 5.6 million votes on links, 5.6 million votes on comments, 11 million total votes, 69 thousand links submitted — in one day!
Imgur now has half a million images uploaded per day and 100TB of data transferred per day, from 1 billion images served per day. They have 40 million unique visitors per month, 2 billion page views per month, and 11 pages per visit. Alan started imgur as a personal project to solve Reddit’s image posting problem. He released it on Reddit, and it grew alongside Reddit to reach its 2 billion page views per month.
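To put the daily totals above on a more intuitive scale, here is a minimal sketch that converts them into average per-second rates. The figures are the ones quoted by the panelists; the variable names are made up for illustration, and treating 1 TB as 10**12 bytes is an assumption.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def per_second(daily_total: float) -> float:
    """Convert a per-day total into an average per-second rate."""
    return daily_total / SECONDS_PER_DAY


# Daily figures as quoted by the panelists.
daily = {
    "YouTube views": 4_000_000_000,
    "Reddit votes": 11_000_000,
    "Reddit links submitted": 69_000,
    "Imgur uploads": 500_000,
}

rates = {name: per_second(total) for name, total in daily.items()}

for name, rate in rates.items():
    print(f"{name}: about {rate:,.1f} per second")

# One hour of video uploaded every second adds up over a day.
hours_uploaded_per_day = 1 * SECONDS_PER_DAY
years_of_video_per_day = hours_uploaded_per_day / (24 * 365)
print(f"YouTube uploads: about {years_of_video_per_day:.1f} years of video per day")

# Imgur transfer: 100 TB per day, assuming 1 TB = 10**12 bytes.
imgur_bytes_per_second = per_second(100 * 10**12)
print(f"Imgur transfer: about {imgur_bytes_per_second / 10**9:.2f} GB per second")
```

These are averages only: they work out to roughly 46,000 YouTube views and about 127 Reddit votes per second, with close to ten years of new video arriving on YouTube every day. Real traffic will peak well above the average.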
What do these sites measure? Reddit logs a lot of things, but they don’t track people’s behaviours. YouTube makes an analytics suite available to individual YouTube users. Imgur has the “imgur gallery” and uses data from the site to populate the gallery. They look for retweets, reddit uploads, tumblr postings, etc. to decide what to highlight in the gallery. Individual images don’t get popular on imgur; they get popular elsewhere. As a result, imgur offers a great way to measure the spread of an image on the Internet. Kevin mentions that YouTube’s ranking pages are a similar initiative.
Weinberger highlights two common issues around personalisation: privacy and the filter bubble, the idea that personalised content can create an echo chamber of Internet experience. All of the sites get takedown requests. At YouTube, Kevin says he doesn’t have to worry too much about it because privacy is a larger conversation within YouTube.
And the Filter Bubble? Is it a problem? It’s a big issue at Reddit. Once you get past the front page, you end up with a weird echo chamber of Reddit ideas. So the developer team has worked hard (not always successfully) to develop suggestions that try to break users out of their isolated spaces online.
The Filter Bubble is broader than just what happens on any social networking site, Kevin Allocca says. Thousands of YouTube videos are shared every second on numerous sites, and no single site can control the shape of sharing online.
Dave asks, we used to think that media created alienation. But then the Internet happened, and we connect very strongly online. How has that happened at such a mass scale? Kevin points out that none of us could have imagined what has happened on the Internet. He mentions a group of people who produce YouTube videos of elevators. There’s one channel created by a disabled boy and his father who create elevator videos every Saturday and already have posted 200. Nobody could have expected that.
Why do people feel so connected to imgur? It’s just an image posting site. Alan responds that Imgur isn’t a massive corporation, it’s just five guys. He thinks people love it because it has grown up in front of Redditors: it has grown up together with Reddit. The imgur community now calls itself “imgurians” and has chosen the imguraffe as its mascot. Weinberger offers another possible explanation: there is nothing on the imgur site which isn’t about the users.
How global are these sites? Reddit is primarily American, with a few complaining Australians. Most people don’t realize that 70% of all YouTube views are international. Kevin points out why cats and babies are so popular on YouTube: everyone in the world loves them and they rise to the top. In the medium tail, it’s easier to see cultural differences by geography. For example, a Taiwanese video might be popular in San Francisco, where there is a large Taiwanese population. Imgur is mostly popular in English.
An audience member asks if imgur is shifting to more premium content or profit sharing. Alan responds that the long term vision for Imgur is for them to be an entertainment destination for creative images and funny memes. The audience member responds that bloggers are sometimes upset that their images are put onto imgur: how can we serve them? Alan says that it’s really hard for imgur to find where an image came from, something they’re working on. Image searches can’t solve it: they just show you where the image is served elsewhere on the Internet, not where it came from.
Hosting is an ongoing challenge for imgur. They got shut down in their first 3 days after too much traffic. After moving four times, imgur is running on 40 servers running PHP on Amazon Web Services. How do they pay for the hosting? It’s all ad supported, and they only show one ad on each page with an image. In contrast, Reddit has around 300 servers behind their service.
Kevin shares examples of the things that are popular everywhere. Many of them are still culturally specific. Music videos, videos of older people doing silly things, out-of-context dancing. YouTube is also able to measure the scale of memes. Nyancat is now a year old, and there are now a hundred thousand remixes on YouTube.
Weinberger asks if we’re still going to be caring about these things in 10 years. Brian Simpson thinks that there will always be some guy to hate on.
An audience member asks how many videos Kevin watches every day, and how he keeps himself from going insane. His bar for interesting has gone very high. He only completes a few videos. The average person has subscriptions or looks at a video shared on their social networks. Kevin, whose job is to define popularity, watches the algorithms. What’s the difference between what’s being watched and searched? What are the video types that are popular without any single video going viral?
Someone asks if Imgur looks at the exif data in photos. Alan responds that it’s stripped out before it hits the database.
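As an illustration of what stripping EXIF data before storage can involve, here is a minimal Python sketch that removes EXIF segments from a JPEG byte string. It is not imgur's implementation (the panel only says the data is stripped, and that the site runs PHP); the function name `strip_exif` is hypothetical. It relies on the standard JPEG layout, in which EXIF lives in an APP1 segment whose payload starts with `Exif\0\0`. It does not handle other metadata such as XMP or IPTC.

```python
import struct

SOI = b"\xff\xd8"            # start-of-image marker
APP1 = 0xE1                  # segment type that carries EXIF
SOS = 0xDA                   # start-of-scan: compressed pixel data follows
EXIF_HEADER = b"Exif\x00\x00"


def strip_exif(jpeg: bytes) -> bytes:
    """Return a copy of a JPEG byte string without its EXIF APP1 segments."""
    if not jpeg.startswith(SOI):
        raise ValueError("not a JPEG: missing SOI marker")
    out = bytearray(SOI)
    pos = 2
    while pos < len(jpeg):
        if pos + 4 > len(jpeg) or jpeg[pos] != 0xFF:
            raise ValueError(f"malformed segment at byte {pos}")
        marker = jpeg[pos + 1]
        if marker == SOS:
            # Everything from here on is scan data; copy it unchanged.
            out += jpeg[pos:]
            return bytes(out)
        # The 2-byte big-endian length counts itself but not the marker.
        (length,) = struct.unpack(">H", jpeg[pos + 2:pos + 4])
        end = pos + 2 + length
        if length < 2 or end > len(jpeg):
            raise ValueError(f"bad segment length at byte {pos}")
        payload = jpeg[pos + 4:end]
        # Keep every segment except APP1 segments that hold EXIF.
        if not (marker == APP1 and payload.startswith(EXIF_HEADER)):
            out += jpeg[pos:end]
        pos = end
    return bytes(out)
```

A caller would pass in the uploaded bytes, for example `strip_exif(open(path, "rb").read())`, and store only the result. The pixel data is copied untouched, so the image itself is not recompressed.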
Another audience member asks, what categories of user behaviour do you think about on your sites? Reddit does try to find and deal with trolls on their site. But even the trolls are part of the site, no matter how they might enrage some people. Kevin draws a clear distinction among people who have accounts and upload, people who have accounts, and people who just watch videos. People often ask Kevin why YouTube view counts lag behind a video’s popularity. YouTube verifies views at certain intervals, which is why we see the jumps. The only user monitoring that imgur does is to check for extremes: are you uploading or downloading too fast? You might be banned for a while.
Weinberger asks, to what extent is the long tail unrepresentative of the short tail: are we missing things at the tail? Reddit seems like a wonderful accepting community. Is the tail vicious, homophobic, misogynist? Brian points David to the Knights of New, a community which tries to find and surface things at the long tail.
Are trends predictable? Kevin responds that some things become obviously shareable once they reach a certain point: things which are headed toward 500,000 views. What’s unpredictable are the videos which go from 500,000 to millions. Who would have guessed that every Friday there’s a spike from Rebecca Black? Kevin refers to Bear’s comment that he knew double rainbow would be popular, but not how popular. Reddit and Imgur have 24-hour windows: if your post isn’t popular within 24 hours, it will never be featured.
Do hosts ever feel conflicted about the things that get posted? Dave Weinberger forbids Alan from speaking about the worst images on imgur, at which point the audience cheers and Alan hints at what they contain. Does Alan lose faith in humanity? When you look at the gallery, you see amazing examples of people’s creative and political expression. Kevin is hopeful. He thinks that the ugly side of the Internet is part of what you get for having a system like YouTube where people post videos of Tahrir Square. Brian has no regrets about Reddit either.
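The "check for extremes" approach that Alan describes can be sketched as a sliding-window rate check with a temporary ban. This is a hypothetical illustration, not imgur's code: the class name `ExtremeUseGuard`, the thresholds, and the ban length are all assumptions chosen for the example.

```python
import time
from collections import defaultdict, deque


class ExtremeUseGuard:
    """Temporarily ban clients that make too many requests in a time window."""

    def __init__(self, max_requests=100, window_seconds=60.0,
                 ban_seconds=600.0, clock=time.monotonic):
        self.max_requests = max_requests      # assumed threshold
        self.window_seconds = window_seconds  # assumed window length
        self.ban_seconds = ban_seconds        # assumed ban duration
        self._clock = clock
        self._recent = defaultdict(deque)     # client id -> request timestamps
        self._banned_until = {}               # client id -> ban expiry time

    def allow(self, client_id: str) -> bool:
        """Record one request and report whether it should be served."""
        now = self._clock()
        until = self._banned_until.get(client_id)
        if until is not None and now < until:
            return False
        self._banned_until.pop(client_id, None)

        recent = self._recent[client_id]
        # Drop timestamps that have fallen out of the sliding window.
        while recent and now - recent[0] >= self.window_seconds:
            recent.popleft()

        if len(recent) >= self.max_requests:
            # Too fast: start a temporary ban and reset the history.
            self._banned_until[client_id] = now + self.ban_seconds
            recent.clear()
            return False

        recent.append(now)
        return True
```

A server would call `guard.allow(client_id)` on each upload or download and refuse the request when it returns `False`. Ordinary users never hit the limit, which fits the panel's point that this is the only per-user monitoring imgur does.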
Do sites moderate comments? YouTube has a comments team that thinks about these things, mostly to give people control over their comments. Alan responds that comments on imgur moderate themselves through upvotes and downvotes.