Madison: semantic listening through crowdsourcing

Our recent work at the Labs has focused on semantic listening: systems that obtain meaning from the streams of data surrounding them. Chronicle and Curriculum are recent examples of tools designed to extract semantic information (from our corpus of news coverage and our group web browsing history, respectively). However, not every data source is suitable for algorithmic analysis; in fact, it is often easier for humans to extract meaning from a stream. Our new projects, Madison and Hive, are explorations of how best to design crowdsourcing projects for gathering data on cultural artifacts, as well as provocations for the design of broader, more modular kinds of crowdsourcing tools.

RCA ad

Madison is a crowdsourcing project designed to engage the public with an under-viewed but rich portion of The New York Times’s archives: the historical ads neighboring the articles. News events and reporting give us one perspective on our past, but the advertisements running alongside these articles provide a different view, giving us a sense of the culture surrounding those events. Alternately fascinating, funny and poignant, they act as commentary on the technology, economics, gender relations and more of their time period. However, the digitization of our archives has primarily focused on news, leaving the ads with no metadata, making them very hard to find and impossible to search. Complicating the process further, these ads often have complex layouts and elaborate typefaces, making them difficult to differentiate algorithmically from photographic content, and much more difficult to scan for text. This combination of fascinating cultural information with little structured data seemed like the perfect opportunity to explore how crowdsourcing could form a source of semantic signals.

There were endless data points we were interested in collecting, but the challenge was how to do so while keeping our audience engaged. A very complex task might garner a wealth of data in theory, but not if, in reality, no one wants to do it. Alternatively, we could try to punch up an extremely dry task with external incentives for the most active contributors, but we would risk having our system gamed by those who simply wanted the rewards, potentially setting ourselves up for a bank of incorrect data. To avoid these problems, we took an approach centered on reducing friction as much as possible, one that limited gamifying elements in favor of highlighting the interesting parts of the task at hand.

We settled on a set of design principles to reduce friction as much as possible, inspired by previous cultural and science-oriented crowdsourcing projects (such as the Zooniverse projects, the NYPL’s Building Inspector and The Guardian’s MPs’ Expenses):

  • Add more tasks rather than making one task very complex. Keeping tasks clear and streamlined meant that the user didn’t have to constantly switch contexts, and could knock out a few assignments in a row without having to think too much about it.
  • Make the tasks self-explanatory. Building off the first concept, a task whose question is obvious is much easier to answer than one that requires looking at and interpreting instructions.
  • Design for a variety of use cases. If the tasks are simple and modular, chances are they can be done on a variety of devices in a variety of situations. Our Find task works nicely on mobile, and can be done in a few seconds while waiting for the bus; our Transcribe task is much more oriented to someone at home on a desktop computer, looking for a way to spend 10 minutes.
  • Permit anonymous contributions. Asking users to sign up first thing would give them an excuse to leave the site. By letting users contribute without logging in, we allow them to try it out and get into it before they commit to creating an account.

We also purposefully chose an approach that downplayed gamification in order to place the fun part of a potentially dry process at the forefront: discovering and sharing interesting cultural items and artifacts with your friends. In their paper on their Menus project, the NYPL points out that, for cultural institutions, “the incentives reside in the materials themselves and in the proposition of working in partnership with a public trust.” Rather than trying to tempt a user into participation with external, material rewards, we aimed to design a system whose biggest rewards came from engaging with it: namely, discovering and sharing a piece of culture that probably only a handful of people have seen since its original publication. This has a double benefit: because the rewards of Madison are largely in the delight of finding new things (not earning points, climbing a leaderboard, completing missions or getting badges), there is little incentive to game the system with bad information, in turn bolstering our confidence in opening up the project to anonymous users. Madison also has built-in validation criteria that require agreement from a number of contributors to ensure that the ads are annotated with correct metadata.
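Agreement-based validation like this can be sketched as a simple consensus rule. The vote counts and threshold below are illustrative values, not Madison's actual settings:

```python
from collections import Counter

def validated_label(annotations, min_votes=3, threshold=0.75):
    """Return the agreed-upon annotation for an ad, or None if
    contributors have not yet reached consensus.

    `annotations` is a list of labels submitted by different
    contributors; `min_votes` and `threshold` are hypothetical
    parameters, not Madison's real configuration.
    """
    if len(annotations) < min_votes:
        return None  # not enough contributors have weighed in yet
    label, count = Counter(annotations).most_common(1)[0]
    # accept the majority label only if enough contributors agree
    return label if count / len(annotations) >= threshold else None
```

A rule like this lets anonymous users contribute freely: a stray (or malicious) answer is simply outvoted before any metadata is accepted.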

These choices formed the basis of Madison, and also shaped the platform underneath it: Hive. Hive is a modular, open-source framework for building crowdsourcing projects like Madison with any set of assets. We will be sharing more information about Hive in an upcoming blog post!


Chronicle: Tracking New York Times language use over time

Chronicle graph of World War I and Great War


News publishing is an inherently ephemeral act. A big story will consume public attention for a day, a month or a year, only to fade from memory as quickly as it erupted. But news coverage, aggregated over time, can provide a fascinating “first draft of history” — a narrative of events as they occurred. At The New York Times, we have an incredibly rich resource in our 162-year archive of Times reporting, and one of the areas we occasionally explore in the lab is how to harness our archive to create new kinds of experiences or tools.

Two years ago, I created Chronicle, a tool for graphing the usage of words and phrases in New York Times reporting. Inspired by my own love of language and history, it’s a fascinating way to see historical events, political shifts, cultural trends or stylistic tropes. Chronicle can reveal things like the rise of feminism, evolution of cultural bêtes noires or when we shifted from talking about the “greenhouse effect” to talking about “climate change”. The Times’ corpus is particularly interesting as a reflection of culture because our style guide  carefully informs how our reporters use language to describe the world, which allows us to see those changes more clearly than if we were looking at a heterogenous archive of text. More broadly, Chronicle acts as another example of “semantic listening” approaches we have been researching in the lab — methods for extracting useful semantic signals from streams as diverse as conversations, web browsing history, or in this case, a historic corpus of news coverage.
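The aggregation behind a Chronicle graph boils down to counting a phrase's occurrences per year across the archive. Here's a toy sketch of that idea (not Chronicle's actual implementation, which runs against our full indexed corpus):

```python
from collections import defaultdict

def usage_per_year(articles, phrase):
    """Count occurrences of `phrase` per year in a toy corpus.

    `articles` is a list of (year, text) pairs standing in for the
    archive; a real system would normalize by articles published
    per year to graph relative frequency.
    """
    counts = defaultdict(int)
    needle = phrase.lower()
    for year, text in articles:
        counts[year] += text.lower().count(needle)
    return dict(counts)
```

Plotting the resulting counts over time is what surfaces shifts like "Great War" giving way to "World War I."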

Since its creation, Chronicle has been in use internally as a research tool, and has occasionally made its way into our report, most notably in Margaret Sullivan’s election day look at hot-button issues in past presidential elections. While we have made a multitude of discoveries through using Chronicle within The Times, we want to see what our readers can unearth about our history as well. As of today, Chronicle is now open and available to the public. Go explore and tell us about your best finds by tweeting them to @nytlabs!

Social Wearables

We’ve been thinking about social wearables in the Lab recently; that is, objects that explicitly leverage their visibility or invisibility to create social affordances. In this post I’ll expand on this concept and frame a few questions we’re pursuing.


Wearables right now are almost exclusively recorders: we use them to record sensor values at a resolution and duration that we, as humans, aren’t capable of doing on our own. Today’s wearables exist in the physical world only to the extent that their sensors require: if it were, for example, possible for Fitbit to know my minute-to-minute activity levels without me carrying their device, they’d happily sell me quantified-self-as-a-service, rather than expending more effort trying to squeeze more battery life into an ever-smaller hardware package.

My Fitbit is small enough to conceal in my pocket, but there’s another sensor-laden device in my pocket, one that does more than just record accelerometer values; why don’t I use my smartphone to do the recording? And in fact, Google Now’s “Activity” card is more or less selling me Fitbit’s QS-as-a-service with one key upgrade: there’s no extra device to carry around.

People are enormously sensitive to the cost of carrying stuff around – one look at everyday carry illustrates the extent to which we choose our personal effects for optimal weight, utility and style. What if your phone got heavier with each app you installed? Obviously, you’d apply a logic similar to your everyday carry, carefully considering the net benefit you expect to receive from donning a device against the cost of lugging it around all day.

In other words, wearable devices need to have a reason for being physical, separate objects in order for there to be a good use case for carrying them around.

There’s also a more diffuse, conceptual unease accompanying public/visible technology that only one person may use. The archetypal rude businessman talking on his Bluetooth earpiece; the anxiety we feel in the presence of a Glass-wearer, unable to know what he’s looking at: these are the bad experiences that happen when technology allows someone to superimpose their world onto the world we have to share with them, without letting us participate.

More generally, anything I wear is going to be read by others in the same way that they already “read” my clothing choices, my grooming, my affect and so on. It’s just strange to ignore the social-performative component of a wearable device. Conversely, one good reason to wear a thing that is visible/readable by others is for the social affordances such a device might offer.

What I’m getting at is the implausibility of an enduring category of wearables that only record: I think they’re an anomaly in a larger trajectory in which wearable devices come to leverage their physicality, their presence in my immediate environment, to add to my interactions as they happen, rather than record aspects of the world for later.

New types of information

Wearables that engage with the world around me, and particularly with the people around me, are few and far between right now, but I think that as we move from low-level sensor fusion (gait analysis, GPS breadcrumbs) to more nuanced, semantically-rich signals (Curriculum, anticipatory systems), we’ll be able to author more synchronous and in-context experiences; we will have moved from recording to listening.

I’m particularly interested in social wearables because they will make rapid progress in the near term, as our listening capabilities (semantic analysis, real-time speech-to-text) improve. They also have the potential to introduce totally new types of information into a face-to-face interaction: we have an opportunity here to add bandwidth to ourselves, to make our own superpowers.

What kinds of interactions are we looking out for? I’ve tried to categorize the main functions I think we might see wearables focus on:

Prosthesis / Behavior Augmentation

These are devices that I wear to assist in my interactions and behaviors. For example, a band that vibrates when I’m speaking too loudly, or when it seems like I have been interrupting someone frequently.
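A prosthesis like the loudness band could be driven by something as simple as an RMS check over a window of microphone samples. This is a hypothetical sketch — the threshold and sample format are assumptions, not a real device's firmware:

```python
import math

def is_too_loud(samples, threshold_rms=0.5):
    """Hypothetical trigger for a 'speaking too loudly' wearable.

    `samples` is a window of normalized mic samples in [-1, 1];
    the band would vibrate whenever the RMS amplitude exceeds
    `threshold_rms` (an illustrative value).
    """
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    return rms > threshold_rms
```

The interruption case is similar in spirit: detect overlapping speech onsets and vibrate when your voice repeatedly starts before the other speaker's stops.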

Deeper Connections

Devices that marshal extra information or that retain/recall my cognitive context (for example, “what was I thinking about last week?” or “What ideas led me to bookmark this article?”) across space and time. Blush falls into this category, as do some features of Android Wear’s SDK.

Dowsing and Divination

Devices that I use opportunistically to find affinities between myself and others, or between myself and spaces. The 90s toy Lovegety, which lets wearers know when they’re near a person with similar interests, is an early instance of this kind of device. A wearable that knew and understood its immediate context would be the ideal platform for many different types of ad hoc affinity- or resource-driven dowsing: automatic opportunistic mesh networking in areas with spotty network access, local dead-drop file exchange, and the like.
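At its core, Lovegety-style dowsing is an interest-overlap check between two nearby wearers. A minimal sketch, with an assumed overlap rule (the real toy used a handful of preset modes, not free-form topics):

```python
def affinity(mine, theirs, min_overlap=2):
    """Report shared interests between two nearby wearers.

    Returns the set of shared topics when the overlap meets
    `min_overlap`, else an empty set. The topic lists and the
    overlap threshold are illustrative assumptions.
    """
    shared = set(mine) & set(theirs)
    return shared if len(shared) >= min_overlap else set()
```

A context-aware wearable could run the same kind of matching against spaces and resources (open mesh peers, shared files) rather than just people.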

These examples are obviously just some of the futures that become possible as wearables evolve.  I think the real promise of the next decade of wearable technologies will be realized when we apply ourselves to these questions:

  • What new kinds of information might best feed a “Deeper Connection”-type wearable? Will these new feeds be legible to everyone, or will they have their own private meanings?
  • How will we negotiate the additional levels of disclosure that a wearable might make available? We already see a deep anxiety about Glass’s photo capabilities; what happens when the next generation of Google Glass, or its successor, is more or less invisible?
  • Are these wearables really leveraging their presence on the body? Do they augment or extend my capabilities in the world, or are we just fragmenting phone functions to more convenient forms, pushing the UI out of the screen?

We’re already using these questions to frame our work as we continue to think about the opportunities and challenges presented by semantically-informed, socially-engaged wearables. I’ll be posting more works and thoughts here as they develop, including a couple of new wearable experiments to be completed in the next few months.


Vellum: A reading layer for your Twitter feed


In the course of our work, we make a lot of small experiments, often in code. Sometimes we hit upon something that may not be a signal from the future, but is quite useful in the present. Vellum is one such project.

One of my primary uses for Twitter is to find interesting reading material: breaking news, long reads, research relevant to my work, or funny things about maps. However, Twitter’s interface treats commentary as primary and content as secondary, which can make it difficult to discover things to read if I’m mostly interested in that secondary content.

To address this use case, we created Vellum. Vellum acts as a reading list for your Twitter feed, finding all the links that are being shared by those you follow on Twitter and displaying them each with their full titles and descriptions. This flips the Twitter model, treating the links as primary and the commentary as secondary (you can still see all the tweets about each link, but they are less prominent). Vellum puts a spotlight on content, making it easy to find what you should read next.

We also wanted to include signals about what might be most important to read right now, so links are ranked by how often they have been shared by those you follow on Twitter, allowing you to stay informed about the news your friends and colleagues are discussing most.
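The ranking signal described above is share frequency. Here's a minimal sketch of that idea (not Vellum's actual code; the `(author, url)` pair format is an assumption):

```python
from collections import Counter

def rank_links(tweets):
    """Rank shared links by how many tweets in your timeline
    mention them, most-shared first.

    `tweets` is a list of (author, url) pairs; a fuller version
    would first resolve shortened URLs so shares of the same
    article count together.
    """
    counts = Counter(url for _, url in tweets)
    return [url for url, _ in counts.most_common()]
```

Counting unique sharers rather than raw tweets would be a natural refinement, so one prolific account can't dominate the list.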

Vellum was built as a quick experiment, but as we and other groups within The New York Times have been using it over the past few months, it has proven to be an invaluable tool for using Twitter as a content discovery interface. So today we are opening up Vellum to the public. We hope you find it as useful as we have. Happy reading!

Check out Vellum now »

In the Loop: Designing Conversations With Algorithms

Note: This was also published as a guest post for the Superflux blog.

Earlier this year, I saw a video from the Consumer Electronics Show in which Whirlpool gave a demonstration of their new line of connected appliances: appliances which would purportedly engage in tightly choreographed routines in order to respond easily and seamlessly to the consumer’s every need. As I watched, it struck me how similar the notions were to the “kitchen of the future” touted by Walter Cronkite in this 1967 video. I began to wonder: was that future vision from nearly fifty years ago particularly prescient? Or, perhaps, are we continuing to model technological innovation on a set of values that hasn’t changed in decades?

When we look closely at the implicit values embedded in the vast majority of new consumer technologies, they speak to a particular kind of relationship we are expected to have with computational systems, a relationship that harkens back to mid-20th century visions of robot servants. These relationships are defined by efficiency, optimization, and apparent magic. Products and systems are designed to relieve users of a variety of everyday “burdens” — problems that are often prioritized according to what technology can solve rather than their significance or impact. And those systems are then assumed to “just work”, in the famous words of Apple. They are black boxes in which the consumer should never feel the need to look under the hood, to see or examine a system’s process, because it should be smart enough to always anticipate your needs.

So what’s wrong with this vision? Why wouldn’t I want things doing work for me? Why would I care to understand more about a system’s process when it just makes the right decisions for me? [Read more...]

streamtools: a graphical tool for working with streams of data

We see a moment coming when the collection of endless streams of data is commonplace. As this transition accelerates it is becoming increasingly apparent that our existing toolset for dealing with streams of data is lacking. Over the last 20 years we have invested heavily in tools that deal with tabulated data, from Excel, MySQL and MATLAB to Hadoop, R and Python + NumPy. These tools, when faced with a stream of never-ending data, fall short and diminish our creative potential.

In response to this shortfall we have created streamtools – a new, open source project by The New York Times R&D Lab which provides a general purpose, graphical tool for dealing with streams of data. It provides a vocabulary of operations that can be connected together to create live data processing systems without the need for programming or complicated infrastructure. These systems are assembled using a visual interface that affords both immediate understanding and live manipulation of the system.
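The "vocabulary of operations connected together" can be sketched as composable blocks, each consuming one message stream and emitting another. This is a conceptual sketch in Python (streamtools itself is written in Go, and its blocks are wired visually, not in code):

```python
def source(values):
    """A toy 'block' that emits a stream of messages."""
    yield from values

def filter_block(stream, predicate):
    """A block that passes through only messages matching `predicate`."""
    for msg in stream:
        if predicate(msg):
            yield msg

def count_block(stream):
    """A terminal block that counts the messages it receives."""
    n = 0
    for _ in stream:
        n += 1
    return n

# Wiring blocks together, as the visual interface does with connections:
pipeline = filter_block(source([1, 5, 2, 8, 3]), lambda x: x > 2)
```

Because each block only knows about its inputs and outputs, systems stay live and re-wirable — the property the visual interface is built around.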

[Read more...]

Lift 2014!

Noah, Matt and I just returned from Geneva, Switzerland, where we were attending Lift 2014. Lift is a conference on technology, design and innovation that we have followed from afar for a while now, as it is co-founded by Nicolas Nova, of the Near Future Laboratory (whose work we all greatly admire) and always has a great lineup of speakers. This year, we not only had the opportunity to attend but we all led a workshop and I presented a talk.

My talk was entitled “In the Loop: Designing Conversations with Algorithms”. In it, I shared signals we’re seeing that indicate a shifting relationship between people and algorithmic systems, discussed how those changes are at odds with some of the implicit ideas we’ve been building into innovation for decades, and advocated for a set of design principles that can help us create better interactions for the future: ones where we are empowered to engage in negotiations with the complex and increasingly pervasive systems around us. Full video of the talk (20 minutes) is below:

Our workshop was framed around the idea of “impulse response”, which refers to a means of sounding out the properties of an unknown system by sending a known signal into it. As increasing aspects of our lives are mediated by algorithmic systems, we adapt our behavior according to our understanding of how these systems sense, track, and analyze what we do. Some of these systems show us what they know and how they work; other systems may behave as black boxes, recording their observations and making inferences we don’t fully understand. As we learn more about how these systems work, what behaviors are emerging / will emerge to optimize or obscure our participation with them? We had participants design strategies, products, or other interventions that apply this idea to the systems around them. An overview video is below and you can also check out Lift’s Storify of the workshop.

Understanding these emerging trends around how people engage with algorithmic systems deeply informs the work we do in the R&D Lab. As we design and build new kinds of interactions with information, we strive to make those interactions embody the values of transparency, agency and virtuosity in order to create compelling, satisfying and empowered experiences for our users now and in the future.


Blush, a social wearable

Blush (prototype)

What does it look like when our devices stop merely listening to us, and start becoming part of our conversations? How can the technology that lives closest to our bodies actively enhance our relationships with others?

We’ve been thinking about this recently, and to investigate it further we made Blush, a wearable that highlights the moments when your online interests and offline behaviors overlap.

Blush listens to everything that is said around it, and lights up when the conversation touches on topics that are in my Curriculum, a feed of topics that the Lab’s members have recently researched online. (You can read a more detailed breakdown of Curriculum here). Blush is the first of a couple wearable experiments that we’re working on here in the Lab.

Social Wearables / Augmentation

We’re particularly interested in developing wearables for more than just the wearer: devices that engage with the world around them and add to our social interactions. Blush functions as an alternative kind of punctuation in a conversation, a subtle way to include your online life in your offline interactions.

When we converse, we’re constantly sending signals beyond just the words we say to each other: our posture, eye contact, gestures, and other factors combine to add a huge amount of context to what we’re discussing. (Part of what makes a phone call so different from an in-person conversation is the lack of all that extra context that lets us know more about our counterpart’s attitude than just what he or she chooses to say.) We think of Blush as a “social wearable” because it writes to this same layer of the conversation, the layer full of second-order, contextual clues that augment what we’re saying.

Of course, extra information isn’t always good; the signal that Blush provides, whether or not my colleagues have encountered a topic before, can be interpreted many ways: this is boring, I already know this stuff; or ooh, now we’re getting to something I’m interested in! There are also obviously privacy concerns: some colleagues have playfully plumbed Blush to find out if our Curriculum contains anything embarrassing, like “Wrecking Ball” or “beginner PHP example.”


We designed Blush to inhabit the middle ground between the (often subconscious) signals we’re always sending with our bodies and the ideas we explicitly choose to talk about with others. Blush’s role as “punctuation” is important: we did not want it to dominate conversations, or derail them by being distracting. By limiting Blush to simply lighting up, and only doing that when a notable event has occurred, we hope Blush can live comfortably in the real world, augmenting our interactions with a little bit of extra information while not bogging them down.


Blush pairs with an Android app that does the continuous speech recognition. When it hears a match with a Curriculum topic, it activates the pendant over BLE; the pendant itself is a very tiny (and, at present, messy) circuit living on the back of the excellent RFD22301 radio. I’ll post an update on this blog when there’s more to share.
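The matching step that decides when to light the pendant can be sketched as a substring check of the live transcript against the Curriculum feed. This is a simplified illustration of the trigger logic, not the app's actual code:

```python
def curriculum_match(transcript, topics):
    """Return the Curriculum topics mentioned in a speech-recognition
    transcript.

    A match is what would trigger the BLE activation of the pendant;
    the naive case-insensitive substring test here stands in for
    whatever matching the real app performs.
    """
    text = transcript.lower()
    return [t for t in topics if t.lower() in text]
```

A fuzzier matcher (stemming, partial-phrase matching) would catch more conversational mentions, at the cost of more false positives — more "robot poetry" moments.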

Blush hardware

Curriculum, semantic listening for groups

We’ve been using a new system in the Lab for a few months now, and it has really captured our imagination. The system, called “Curriculum,” is a real-time stream of topics from Lab members’ web browsing activity. So for example, right now, the latest topics in Curriculum are:

  • “DHT humidity/temperature sensors”
  • “3.3V i2c interface”
  • “PIR thermometer device”

…these topics are generated by a semantic analyzer that reads the content of the page and infers the topics that the page is about (in this case, I was just researching some sensors for another project). Of course, there’s a healthy amount of weird noise as well; here are some less-intelligible topics that were also browsed recently:

  • “deepest thoughts”
  • “G2108 G2110 G2111 G2112 G2113 G2116 G2124 G2125 G212C”
  • “insane flow”

Even with some noise, the feed is fascinating enough that it has quickly become a habit for all five Curriculum users to check the feed quite regularly. Checking the feed is rewarding because it is always at least a little bit funny (the imperfections of semantic analysis make for some great robot poetry), and it often affords a deeper or more intimate view into what my colleagues are working on.

Click through for a more in-depth look at Curriculum and how it was designed. [Read more...]

Repetitive tasks abstracted, or a Python module for URL metadata

As I was working on a new prototype last week, I once again found myself needing some basic information for a large set of URLs: page titles, summaries, etc. Instead of writing yet another block of custom code for the project as I have in the past, I went looking for an API or library I could use for this purpose. I found a couple of options, but most were external services upon which my code would be dependent, and those services either seemed poorly maintained or were paid products. So after a brief discussion with others in the lab to make sure that this was something that was broadly useful (the answer was a resounding yes), I decided to write a simple Python module to get meta information from web pages.

The pageinfo module is very straightforward: you import it, pass it a URL, and it gives you back the following (where available):

  • Page title
  • Page description
  • Favicon
  • Twitter card data
  • Facebook open graph data
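Under the hood, extracting these fields amounts to parsing `<title>`, `<meta>` and favicon `<link>` tags from the page's HTML. Here's a minimal sketch of that kind of parsing using the standard library — the class name and structure are illustrative, not pageinfo's actual API:

```python
from html.parser import HTMLParser

class MetaExtractor(HTMLParser):
    """Collect title, description, favicon and og:/twitter: meta
    tags from an HTML document into a dict (a toy stand-in for
    what pageinfo returns)."""

    def __init__(self):
        super().__init__()
        self.info = {}
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta":
            # covers description, twitter:* (name=) and og:* (property=)
            key = a.get("name") or a.get("property")
            if key and "content" in a:
                self.info[key] = a["content"]
        elif tag == "link" and a.get("rel") == "icon":
            self.info["favicon"] = a.get("href")

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.info["title"] = data
```

In practice you'd fetch the URL first (e.g. with `urllib.request`), feed the response body to the parser, and fall back across fields — og:title, twitter:title, then `<title>` — the way metadata libraries typically do.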

Since this seems like a task lots of people need on a regular basis, I packaged up pageinfo and it is available to install via pip (details below), or for those who may want to tweak or expand upon the concept, the code is all up on nytlabs’ github. Below are details on how to install and use pageinfo. [Read more...]