If you’ve recently visited a natural history museum, you may have learned about the Cambrian Explosion: a point at which, 540 million years ago, the volume and variety of life on earth suddenly and exponentially increased. Why did this happen? According to Fei-Fei Li, Director of the Stanford AI Lab, speaking last week at the CITRIS / Banatao Institute symposium on “Inclusive AI,” it’s because organisms developed eyes.
Before the Cambrian Explosion, life was chill. Creatures floated around the sea, consuming food when it passed by. With sight, they could detect food nearby and move towards it; they could notice a hunter closing in with its jaws wide and swim away. Predator-prey behavior was, according to Dr. Li, a whip to the hindquarters of natural selection. Life evolved from floating blobs into a multiplicity of active and intelligent beings.
Dr. Li was demonstrating that visual intelligence — the ability to recognize and process images — is the staple ingredient of active, intelligent minds. And by the standard of the ImageNet Classification Challenge, machine visual intelligence has improved sevenfold since 2010 (see photo above).
Dr. Li’s goal was to showcase how machine visual intelligence can attack unsolved social problems, like that of hospital employees not always washing their hands properly, which leads one out of twenty-five hospital patients to acquire an infection. Despite attempts to ensure compliance by assigning staff to peer over their colleagues’ shoulders at the bathroom sink, there’s no scalable way for humans to monitor the hand-washing habits of hospital staff by themselves.
Enter Dr. Li and her image-processing computers.
She and her team installed visual sensors (donated by Microsoft and similar to the one on your Xbox) throughout Stanford Children’s Hospital: in hallways, to see when someone enters a bathroom, and right above all sinks, to provide a fly’s eye view of hands under running water.
The sensors send images to a program that notes when an employee enters the bathroom; when that employee washes their hands, the program notes whether they demonstrated correct “hand hygiene behavior.”
Dirty hands in hospitals are a simple problem, and with advances in visual intelligence, we now have a fairly simple solution. If the rate of poor hand washing in hospitals around the country were reduced even by half, it would save thousands of lives.
But at this point in the presentation I wrote “UNEASY” in big letters in my notebook. If my workplace had devices everywhere to notice when I used the bathroom, I would feel pretty weird. If the system sees me enter the bathroom, and is waiting for me to go to the sink so that it can verify my hand washing, isn’t it counting how long I’m taking? Data on how long every employee has been sitting on the toilet isn’t something I would want my bosses to have.
When most people enter the restroom, they inflate a little bubble of privacy, constraining their perception to whatever they need to do to handle their own business. Everything outside that bubble is kind of gross. This application pricks that bubble.
And though I don’t want to diminish the achievement in engineering, it all felt a little obvious. Of course putting cameras everywhere to monitor individual behavior can create a better collective outcome. What I wanted to know was, is that worth a devaluation of privacy?
In this case, yes. Hospitals are one of those spaces where normal rules are suspended in service of the ultimate goal, like a battlefield, or a space shuttle. This solution will prevent some of the most preventable deaths that happen today. Staff will get used to having their hand hygiene monitored, and I’m sure most hospital administrators aren’t interested in how long their employees take to use the bathroom. But I was bothered that the privacy/collective benefit tradeoff wasn’t discussed in Dr. Li’s presentation.
In describing the Cambrian Explosion, Dr. Li set up a parallel to today — where visual intelligence will cause the variety and power of machines to shoot upwards, just like it did for animals 540 million years ago, in a digital Holocene Explosion. Putting cameras everywhere to optimize everything was a pretty creepy preview of an impending multiplicity of visually intelligent machines.
We have enough trouble protecting privacy with technology we already have. And that’s with machines that are physically blind.
Once they can see? If the people who know how to put cameras everywhere and teach computers to process images don’t ever think about the privacy tradeoff, bathroom monitoring might be only the beginning for our collective anxieties.