Shock wave from the sun has opened up a crack in Earth’s magnetic field, and it could trigger a geomagnetic storm

https://www.livescience.com/shock-wave-opens-crack-in-magnetosphere


A mysterious shock wave in a gust of solar wind has sent a barrage of high-speed material smashing into Earth’s magnetic field, opening up a crack in the magnetosphere. The influx of plasma could lead to a geomagnetic storm today (Dec. 19), according to SpaceWeather.com.

The shock wave’s origins aren’t exactly known, but scientists think it could have come from a coronal mass ejection launched by sunspot AR3165, a fizzing region on the sun’s surface that released a flurry of at least eight solar flares on Dec. 14, causing a brief radio blackout over the Atlantic Ocean.

Sunspots are areas on the sun’s surface where powerful magnetic fields, created by the flow of electrical charges, knot into kinks before suddenly snapping. The resulting release of energy launches bursts of radiation called solar flares, or plumes of solar material called coronal mass ejections (CMEs). Once launched, CMEs travel at speeds in the millions of miles per hour, sweeping up charged particles from the solar wind to form a giant, combined wavefront that (if pointed toward Earth) can trigger geomagnetic storms.

Related: Listen to the terrifying rumble of Earth’s magnetic field being assaulted by a solar storm

Geomagnetic storms occur when energetic solar debris (mostly consisting of electrons, protons and alpha particles) gets absorbed by, and subsequently compresses, Earth’s magnetic field. The solar particles zip through the atmosphere near the poles where Earth’s protective magnetic field is weakest and agitate oxygen and nitrogen molecules — causing them to release energy in the form of light to form colorful auroras such as the northern lights. 

The storms can also create cracks in the magnetosphere that remain open for hours at a time, enabling some solar material to stream through and disrupt satellites, radio communications, and power systems.

Thankfully, today’s potential storm, predicted to be G1-class, will be fairly weak. It may cause minor fluctuations in power grids and impair some satellite functions, including those for mobile devices and GPS systems. It could also cause an aurora to appear as far south as Michigan and Maine.

More extreme geomagnetic storms, however, can have far more serious effects. They can not only warp our planet’s magnetic field powerfully enough to send satellites tumbling to Earth, but can also disrupt electrical systems and even cripple the internet.

The upcoming storm is just the latest in a string of solar attacks fired at Earth as the sun ramps up into the most active phase of its roughly 11-year solar cycle.

Astronomers have known since 1775 that solar activity rises and falls in cycles, but recently, the sun has been more active than expected, with nearly double the sunspot appearances predicted by the National Oceanic and Atmospheric Administration.

NASA’s Solar Dynamics Observatory spacecraft captured this shot of a solar flare erupting from the sunspot AR3165 (center) on Dec. 14, 2022. (Image credit: NASA/SDO)

Scientists anticipate that the sun’s activity will steadily climb for the next few years, reaching an overall maximum in 2025 before decreasing again.

The largest solar storm in recent history was the 1859 Carrington Event, which released roughly the same energy as 10 billion 1-megaton atomic bombs. After slamming into Earth, the powerful stream of solar particles fried telegraph systems around the world and caused auroras brighter than the light of the full moon to appear as far south as the Caribbean.

If a similar event were to happen today, scientists warn it would cause trillions of dollars’ worth of damage, trigger widespread blackouts, and endanger thousands of lives. A previous solar storm in 1989 released a billion-ton plume of gas that caused a blackout across the entire Canadian province of Quebec, NASA reported.

But this may not even scratch the surface of what our star is capable of hurling at us. Scientists are also investigating the cause of a series of sudden and colossal spikes in radiation levels recorded in ancient tree rings across Earth’s history. A leading theory is that the spikes could have come from solar storms 80 times more powerful than the Carrington Event, but scientists have yet to rule out some other potentially unknown cosmic source.

Originally published on LiveScience.com.


via Space https://ift.tt/s8pU716

December 20, 2022 at 12:34PM

Wells Fargo to pay $3.7 billion settling charges it wrongfully seized homes and cars

https://www.npr.org/2022/12/20/1144331954/wells-fargo-billions-wrongful-fees-settlement

The Consumer Financial Protection Bureau (CFPB) has ordered Wells Fargo to pay billions in fines and redress to mistreated consumers.

The case marks the largest penalty ever imposed by the federal watchdog agency. Customers who were harmed will receive $2 billion in restitution.


via NPR Topics: News https://ift.tt/dDPm13c

December 20, 2022 at 11:31AM

Rivian R1T and R1S receive ‘Snow’ mode via over-the-air update

https://www.autoblog.com/2022/12/20/rivian-r1t-r1s-snow-mode-over-the-air-update/


Rivian is making it easier for its electric R1T and R1S models to power through winter weather. The company released an over-the-air software update that adds a driving profile called “Snow” and expanded its app with remote features designed to keep owners warm.

The new “Snow” mode sounds similar to the winter-taming technology already offered by several carmakers. When engaged via the infotainment system’s touchscreen, the drivetrain’s 10th driving profile softens the throttle response to help prevent wheelspin, dials in a new brake regeneration setting called “Low” to stop the wheels from locking up, and makes the transition to the Auto-Hold function smoother.

Rivian explains that it developed the “Snow” mode exclusively for on-road use. It’s what you’d select if you end up on an unplowed road on your way to work. It’s not designed to be used off-road; the company recommends using the Off-Road All-Terrain mode for snowy trails.

In addition to the new driving profile, the latest round of over-the-air updates brings additional remote functions to the Rivian app. Owners can use their phone to remotely defrost the windows and door mirrors and turn on the heated seats or the heated steering wheel.

The update also fixes several software bugs, updates some of the suspension system’s settings, improves the regenerative braking system’s performance, and lets the vehicle “sleep” even if a door is left open. Rivian owners can download it starting in December 2022.


via Autoblog https://ift.tt/VcYv07P

December 20, 2022 at 09:31AM

Elon Musk and the Dangers of Censoring Real-Time Flight Trackers

https://www.wired.com/story/elon-musk-elonjet-flight-tracker-transparency/


I woke up Friday morning to the message I’d been expecting: “Your account, @Justin_Ling has been locked for violating the Twitter Rules.”

Below was the offending tweet: a link to one of the few websites that provide real-time private jet flight data that “chief twit” Elon Musk, I wrote, “hasn’t bullied into suppressing his flight data.”

Musk has accused these flight trackers of providing “basically assassination coordinates.” He has launched a crusade against these apps and anyone who shares them on his recently acquired social media platform. Accounts like mine were locked, while others were banned entirely—from the @ElonJet bot, which shared the location of Musk’s private plane, to reporters who picked up on his campaign. Twitter rules were rewritten on the fly to forbid publishing anyone’s “physical location.”

The chaotic few days prompted the European Union to warn Musk that silencing journalists would likely result in sanctions from EU regulators. US Representative Adam Schiff demanded that Musk reinstate the suspended accounts and explain to Congress why he decided to retaliate against the press in the first place.

As of Monday, following a poll asking users when he should lift the account suspensions, Musk reinstated some—but not all—of those accounts. 

Lost in the chaos is just how successful Musk has been at suppressing that real-time flight data on the internet. In so doing, he’s taking aim at an incredibly valuable source of information—which has helped researchers, journalists, and experts with everything from tracking Russian oligarchs to investigating the fate of missing aircraft to hunting down international hitmen. Musk isn’t the only one trying to keep this type of information out of the public’s hands. 

Both real-time and historical information on Musk’s main private jet—a 2015 Gulfstream G650ER, tail number N628TS—is conspicuously missing from the two main flight-tracking platforms: FlightAware and FlightRadar24.

FlightAware reports that its real-time data on Musk’s jet is unavailable “due to European government data rules,” while its historical data about the plane’s comings and goings was removed “per request from the owner/operator.” Looking up Musk’s jet on FlightRadar24 returns the message: “we could not find data.”

Even smaller tracking platforms, like AirportInfo—the account that led to my Twitter being locked—have taken Musk’s flight information offline.

“The ongoing hullabaloo about the location of Elon Musk’s airplane has caused us to stop displaying his plane at the moment,” says Christian Rommes, an AirportInfo administrator. “Because Musk is threatening legal action, we don’t want to take any risks.”

While Rommes says his office hasn’t heard from Musk’s legal team, they took the step as a precaution. “Don’t mess with the (former) richest man of the world,” he says.

Aircraft operators are required to report detailed information on their flight path to various national regulators, including the Federal Aviation Administration. That data is generally a matter of public record and is published to various websites popular amongst airline enthusiasts. 
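That openness can be seen directly: live transponder (ADS-B) data for a given aircraft can be pulled from free services such as the OpenSky Network’s public REST API. Below is a minimal sketch of such a query; the 24-bit hex transponder address is a placeholder, and a real one would come from the FAA registry.

```python
# Minimal sketch: query public ADS-B state data from the OpenSky Network's
# free REST API. The hex address below is a placeholder, not a real aircraft.
import requests

ICAO24 = "abc123"  # placeholder 24-bit ICAO hex address (lowercase)

resp = requests.get(
    "https://opensky-network.org/api/states/all",
    params={"icao24": ICAO24},
    timeout=10,
)
resp.raise_for_status()
data = resp.json()

# "states" is null when the aircraft is not currently broadcasting.
for state in data.get("states") or []:
    callsign = (state[1] or "").strip()
    lon, lat, baro_alt = state[5], state[6], state[7]
    print(f"{callsign}: lat={lat}, lon={lon}, altitude={baro_alt} m")
```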

Some companies, like FlightAware, augment government data with their own sources of real-time flight information. Other websites, like planespotters.net and airliners.net, allow users to submit photos taken of aircraft as they come and go around the world.

via Wired Top Stories https://www.wired.com

December 20, 2022 at 06:02AM

A Roomba recorded a woman on the toilet. How did screenshots end up on Facebook?

https://www.technologyreview.com/2022/12/19/1065306/roomba-irobot-robot-vacuums-artificial-intelligence-training-data-privacy/

In the fall of 2020, gig workers in Venezuela posted a series of images to online forums where they gathered to talk shop. The photos were mundane, if sometimes intimate, household scenes captured from low angles—including some you really wouldn’t want shared on the Internet. 

In one particularly revealing shot, a young woman in a lavender T-shirt sits on the toilet, her shorts pulled down to mid-thigh.

The images were not taken by a person, but by development versions of iRobot’s Roomba J7 series robot vacuum. They were then sent to Scale AI, a startup that contracts workers around the world to label audio, photo, and video data used to train artificial intelligence. 

They were the sorts of scenes that internet-connected devices regularly capture and send back to the cloud—though usually with stricter storage and access controls. Yet earlier this year, MIT Technology Review obtained 15 screenshots of these private photos, which had been posted to closed social media groups. 

The photos vary in type and in sensitivity. The most intimate image we saw was the series of video stills featuring the young woman on the toilet, her face blocked in the lead image but unobscured in the grainy scroll of shots below. In another image, a boy who appears to be eight or nine years old, and whose face is clearly visible, is sprawled on his stomach across a hallway floor. A triangular flop of hair spills across his forehead as he stares, with apparent amusement, at the object recording him from just below eye level.

The other shots show rooms from homes around the world, some occupied by humans, one by a dog. Furniture, décor, and objects located high on the walls and ceilings are outlined by rectangular boxes and accompanied by labels like “tv,” “plant_or_flower,” and “ceiling light.” 
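For a sense of what that data looks like once annotated, here is a minimal, hypothetical record for a single image; the schema and field names are illustrative guesses, not iRobot’s or Scale AI’s actual format.

```python
# Hypothetical object-detection annotation record for one image.
# Field names and layout are illustrative only.
annotation = {
    "image_id": "frame_000123",
    "width": 640,
    "height": 480,
    "objects": [
        # bbox = [x_min, y_min, box_width, box_height], in pixels
        {"label": "tv",              "bbox": [402, 110, 150, 90]},
        {"label": "plant_or_flower", "bbox": [25, 200, 60, 120]},
        {"label": "ceiling light",   "bbox": [300, 5, 80, 40]},
    ],
}

for obj in annotation["objects"]:
    print(obj["label"], obj["bbox"])
```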

iRobot—the world’s largest vendor of robotic vacuums, which Amazon recently acquired for $1.7 billion in a pending deal—confirmed that these images were captured by its Roombas in 2020. All of them came from “special development robots with hardware and software modifications that are not and never were present on iRobot consumer products for purchase,” the company said in a statement. They were given to “paid collectors and employees” who signed written agreements acknowledging that they were sending data streams, including video, back to the company for training purposes. According to iRobot, the devices were labeled with a bright green sticker that read “video recording in progress,” and it was up to those paid data collectors to “remove anything they deem sensitive from any space the robot operates in, including children.”

In other words, by iRobot’s estimation, anyone whose photos or video appeared in the streams had agreed to let their Roombas monitor them. iRobot declined to let MIT Technology Review view the consent agreements and did not make any of its paid collectors or employees available to discuss their understanding of the terms.

While the images shared with us did not come from iRobot customers, consumers regularly consent to having their data monitored to varying degrees on devices ranging from iPhones to washing machines. It’s a practice that has only grown more common over the past decade, as data-hungry artificial intelligence has been increasingly integrated into a whole new array of products and services. Much of this technology is based on machine learning, a technique that uses large troves of data—including our voices, faces, homes, and other personal information—to train algorithms to recognize patterns. The most useful data sets are the most realistic, making data sourced from real environments, like homes, especially valuable. Often, we opt in simply by using the product, as noted in privacy policies with vague language that gives companies broad discretion in how they disseminate and analyze consumer information. 

Did you participate in iRobot’s data collection efforts? We’d love to hear from you. Please reach out at tips@technologyreview.com. 

The data collected by robot vacuums can be particularly invasive. They have “powerful hardware, powerful sensors,” says Dennis Giese, a PhD candidate at Northeastern University who studies the security vulnerabilities of Internet of Things devices, including robot vacuums. “And they can drive around in your home—and you have no way to control that.” This is especially true, he adds, of devices with advanced cameras and artificial intelligence—like iRobot’s Roomba J7 series.

This data is then used to build smarter robots whose purpose may one day go far beyond vacuuming. But to make these data sets useful for machine learning, individual humans must first view, categorize, label, and otherwise add context to each bit of data. This process is called data annotation.

“There’s always a group of humans sitting somewhere—usually in a windowless room, just doing a bunch of point-and-click: ‘Yes, that is an object or not an object,’” explains Matt Beane, an assistant professor in the technology management program at the University of California, Santa Barbara, who studies the human work behind robotics.

The 15 images shared with MIT Technology Review are just a tiny slice of a sweeping data ecosystem. iRobot has said that it has shared over 2 million images with Scale AI and an unknown quantity more with other data annotation platforms; the company has confirmed that Scale is just one of the data annotators it has used. 

James Baussmann, iRobot’s spokesperson, said in an email the company had “taken every precaution to ensure that personal data is processed securely and in accordance with applicable law,” and that the images shared with MIT Technology Review were “shared in violation of a written non-disclosure agreement between iRobot and an image annotation service provider.” In an emailed statement a few weeks after we shared the images with the company, iRobot CEO Colin Angle said that “iRobot is terminating its relationship with the service provider who leaked the images, is actively investigating the matter, and [is] taking measures to help prevent a similar leak by any service provider in the future.” The company did not respond to additional questions about what those measures were. 

Ultimately, though, this set of images represents something bigger than any one individual company’s actions. They speak to the widespread, and growing, practice of sharing potentially sensitive data to train algorithms, as well as the surprising, globe-spanning journey that a single image can take—in this case, from homes in North America, Europe, and Asia to the servers of Massachusetts-based iRobot, from there to San Francisco–based Scale AI, and finally to Scale’s contracted data workers around the world (including, in this instance, Venezuelan gig workers who posted the images to private groups on Facebook, Discord, and elsewhere). 

Together, the images reveal a whole data supply chain—and new points where personal information could leak out—that few consumers are even aware of. 

“It’s not expected that human beings are going to be reviewing the raw footage,” emphasizes Justin Brookman, director of tech policy at Consumer Reports and former policy director of the Federal Trade Commission’s Office of Technology Research and Investigation. iRobot would not say whether data collectors were aware that humans, in particular, would be viewing these images, though the company said the consent form made clear that “service providers” would be.


“We literally treat machines differently than we treat humans,” adds Jessica Vitak, an information scientist and professor at the University of Maryland’s communication department and its College of Information Studies. “It’s much easier for me to accept a cute little vacuum, you know, moving around my space [than] somebody walking around my house with a camera.” 

And yet, that’s essentially what is happening. It’s not just a robot vacuum watching you on the toilet—a person may be looking too. 

The robot vacuum revolution 

Robot vacuums weren’t always so smart. 

The earliest model, the Swiss-made Electrolux Trilobite, came to market in 2001. It used ultrasonic sensors to locate walls and plot cleaning patterns; additional bump sensors on its sides and cliff sensors at the bottom helped it avoid running into objects or falling off stairs. But these sensors were glitchy, leading the robot to miss certain areas or repeat others. The result was unfinished and unsatisfactory cleaning jobs. 

The next year, iRobot released the first-generation Roomba, which relied on similar basic bump sensors and turn sensors. Much cheaper than its competitor, it became the first commercially successful robot vacuum.

The most basic models today still operate similarly, while midrange cleaners incorporate better sensors and other navigational techniques like simultaneous localization and mapping to find their place in a room and chart out better cleaning paths. 

Higher-end devices have moved on to computer vision, a subset of artificial intelligence that approximates human sight by training algorithms to extract information from images and videos, and/or lidar, a laser-based sensing technique used by NASA and widely considered the most accurate—but most expensive—navigational technology on the market today. 

Computer vision depends on high-definition cameras, and by our count, around a dozen companies have incorporated front-facing cameras into their robot vacuums for navigation and object recognition—as well as, increasingly, home monitoring. This includes the top three robot vacuum makers by market share: iRobot, which has 30% of the market and has sold over 40 million devices since 2002; Ecovacs, with about 15%; and Roborock, which has about another 15%, according to the market intelligence firm Strategy Analytics. It also includes familiar household appliance makers like Samsung, LG, and Dyson, among others. In all, some 23.4 million robot vacuums were sold in Europe and the Americas in 2021 alone, according to Strategy Analytics. 

From the start, iRobot went all in on computer vision, and its first device with such capabilities, the Roomba 980, debuted in 2015. It was also the first of iRobot’s Wi-Fi-enabled devices, as well as its first that could map a home, adjust its cleaning strategy on the basis of room size, and identify basic obstacles to avoid. 

Computer vision “allows the robot to … see the full richness of the world around it,” says Chris Jones, iRobot’s chief technology officer. It allows iRobot’s devices to “avoid cords on the floor or understand that that’s a couch.” 

But for computer vision in robot vacuums to truly work as intended, manufacturers need to train it on high-quality, diverse data sets that reflect the huge range of what they might see. “The variety of the home environment is a very difficult task,” says Wu Erqi, the senior R&D director of Beijing-based Roborock. Road systems “are quite standard,” he says, so for makers of self-driving cars, “you’ll know how the lane looks … [and] how the traffic sign looks.” But each home interior is vastly different. 

“The furniture is not standardized,” he adds. “You cannot expect what will be on your ground. Sometimes there’s a sock there, maybe some cables”—and the cables may look different in the US and China. 

[Illustration: a family bent over a vacuum; light emitting from the vacuum shines on their obscured faces. Credit: Matthieu Bourel]

MIT Technology Review spoke with or sent questions to 12 companies selling robot vacuums and found that they respond to the challenge of gathering training data differently. 

In iRobot’s case, over 95% of its image data set comes from real homes, whose residents are either iRobot employees or volunteers recruited by third-party data vendors (which iRobot declined to identify). People using development devices agree to allow iRobot to collect data, including video streams, as the devices are running, often in exchange for “incentives for participation,” according to a statement from iRobot. The company declined to specify what these incentives were, saying only that they varied “based on the length and complexity of the data collection.” 

The remaining training data comes from what iRobot calls “staged data collection,” in which the company builds models that it then records.

iRobot has also begun offering regular consumers the opportunity to opt in to contributing training data through its app, where people can choose to send specific images of obstacles to company servers to improve its algorithms. iRobot says that if a customer participates in this “user-in-the-loop” training, as it is known, the company receives only these specific images, and no others. Baussmann, the company representative, said in an email that such images have not yet been used to train any algorithms. 

In contrast to iRobot, Roborock said that it either “produce[s] [its] own images in [its] labs” or “work[s] with third-party vendors in China who are specifically asked to capture & provide images of objects on floors for our training purposes.” Meanwhile, Dyson, which sells two high-end robot vacuum models, said that it gathers data from two main sources: “home trialists within Dyson’s research & development department with a security clearance” and, increasingly, synthetic, or AI-generated, training data. 

Most robot vacuum companies MIT Technology Review spoke with explicitly said they don’t use customer data to train their machine-learning algorithms. Samsung did not respond to questions about how it sources its data (though it wrote that it does not use Scale AI for data annotation), while Ecovacs calls the source of its training data “confidential.” LG and Bosch did not respond to requests for comment.


Some clues about other methods of data collection come from Giese, the IoT hacker, whose office at Northeastern is piled high with robot vacuums that he has reverse-engineered, giving him access to their machine-learning models. Some are produced by Dreame, a relatively new Chinese company based in Shenzhen that sells affordable, feature-rich devices. 

Giese found that Dreame vacuums have a folder labeled “AI server,” as well as image upload functions. Companies often say that “camera data is never sent to the cloud and whatever,” Giese says, but “when I had access to the device, I was basically able to prove that it’s not true.” Even if they didn’t actually upload any photos, he adds, “[the function] is always there.”  

Dreame manufactures robot vacuums that are also rebranded and sold by other companies—an indication that this practice could be employed by other brands as well, says Giese. 

Dreame did not respond to emailed questions about the data collected from customer devices, but in the days following MIT Technology Review’s initial outreach, the company began changing its privacy policies, including those related to how it collects personal information, and pushing out multiple firmware updates.

But without either an explanation from companies themselves or a way, besides hacking, to test their assertions, it’s hard to know for sure what they’re collecting from customers for training purposes.

How and why our data ends up halfway around the world

With the raw data required for machine-learning algorithms comes the need for labor, and lots of it. That’s where data annotation comes in. A young but growing industry, data annotation is projected to reach $13.3 billion in market value by 2030. 

The field took off largely to meet the huge need for labeled data to train the algorithms used in self-driving vehicles. Today, data labelers, who are often low-paid contract workers in the developing world, help power much of what we take for granted as “automated” online. They keep the worst of the Internet out of our social media feeds by manually categorizing and flagging posts, improve voice recognition software by transcribing low-quality audio, and help robot vacuums recognize objects in their environments by tagging photos and videos. 

Among the myriad companies that have popped up over the past decade, Scale AI has become the market leader. Founded in 2016, it built a business model around contracting with remote workers in less-wealthy nations at cheap project- or task-based rates on Remotasks, its proprietary crowdsourcing platform. 

In 2020, Scale posted a new assignment there: Project IO. It featured images captured from the ground and angled upwards at roughly 45 degrees, and showed the walls, ceilings, and floors of homes around the world, as well as whatever happened to be in or on them—including people, whose faces were clearly visible to the labelers. 

Labelers discussed Project IO in Facebook, Discord, and other groups that they had set up to share advice on handling delayed payments, talk about the best-paying assignments, or request assistance in labeling tricky objects. 

iRobot confirmed that the 15 images posted in these groups and subsequently sent to MIT Technology Review came from its devices, sharing a spreadsheet listing the specific dates they were made (between June and November 2020), the countries they came from (the United States, Japan, France, Germany, and Spain), and the serial numbers of the devices that produced the images, as well as a column indicating that a consent form had been signed by each device’s user. (Scale AI confirmed that 13 of the 15 images came from “an R&D project [it] worked on with iRobot over two years ago,” though it declined to clarify the origins of or offer additional information on the other two images.)

iRobot says that sharing images in social media groups violates Scale’s agreements with it, and Scale says that contract workers sharing these images breached their own agreements. 


But such actions are nearly impossible to police on crowdsourcing platforms. 

When I ask Kevin Guo, the CEO of Hive, a Scale competitor that also depends on contract workers, if he is aware of data labelers sharing content on social media, he is blunt. “These are distributed workers,” he says. “You have to assume that people … ask each other for help. The policy always says that you’re not supposed to, but it’s very hard to control.” 

That means that it’s up to the service provider to decide whether or not to take on certain work. For Hive, Guo says, “we don’t think we have the right controls in place given our workforce” to effectively protect sensitive data. Hive does not work with any robot vacuum companies, he adds. 

“It’s sort of surprising to me that [the images] got shared on a crowdsourcing platform,” says Olga Russakovsky, the principal investigator at Princeton University’s Visual AI Lab and a cofounder of the group AI4All. Keeping the labeling in house, where “folks are under strict NDAs” and “on company computers,” would keep the data far more secure, she points out.

In other words, relying on far-flung data annotators is simply not a secure way to protect data. “When you have data that you’ve gotten from customers, it would normally reside in a database with access protection,” says Pete Warden, a leading computer vision researcher and a PhD student at Stanford University. But with machine-learning training, customer data is all combined “in a big batch,” widening the “circle of people” who get access to it.

[Image: screenshots shared with MIT Technology Review of data annotation in progress]

For its part, iRobot says that it shares only a subset of training images with data annotation partners, flags any image with sensitive information, and notifies the company’s chief privacy officer if sensitive information is detected. Baussmann calls this situation “rare,” and adds that when it does happen, “the entire video log, including the image, is deleted from iRobot servers.”

The company specified, “When an image is discovered where a user is in a compromising position, including nudity, partial nudity, or sexual interaction, it is deleted—in addition to ALL other images from that log.” It did not clarify whether this flagging would be done automatically by algorithm or manually by a person, or why that did not happen in the case of the woman on the toilet.

iRobot policy, however, does not deem faces sensitive, even if the people are minors. 

“In order to teach the robots to avoid humans and images of humans”—a feature that it has promoted to privacy-wary customers—the company “first needs to teach the robot what a human is,” Baussmann explained. “In this sense, it is necessary to first collect data of humans to train a model.” The implication is that faces must be part of that data.

But facial images may not actually be necessary for algorithms to detect humans, according to William Beksi, a computer science professor who runs the Robotic Vision Laboratory at the University of Texas at Arlington: human detector models can recognize people based “just [on] the outline (silhouette) of a human.” 

“If you were a big company, and you were concerned about privacy, you could preprocess these images,” Beksi says. For example, you could blur human faces before they even leave the device and “before giving them to someone to annotate.”

“It does seem to be a bit sloppy,” he concludes, “especially to have minors recorded in the videos.” 
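As a rough illustration of the preprocessing Beksi describes, the sketch below blurs detected faces before an image would ever be shared, using OpenCV’s stock Haar-cascade detector. It is a generic example under assumed tooling, not iRobot’s actual pipeline; a production system would likely use a stronger detector.

```python
# Sketch: blur any detected faces in an image before it leaves the device.
# Uses OpenCV's bundled Haar-cascade face detector (an assumption of tooling).
import cv2

def blur_faces(image_path: str, output_path: str) -> None:
    img = cv2.imread(image_path)
    if img is None:
        raise FileNotFoundError(image_path)
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

    cascade = cv2.CascadeClassifier(
        cv2.data.haarcascades + "haarcascade_frontalface_default.xml"
    )
    faces = cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)

    for (x, y, w, h) in faces:
        # Replace each detected face region with a heavy Gaussian blur.
        roi = img[y:y + h, x:x + w]
        img[y:y + h, x:x + w] = cv2.GaussianBlur(roi, (51, 51), 0)

    cv2.imwrite(output_path, img)

blur_faces("frame.jpg", "frame_blurred.jpg")
```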

In the case of the woman on the toilet, a data labeler made an effort to preserve her privacy, by placing a black circle over her face. But in no other images featuring people were identities obscured, either by the data labelers themselves, by Scale AI, or by iRobot. That includes the image of the young boy sprawled on the floor.

Baussmann explained that iRobot protected “the identity of these humans” by “decoupling all identifying information from the images … so if an image is acquired by a bad actor, they cannot map backwards to identify the person in the image.”

But capturing faces is inherently privacy-violating, argues Warden. “The underlying problem is that your face is like a password you can’t change,” he says. “Once somebody has recorded the ‘signature’ of your face, they can use it forever to find you in photos or video.” 

[Illustration: AI labels over the illustrated faces of a family. Credit: Matthieu Bourel]

Additionally, “lawmakers and enforcers in privacy would view biometrics, including faces, as sensitive information,” says Jessica Rich, a privacy lawyer who served as director of the FTC’s Bureau of Consumer Protection between 2013 and 2017. This is especially the case if any minors are captured on camera, she adds: “Getting consent from the employee [or testers] isn’t the same as getting consent from the child. The employee doesn’t have the capacity to consent to data collection about other individuals—let alone the children that appear to be implicated.” Rich says she wasn’t referring to any specific company in these comments. 

In the end, the real problem is arguably not that the data labelers shared the images on social media. Rather, it’s that this type of AI training set—specifically, one depicting faces—is far more common than most people understand, notes Milagros Miceli, a sociologist and computer scientist who has been interviewing distributed workers contracted by data annotation companies for years. Miceli has spoken to multiple labelers who have seen similar images, taken from the same low vantage points and sometimes showing people in various stages of undress. 

The data labelers found this work “really uncomfortable,” she adds. 

Surprise: you may have agreed to this 

Robot vacuum manufacturers themselves recognize the heightened privacy risks presented by on-device cameras. “When you’ve made the decision to invest in computer vision, you do have to be very careful with privacy and security,” says Jones, iRobot’s CTO. “You’re giving this benefit to the product and the consumer, but you also have to be treating privacy and security as a top-order priority.”

In fact, iRobot tells MIT Technology Review it has implemented many privacy- and security-protecting measures in its customer devices, including using encryption, regularly patching security vulnerabilities, limiting and monitoring internal employee access to information, and providing customers with detailed information on the data that it collects. 

But there is a wide gap between the way companies talk about privacy and the way consumers understand it. 

It’s easy, for instance, to conflate privacy with security, says Jen Caltrider, the lead researcher behind Mozilla’s “*Privacy Not Included” project, which reviews consumer devices for both privacy and security. Data security refers to a product’s physical and cyber security, or how vulnerable it is to a hack or intrusion, while data privacy is about transparency—knowing and being able to control the data that companies have, how it is used, why it is shared, whether and for how long it’s retained, and how much a company is collecting to start with. 

Conflating the two is convenient, Caltrider adds, because “security has gotten better, while privacy has gotten way worse” since she began tracking products in 2017. “The devices and apps now collect so much more personal information,” she says. 

Company representatives also sometimes lean on subtle distinctions, like the one between “sharing” data and selling it, that make how they handle privacy particularly hard for non-experts to parse. When a company says it will never sell your data, that doesn’t mean it won’t use it or share it with others for analysis.

These expansive definitions of data collection are often acceptable under companies’ vaguely worded privacy policies, virtually all of which contain some language permitting the use of data for the purposes of “improving products and services”—language that Rich calls so broad as to “permit basically anything.”


Indeed, MIT Technology Review reviewed 12 robot vacuum privacy policies, and all of them, including iRobot’s, contained similar language on “improving products and services.” Most of the companies to which MIT Technology Review reached out for comment did not respond to questions on whether “product improvement” would include machine-learning algorithms. But Roborock and iRobot say it would. 

And because the United States lacks a comprehensive data privacy law—instead relying on a mishmash of state laws, most notably the California Consumer Privacy Act—these privacy policies are what shape companies’ legal responsibilities, says Brookman. “A lot of privacy policies will say, you know, we reserve the right to share your data with select partners or service providers,” he notes. That means consumers are likely agreeing to have their data shared with additional companies, whether they are familiar with them or not.

Brookman explains that the legal barriers companies must clear to collect data directly from consumers are fairly low. The FTC, or state attorneys general, may step in if there are either “unfair” or “deceptive” practices, he notes, but these are narrowly defined: unless a privacy policy specifically says “Hey, we’re not going to let contractors look at your data” and they share it anyway, Brookman says, companies are “probably okay on deception, which is the main way” for the FTC to “enforce privacy historically.” Proving that a practice is unfair, meanwhile, carries additional burdens—including proving harm. “The courts have never really ruled on it,” he adds.

Most companies’ privacy policies do not even mention the audiovisual data being captured, with a few exceptions. iRobot’s privacy policy notes that it collects audiovisual data only if an individual shares images via its mobile app. LG’s privacy policy for the camera- and AI-enabled Hom-Bot Turbo+ explains that its app collects audiovisual data, including “audio, electronic, visual, or similar information, such as profile photos, voice recordings, and video recordings.” And the privacy policy for Samsung’s Jet Bot AI+ Robot Vacuum with lidar and Powerbot R7070, both of which have cameras, says the company will collect “information you store on your device, such as photos, contacts, text logs, touch interactions, settings, and calendar information” and “recordings of your voice when you use voice commands to control a Service or contact our Customer Service team.” Meanwhile, Roborock’s privacy policy makes no mention of audiovisual data, though company representatives tell MIT Technology Review that consumers in China have the option to share it. 

iRobot cofounder Helen Greiner, who now runs a startup called Tertill that sells a garden-weeding robot, emphasizes that in collecting all this data, companies are not trying to violate their customers’ privacy. They’re just trying to build better products—or, in iRobot’s case, “make a better clean,” she says. 

Still, even the best efforts of companies like iRobot clearly leave gaps in privacy protection. “It’s less like a maliciousness thing, but just incompetence,” says Giese, the IoT hacker. “Developers are not traditionally very good [at] security stuff.” Their attitude becomes “Try to get the functionality, and if the functionality is working, ship the product.” 

“And then the scandals come out,” he adds.

Robot vacuums are just the beginning

The appetite for data will only increase in the years ahead. Vacuums are just a tiny subset of the connected devices that are proliferating across our lives, and the biggest names in robot vacuums—including iRobot, Samsung, Roborock, and Dyson—are vocal about ambitions much grander than automated floor cleaning. Robotics, including home robotics, has long been the real prize.  

Consider how Mario Munich, then the senior vice president of technology at iRobot, explained the company’s goals back in 2018. In a presentation on the Roomba 980, the company’s first computer-vision vacuum, he showed images from the device’s vantage point—including one of a kitchen with a table, chairs, and stools—next to how they would be labeled and perceived by the robot’s algorithms. “The challenge is not with the vacuuming. The challenge is with the robot,” Munich explained. “We would like to know the environment so we can change the operation of the robot.” 

This bigger mission is evident in what Scale’s data annotators were asked to label—not items on the floor that should be avoided (a feature that iRobot promotes), but items like “cabinet,” “kitchen countertop,” and “shelf,” which together help the Roomba J series device recognize the entire space in which it operates. 

The companies making robot vacuums are already investing in other features and devices that will bring us closer to a robotics-enabled future. The latest Roombas can be voice controlled through Nest and Alexa, and they recognize over 80 different objects around the home. Meanwhile, Ecovacs’s Deebot X1 robot vacuum has integrated the company’s proprietary voice assistant, while Samsung is one of several companies developing “companion robots” to keep humans company. Miele, which sells the RX2 Scout Home Vision, has turned its focus toward other smart appliances, like its camera-enabled smart oven.

And if iRobot’s $1.7 billion acquisition by Amazon moves forward—pending approval by the FTC, which is considering the merger’s effect on competition in the smart-home marketplace—Roombas are likely to become even more integrated into Amazon’s vision for the always-on smart home of the future.

Perhaps unsurprisingly, public policy is starting to reflect the growing public concern with data privacy. From 2018 to 2022, there has been a marked increase in states considering and passing privacy protections, such as the California Consumer Privacy Act and the Illinois Biometric Information Privacy Act. At the federal level, the FTC is considering new rules to crack down on harmful commercial surveillance and lax data security practices—including those used in training data. In two cases, the FTC has taken action against the undisclosed use of customer data to train artificial intelligence, ultimately forcing the companies, Weight Watchers International and the photo app developer Everalbum, to delete both the data collected and the algorithms built from it. 

Still, none of these piecemeal efforts address the growing data annotation market and its proliferation of companies based around the world or contracting with global gig workers, who operate with little oversight, often in countries with even fewer data protection laws. 

When I spoke this summer to Greiner, she said that she personally was not worried about iRobot’s implications for privacy—though she understood why some people might feel differently. Ultimately, she framed privacy in terms of consumer choice: anyone with real concerns could simply not buy that device. 

“Everybody needs to make their own privacy decisions,” she told me. “And I can tell you, overwhelmingly, people make the decision to have the features as long as they are delivered at a cost-effective price point.”

But not everyone agrees with this framework, in part because it is so challenging for consumers to make fully informed choices. Consent should be more than just “a piece of paper” to sign or a privacy policy to glance through, says Vitak, the University of Maryland information scientist. 

True informed consent means “that the person fully understands the procedure, they fully understand the risks … how those risks will be mitigated, and … what their rights are,” she explains. But this rarely happens in a comprehensive way—especially when companies market adorable robot helpers promising clean floors at the click of a button.

Do you have more information about how companies collect data to train AI? Did you participate in data collection efforts by iRobot or other robot vacuum companies? We’d love to hear from you and will respect requests for anonymity. Please reach out at tips@technologyreview.com or securely on Signal at 626.765.5489. 

Additional research by Tammy Xu.

via Technology Review Feed – Tech Review Top Stories https://ift.tt/7MTwPL6

December 19, 2022 at 09:22AM

How AI-generated text is poisoning the internet

https://www.technologyreview.com/2022/12/20/1065667/how-ai-generated-text-is-poisoning-the-internet/

This story originally appeared in The Algorithm, our weekly newsletter on AI. To get stories like this in your inbox first, sign up here.

This has been a wild year for AI. If you’ve spent much time online, you’ve probably bumped into images generated by AI systems like DALL-E 2 or Stable Diffusion, or jokes, essays, or other text written by ChatGPT, the latest incarnation of OpenAI’s large language model GPT-3.

Sometimes it’s obvious when a picture or a piece of text has been created by an AI. But increasingly, the output these models generate can easily fool us into thinking it was made by a human. And large language models in particular are confident bullshitters: they create text that sounds correct but in fact may be full of falsehoods. 

While that doesn’t matter if it’s just a bit of fun, it can have serious consequences if AI models are used to offer unfiltered health advice or provide other forms of important information. AI systems could also make it stupidly easy to produce reams of misinformation, abuse, and spam, distorting the information we consume and even our sense of reality. It could be particularly worrying around elections, for example. 

The proliferation of these easily accessible large language models raises an important question: How will we know whether what we read online is written by a human or a machine? I’ve just published a story looking into the tools we currently have to spot AI-generated text. Spoiler alert: Today’s detection tool kit is woefully inadequate against ChatGPT. 

But there is a more serious long-term implication. We may be witnessing, in real time, the birth of a snowball of bullshit. 

Large language models are trained on data sets that are built by scraping the internet for text, including all the toxic, silly, false, malicious things humans have written online. The finished AI models regurgitate these falsehoods as fact, and their output is spread everywhere online. Tech companies scrape the internet again, scooping up AI-written text that they use to train bigger, more convincing models, which humans can use to generate even more nonsense before it is scraped again and again, ad nauseam.
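A toy calculation shows how quickly such a loop can compound; the mixing ratio here is an invented assumption used only to illustrate the dynamic, not a measurement of the real web.

```python
# Toy model: each scraping round keeps 80% of the existing corpus and adds
# 20% freshly generated model output. The numbers are made up for illustration.
synthetic_share = 0.0
for generation in range(1, 6):
    synthetic_share = 0.8 * synthetic_share + 0.2
    print(f"generation {generation}: {synthetic_share:.0%} synthetic")
```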

This problem—AI feeding on itself and producing increasingly polluted output—extends to images. “The internet is now forever contaminated with images made by AI,” Mike Cook, an AI researcher at King’s College London, told my colleague Will Douglas Heaven in his new piece on the future of generative AI models. 

“The images that we made in 2022 will be a part of any model that is made from now on.”

In the future, it’s going to get trickier and trickier to find good-quality, guaranteed AI-free training data, says Daphne Ippolito, a senior research scientist at Google Brain, the company’s research unit for deep learning. It’s not going to be good enough to just blindly hoover text up from the internet anymore, if we want to keep future AI models from having biases and falsehoods embedded to the nth degree.

“It’s really important to consider whether we need to be training on the entirety of the internet or whether there’s ways we can just filter the things that are high quality and are going to give us the kind of language model we want,” says Ippolito. 

Building tools for detecting AI-generated text will become crucial when people inevitably try to submit AI-written scientific papers or academic articles, or use AI to create fake news or misinformation. 

Technical tools can help, but humans also need to get savvier.

Ippolito says there are a few telltale signs of AI-generated text. Humans are messy writers. Our text is full of typos and slang, and looking out for these sorts of mistakes and subtle nuances is a good way to identify text written by a human. In contrast, large language models work by predicting the next word in a sentence, and they are more likely to use common words like “the,” “it,” or “is” instead of wonky, rare words. And while they almost never misspell words, they do get things wrong. Ippolito says people should look out for subtle inconsistencies or factual errors in texts that are presented as fact, for example. 
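As a crude sketch of the common-word signal Ippolito describes, the toy heuristic below measures what share of a text comes from a small set of very frequent words. The word list is an arbitrary assumption, and the measure is far too blunt for real detection; it only illustrates the idea.

```python
# Toy heuristic: fraction of words drawn from a tiny set of very common words.
# Word list and any cutoff are arbitrary; for illustration only.
COMMON = {"the", "it", "is", "a", "of", "and", "to", "in", "that", "was"}

def common_word_share(text: str) -> float:
    words = [w.strip(".,!?;:\"'").lower() for w in text.split()]
    words = [w for w in words if w]
    if not words:
        return 0.0
    return sum(w in COMMON for w in words) / len(words)

print(common_word_share("The cat is in the hat and it is happy."))
```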

The good news: her research shows that, with practice, we can train ourselves to better spot AI-generated text. Maybe there is hope for us all yet. 

Deeper Learning

A Roomba recorded a woman on the toilet. How did screenshots end up on Facebook?

This story made my skin crawl. Earlier this year my colleague Eileen Guo got hold of 15 screenshots of private photos taken by a robot vacuum, including images of someone sitting on the toilet, posted to closed social media groups. 

Who is watching? iRobot, the developer of the Roomba robot vacuum, says that the images did not come from the homes of customers but “paid collectors and employees” who signed written agreements acknowledging that they were sending data streams, including video, back to the company for training purposes. But it’s not clear whether these people knew that humans, in particular, would be viewing these images in order to train the AI. 

Why this matters: The story illustrates the growing practice of sharing potentially sensitive data to train algorithms, as well as the surprising, globe-spanning journey that a single image can take—in this case, from homes in North America, Europe, and Asia to the servers of Massachusetts-based iRobot, from there to San Francisco–based Scale AI, and finally to Scale’s contracted data workers around the world. Together, the images reveal a whole data supply chain—and new points where personal information could leak out—that few consumers are even aware of. Read the story here.

Bits and Bytes

OpenAI founder Sam Altman tells us what he learned from DALL-E 2 
Altman tells Will Douglas Heaven why he thinks DALL-E 2 was such a big hit, what lessons he learned from its success, and what models like it mean for society. (MIT Technology Review)

Artists can now opt out of the next version of Stable Diffusion
The decision follows a heated public debate between artists and tech companies over how text-to-image AI models should be trained. Since the launch of Stable Diffusion, artists have been up in arms, arguing that the model rips them off by including many of their copyrighted works without any payment or attribution. (MIT Technology Review)

China has banned many types of deepfakes 
The Cyberspace Administration of China has banned deepfakes that are created without their subject’s permission and that go against socialist values or disseminate “illegal and harmful information.” (The Register)

What it’s like to be a chatbot’s human backup
As a student, writer Laura Preston had an unusual job: stepping in when a real estate AI chatbot called Brenda went off-script. The goal was that customers would not notice. The story shows just how dumb the AI of today can be in real-life situations, and how much human work goes into maintaining the illusion of intelligent machines. (The Guardian)

via Technology Review Feed – Tech Review Top Stories https://ift.tt/7MTwPL6

December 20, 2022 at 05:14AM

SpaceX fires up Starship prototype again ahead of test flight (video)

https://www.space.com/spacex-starship-static-fire-ship-24-december-15


SpaceX continues to gear up for the first-ever orbital test flight of its Starship Mars rocket.

SpaceX performed another “static fire” test today (Dec. 15) at its South Texas facility, lighting up Ship 24, a prototype of Starship‘s 165-foot-tall (50 meters) upper-stage spacecraft.

One of Ship 24’s six Raptor engines ignited at 2:01 p.m. EST (1901 GMT; 1:01 p.m. local Texas time), firing for about seven seconds while the vehicle remained anchored to the ground. The brief test was captured on video by NASASpaceFlight and Rocket Ranch Boca Chica.

Related: SpaceX’s Starship Super Heavy booster test-fires record 14 engines (video)

Ship 24 will fly on the Starship program’s first orbital test flight, which SpaceX founder and CEO Elon Musk has said could lift off in the next few months. The prototype has already been through a static-fire gamut; it lit up all six of its Raptors in September, for example.

Ship 24 will lift off atop Booster 7, a prototype of Starship’s massive Super Heavy first stage. SpaceX has been putting Booster 7 through its paces as well. On Nov. 29, for instance, the giant rocket lit up 11 of its 33 Raptors. Fifteen days before that, the booster performed a 14-engine static fire.

SpaceX is developing the fully reusable Starship to take people and cargo to the moon and Mars, and to fly a variety of other missions closer to home. For example, SpaceX plans to launch the majority of its next-generation Starlink internet satellites using Starship, Musk has said.

SpaceX has already lined up some customers for Starship lunar flights. NASA picked the vehicle to be the first crewed lander for its Artemis moon program, and Japanese billionaire Yusaku Maezawa booked a Starship mission around the moon that will carry a handful of artists and influencers, including “Everyday Astronaut” Tim Dodd.

Mike Wall is the author of “Out There” (Grand Central Publishing, 2018; illustrated by Karl Tate), a book about the search for alien life. Follow him on Twitter @michaeldwall. Follow us on Twitter @Spacedotcom or on Facebook.


via Space https://ift.tt/s8pU716

December 15, 2022 at 04:23PM