Google held its annual I/O 2022 developer conference in mid-May, and the surprises included some revolutionary leaps in machine learning for several apps, including Translate, Search, and Google Assistant. The company also announced three new Pixel phones, a Pixel Watch, earbuds, and a Google tablet.
Google’s most recent developer conference, I/O 2022, held on May 11, offered a panoramic look through the Googleplex windows at what’s next for the search giant and its developers. As expected, there was a keynote from Alphabet CEO Sundar Pichai and another from Jeanine Banks, vice president for Developer X and head of developer relations, along with multiple presentations on what’s new in the apps and services and when we can expect new Google hardware. From tinkering with the legacy Google Search to surprising developments from the skunkworks, the afternoon moved quickly through the two-hour program. We’ll start with the recent progress in the Google apps and services and then look at Google’s half-dozen new hardware devices.
LANGUAGE AND AI
Google Translate has added 24 new languages to the real-time translation app including the indigenous American Quechua, Sanskrit, Tsonga, and Sorani Kurdish. That brings the current total supported by Translate to 133. It still leaves a “long tail of languages that are underrepresented on the web today” according to Pichai, and that’s due to a particularly difficult technical problem. “Translation models are usually trained with bilingual text—for example, the same phrase in both English and Spanish. However, there’s not enough publicly available bilingual text for every language. So with advances in machine learning, we’ve developed a monolingual approach where the model learns to translate a new language without ever seeing a direct translation of it. By collaborating with native speakers and institutions, we found these translations were of sufficient quality to be useful.”
The remaining tail is long—there are more than 7,150 languages still spoken around the world today. If the new monolingual approach to translation can be improved, many more will soon be added to the Google app.
Working in another sphere of knowledge, geospatial information, the 17-year-old Google Maps has already mapped about 1.6 billion buildings and over 60 million kilometers (37.3 million miles) of roads. Now, using computer vision and neural networks to interpret buildings from satellite imagery, the service has added more than 20% of the buildings on Google Maps with this improved vision.
Pichai also explained the new “high-fidelity representations of a place” called immersive view. These are made of billions of aerial and street-level images. Interior views of important sites as well as restaurants can even float you through rooms and halls. He said, “What’s amazing is that isn’t a drone flying in the restaurant—we use neural rendering to create the experience from images alone.”
Another new feature of Maps is eco-friendly routing. Available as a route option since last year in Canada and the United States, it gives you the most fuel-efficient route, “giving you the choice to save money on gas and reduce carbon emissions.” The feature will be available later this year in Europe. A similar add-on for Google Flights shows carbon-emission estimates for different flights between cities, along with other information like price and schedule.
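The routing idea is straightforward to sketch: run an ordinary shortest-path search, but weight each road segment by estimated fuel use instead of travel time. Here is a minimal, hypothetical illustration (the road graph and fuel figures are invented, and Google’s actual model accounts for far more, like traffic, grade, and engine type):

```python
import heapq

def cheapest_route(graph, start, goal):
    """Dijkstra's algorithm over edge weights interpreted as
    liters of fuel rather than minutes of travel time."""
    dist = {start: 0.0}
    prev = {}
    pq = [(0.0, start)]
    while pq:
        cost, node = heapq.heappop(pq)
        if node == goal:
            break
        if cost > dist.get(node, float("inf")):
            continue  # stale queue entry
        for nbr, fuel in graph.get(node, {}).items():
            new_cost = cost + fuel
            if new_cost < dist.get(nbr, float("inf")):
                dist[nbr] = new_cost
                prev[nbr] = node
                heapq.heappush(pq, (new_cost, nbr))
    # Reconstruct the path by walking predecessors backwards.
    path, node = [goal], goal
    while node != start:
        node = prev[node]
        path.append(node)
    return list(reversed(path)), dist[goal]

# Hypothetical road graph: edge weights are estimated liters of fuel.
roads = {
    "A": {"B": 2.0, "C": 1.0},   # fast highway vs. slower surface road
    "B": {"D": 1.0},
    "C": {"D": 1.5},
}
route, fuel = cheapest_route(roads, "A", "D")
print(route, fuel)  # → ['A', 'C', 'D'] 2.5
```

With time as the weight the highway via B might win; weighting by fuel, the surface-road route via C does.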
Three recent improvements in YouTube should significantly improve the value of informational videos. The developers noted that often you need to find a specific moment in a video, so last year they launched auto-generated chapters to make it easier to find those parts. Pichai explained, “We’re now applying multimodal technology from DeepMind. It simultaneously uses text, audio, and video to auto-generate chapters with greater accuracy and speed.” The goal now is to increase the number of videos with auto-generated chapters tenfold, from eight million today to 80 million over the next year.
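Google’s approach is a multimodal DeepMind model, but the core idea of chaptering, detecting where a video’s topic shifts, can be sketched with a single modality. This toy example (the transcript and threshold are invented) marks a chapter boundary wherever adjacent transcript sentences share too little vocabulary:

```python
import math
from collections import Counter

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words vectors."""
    dot = sum(a[w] * b[w] for w in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * \
           math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def chapter_breaks(sentences, threshold=0.2):
    """Mark a chapter boundary wherever adjacent sentences share
    too little vocabulary (similarity below the threshold)."""
    bags = [Counter(s.lower().split()) for s in sentences]
    return [i for i in range(1, len(bags))
            if cosine(bags[i - 1], bags[i]) < threshold]

transcript = [
    "welcome to the cooking show today we bake bread",
    "first we mix the flour and water to bake the bread",
    "now for sports the game last night was a thriller",
    "the home team won the game in overtime",
]
print(chapter_breaks(transcript))  # → [2], where cooking talk ends
```

The real system fuses audio and visual cues as well, which is why it can chapter videos that have sparse or noisy transcripts.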
Google is also using speech recognition models to transcribe videos and is offering these transcripts to all Android and iOS users. And to round out the textual tools, it is bringing auto-translated captions on YouTube to mobile. “[That] means viewers can now auto-translate video captions in 16 languages, and creators can grow their global audience.”
The advancements in language have also been applied to Google Workspace apps. If you’ve ever seen the abbreviation TL;DR (too long; didn’t read) in internet correspondence, you might sympathize or even be grateful for the alert that any comments aren’t based on an actual reading. Or you might be offended if you sent the document intending for it to be read. Google has a useful solution for this kind of situation. Pichai explained in a Google blog, “That’s why we’ve introduced automated summarization for Google Docs. Using one of our machine learning models for text summarization, Google Docs will automatically parse the words and pull out the main points.” Not an easy trick, he adds: “This marks a big leap forward for natural language processing. Summarization requires understanding of long passages, information compression and language generation, which used to be outside the capabilities of even the best machine learning tools.” Later this year, he promises, summarization will be added to Google Chat, and work has begun on adding it and transcription to Google Meet.
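Google’s summarizer is a generative model that writes new sentences, well beyond a short sketch. The simpler classical alternative, extractive summarization, just scores and selects existing sentences, and it conveys the flavor of the task (the scoring scheme here is deliberately naive and hypothetical):

```python
from collections import Counter

def extract_summary(text, n=1):
    """Score each sentence by the average document-wide frequency of
    its words, then keep the top-n sentences in original order."""
    sentences = [s.strip() for s in text.split(".") if s.strip()]
    freq = Counter(w.strip(".,") for w in text.lower().split())
    def score(sentence):
        tokens = sentence.lower().split()
        return sum(freq[t.strip(".,")] for t in tokens) / len(tokens)
    top = set(sorted(sentences, key=score, reverse=True)[:n])
    return ". ".join(s for s in sentences if s in top) + "."

doc = ("Machine learning models can summarize documents. "
       "Summarization models compress long documents into short summaries. "
       "The weather was pleasant.")
print(extract_summary(doc))  # → Machine learning models can summarize documents.
```

The off-topic weather sentence scores lowest because its words appear nowhere else in the document; generative models go further by compressing and rephrasing rather than merely selecting.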
A number of improvements are also coming for Google Assistant. Sissie Hsiao, writing on blog.google, described the new Look and Talk feature that will first roll out on Nest Hub Max. You can opt in and use both Face Match and Voice Match to have the device recognize you. Just look at the Nest Hub Max and start talking without the usual “Hey Google.” She notes that it takes both seeing you and recognizing your voice to start, and “video from these interactions is processed entirely on-device, so it isn’t shared with Google or anyone else…. It takes six machine learning models to process more than 100 signals from both the camera and microphone—like proximity, head orientation, gaze direction, lip movement, context awareness and intent classification—all in real time.”
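As a toy illustration of that kind of signal fusion (the signals, weights, and threshold below are all invented; the real system runs six on-device models over 100-plus signals), an activation decision might combine a handful of engagement scores like this:

```python
def should_activate(signals, threshold=0.75):
    """Combine a few hypothetical engagement signals into one weighted
    score; activate only when the score clears the threshold."""
    weights = {
        "face_match": 0.35,      # is this a recognized, opted-in user?
        "voice_match": 0.25,
        "gaze_on_device": 0.25,  # is the user actually looking at it?
        "proximity": 0.15,
    }
    score = sum(weights[k] * signals.get(k, 0.0) for k in weights)
    return score >= threshold

# A recognized user looking straight at the device from nearby:
print(should_activate({"face_match": 1.0, "voice_match": 0.9,
                       "gaze_on_device": 1.0, "proximity": 0.8}))  # → True
```

Requiring several strong signals at once is what lets the device skip the wake word without firing on every passing glance.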
Pichai explained the importance of language research at Google. The developers conference included a demo of its LaMDA 2 generative language model. “LaMDA 2 will be available inside an app called AI Test Kitchen.” Currently the model can hold a conversation or even run a text-based adventure. “We’re continually working to advance our conversational capabilities. Conversation and natural language processing are powerful ways to make computers more accessible to everyone. And large language models are key to this.”
The most anticipated news from Google at I/O 2022 was about when the three new phones and any other upcoming tech would be available for purchase. The announcements all bore the same tag: coming soon. The Pixel 6a phone will be available this summer; the Pixel 7 and 7 Pro phones and a Pixel Watch are all coming this fall; and a Google tablet will be appearing sometime next year.
The Pixel 6a is moderately priced (from $449) and has a battery that can last more than 24 hours. Its Google Tensor processor is the same one found in the Pixel 6 Pro. The camera has features like Face Unblur to sharpen out-of-focus faces, Magic Eraser to remove unwanted objects in the frame, and Night Sight for low-light situations. You can sign up for notifications about availability, which currently looks like July.
The Pixel 7 and 7 Pro will have dual and triple cameras, respectively, mounted in a large metal camera bar on the back. The 7 Pro’s third camera is possibly a telephoto lens. Most of the information about both phones is speculative right now; even the processing chip remains an unknown next-generation Tensor. If the release follows the usual launch window for flagship Pixels, expect October to end the wait.
Scheduled for a fall debut, maybe October, the Pixel Watch will run Wear OS and will likely have a fair amount of fitness tracking from Fitbit, which Google agreed to acquire in November 2019. Judging from the photos, it will have a clean, minimalist design.
Pixel Buds Pro will have active noise cancellation with something called Silent Seal. A control triggers a transparency mode, which lets you hear what’s going on around you when you need to. The earbuds will have wireless charging and will provide up to 31 hours of listening time with the charging case. A “Hey Google” prompt will get you to the Google Assistant. Beamforming microphone arrays shape the sound, and wind-blocking mesh covers the microphones. From $199, the Buds Pro will come in four colors (see the Google photo above).
A DURABLE MISSION
In the Google blog, Pichai identified what has been the primary goal for Google from the beginning. “Nearly 24 years ago,” he wrote, “Google started with two graduate students, one product, and a big mission: to organize the world’s information and make it universally accessible and useful.” Larry Page and Sergey Brin are no longer directly at the helm, but their indexing factory has expanded into a true 21st-century Menlo Park where anything digital is likely to end up on one of the workbenches.