Last weekend, I turned to Google Search for help figuring out how many stamps I needed to put on an 8-ounce piece of mail. (Naturally, I was sending a copy of the latest issue of Startup!). It’s the exact sort of question that I hoped Google Search’s new generative AI feature, which I’ve been testing for the past month, would solve much faster than I could through my own browsing.
Google’s clunkily named Search Generative Experience, SGE for short, infuses its search box with ChatGPT-like conversational functionality. You can sign up at Google’s Search Labs. The company says it wants users to converse with its search chatbot, which launched to testers in May, to dive deeper into topics and ask more challenging and intuitive questions than they would type into a boring old query box. And AI-generated answers are meant to organize information more clearly than a traditional search results page—for example, by pulling together information from multiple websites. Most of the world’s web searches run through Google, and it’s been developing AI technologies longer than most companies, so it’s fair to expect a top-notch experience.
So goes the theory. In practice, the new feature has so far proven more nuisance than aid. It’s slow, ineffective, verbose, and cluttered—more artificial interference than intelligence.
Once you gain access to Google’s test, the search box looks unchanged. But in response to a query like “How many stamps to mail 8 ounce letter,” a new section takes up a good chunk of the screen, pushing down the conventional list of links. Within that area, Google’s large language models generate a couple of paragraphs similar to what you might find from ChatGPT or Microsoft’s Bing Chat. Buttons at the bottom lead to a chatbot interface where you can ask follow-up questions.
The first thing I noticed about Google’s vision for the future of search was its sluggishness. In tests where I controlled a stopwatch app with one hand and submitted a query with the other, it sometimes took nearly six seconds for Google’s text generator to spit out its answer. The norm was more than three seconds, compared to no more than one second for Google’s conventional results to appear. Things could have been worse: I did my tests after Google rolled out an update last month that it claims doubled the search bot’s speed. Yet I still often find myself deep into reading the regular results by the time the generative AI finishes up, meaning I end up ignoring its tardily submitted dissertations. Cathy Edwards, a Google Search vice president, tells me speed optimizations of the AI software underpinning the tool are ongoing.
One could excuse the slowness of this new form of search if the results were worthwhile. But accuracy is spotty. Google’s five-sentence generative AI response to my stamps question included apparent errors of both multiplication and subtraction, stamp prices outdated by two years, and suggested follow-up questions that ignored crucial variables for shipping costs, such as shape, size, and destination. The disclaimer Google displays at the top of each AI-generated answer rang resoundingly true: “Generative AI is experimental. Info quality may vary.”
In the same response, Google’s new search feature suggested that I would need either $2.47 or $4 worth of stamps. Navigating to the US Postal Service’s online calculator provided the official answer: I needed $3.03, or five stamps at 66 cents each with a 27-cent overpayment. Google’s Edwards says my humble query pushed the technology’s current boundaries. “It’s definitely on the frontier,” she says.
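The arithmetic here is simple enough that it’s striking the chatbot fumbled it. A minimal sketch of the calculation, assuming the $3.03 postage from the USPS calculator and the 66-cent first-class stamp price:

```python
import math

# Figures from the example above: official postage of $3.03,
# paid with 66-cent stamps. (Values hardcoded for illustration.)
postage_cents = 303
stamp_cents = 66

# Round up, since you can't use a fraction of a stamp.
stamps_needed = math.ceil(postage_cents / stamp_cents)
overpayment = stamps_needed * stamp_cents - postage_cents

print(stamps_needed)  # 5 stamps
print(overpayment)    # 27 cents overpaid
```

Five stamps at 66 cents comes to $3.30, overshooting the $3.03 postage by 27 cents—neither of the figures ($2.47 or $4) that Google’s AI offered.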
Unfortunately, dumbing down my question didn’t go any better. When asked for just the price of a stamp, Google responded with an outdated figure. Only when I specified that I wanted the price as of this month did the system correctly reflect the recent 3-cent rate hike. To be fair, ChatGPT would flunk this query too, because its training data cuts off in 2021—but it is not positioned as a replacement for a search engine.