The Internet’s Streetlight: What AI Doesn’t See


I’m writing from Rimini with the salty air coming off the Adriatic and the soft rasp of my wheelchair tyres on old parquet. The coffee on my desk smells bitter and brave, like every half-finished idea. I’ve spent years as President of Free Astroscience trying to make complex science feel close—like sand warming under your palms at sunset—yet tonight I’m wrestling with a simple, uncomfortable truth. We’ve built an internet that shines bright in some places and leaves others in the dark, and generative AI just follows the light.

I say this as a blogger who loves technology and as a young man who knows what it’s like to navigate systems designed for someone else’s body. The glow from my screen is cool on my fingers; the gap it hides is warmer, human, and messy. If we don’t name that gap now, the future will hear only an echo of the present—thinner, flatter, quieter.



Three Things We’re Getting Wrong

It’s tempting to believe the internet holds almost everything worth knowing; the keyboard clacks agreeably and search bars purr. It’s tempting to believe AI is neutral, a clean mirror with a hospital-white shine. It’s tempting to believe more data will fix everything, like pouring more water into a dry pot and expecting soup by magic.

Those ideas feel good because they’re smooth, like glass façades in morning light—modern, efficient, universal. But smooth isn’t always true, and quiet isn’t always fair. If we keep cuddling those assumptions, the next generation will inherit a library where the loudest books are always the ones on display.

One Number That Changes Everything

Here’s the stat that keeps tapping my shoulder like a drizzle on warm stone: English is about 44% of a giant web dataset called Common Crawl, while Hindi sits around 0.2% and Tamil around 0.04%—despite hundreds of millions of speakers between them. That’s not a rounding error; that’s a canyon you can hear in the hollow echo of a search result. Because AI learns from what’s online, it inherits those gaps—and then politely repeats them with confidence.
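If you like to see the canyon as numbers, here is a tiny sketch using just the shares quoted above. (Common Crawl proportions shift from crawl to crawl, so treat the figures as illustrative, not canonical.)

```python
# Web-data shares quoted in this piece, as percentages of Common Crawl.
shares = {"English": 44.0, "Hindi": 0.2, "Tamil": 0.04}

# How many times more English text there is than each other language:
for lang, share in shares.items():
    gap = shares["English"] / share
    print(f"{lang:7s} {share:6.2f}%  ~{gap:,.0f}x less data than English")
```

Run it and English comes out with roughly 220 times the data of Hindi and about 1,100 times the data of Tamil. That is the canyon, in two lines of arithmetic.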

In plain language, the streetlight isn’t shining where many people dropped their keys. The air smells faintly of hot circuitry and dust because we keep looking where it’s easy, not where it’s right. If we want a wiser tomorrow, we must point the light at the ground we’ve ignored.

What’s Actually Going On (Lightly Simplified)

A large language model is a pattern-predictor trained on piles of text; think of it as a very fast storyteller that has read millions of pages and learned which words often follow which. When some languages and traditions are barely present in those pages, the model forms weaker “memories” of them—like trying to recall a song you heard once from a neighbour’s open window. The model also leans into the most common patterns; that’s how it stays fluent. Researchers call this tendency to over-favour the frequent mode amplification—jargon aside, it just means the popular gets more popular.
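To feel mode amplification in your hands, here is a toy sketch. The phrases and the 60/40 split are invented for illustration; real models work over probabilities, not a two-entry table, but the mechanism is the same: when a model always reaches for its most likely continuation, a majority in the data becomes a monopoly in the output.

```python
# Toy next-word table: imagine a corpus where, after "the remedy is",
# one continuation appears 60% of the time and another 40%.
# (Hypothetical counts, for illustration only.)
next_word = {"pill": 60, "herb": 40}

# A model that greedily picks the single most likely continuation:
prediction = max(next_word, key=next_word.get)
print(prediction)  # the 40% answer never appears at all
```

A 60/40 split in the training data becomes 100/0 in the output. Now imagine the split is 44% to 0.04%, and you can hear why the quiet voices go silent first.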

Then we fine-tune the model with human feedback to make it safe and helpful. That feedback is given by people with specific values, contexts, and incentives; it carries the air-con hum of one particular room, and that room quietly stands in for the whole world. I’m simplifying the science here on purpose, so you can feel the shape without the maths—because the point isn’t to memorise equations, it’s to notice what’s missing.

Why This Matters In Real Life

Language is not just a tool; it’s a pantry of local wisdom that smells like fresh basil, wood smoke, or river mud depending on where you grew up. One careful study found that more than 75% of 12,495 medicinal plant uses were unique to a single local language—imagine how quickly those remedies vanish when the language shrinks offline. When AI overlooks a language, it overlooks the practices, plants, and places braided into that language. That silence is not neutral; it’s a door closed with a soft click.

And in our own cities, we’ve learned the hard way that imported “universal” solutions can run hot. Glass-skinned buildings glint beautifully at noon, yet in tropical climates they trap heat and demand more cooling—the hiss of compressors grows louder as streets shimmer. The future needs cool heads and cooler rooms, but not by repeating yesterday’s mistakes with shinier code.

A Story I Can’t Shake

In the Aeon essay that sparked this piece, the author describes a family debate—hospital corridors smelling of disinfectant, kitchen counters sticky with herbal oil—and how online guidance pointed one way while lived tradition tugged another. The tumour story ends in a twist, and the lesson isn’t “trust herbs” or “trust hospitals,” but “trust that the web is not the world.” You can almost hear the clatter of pharmacy boxes next to the quiet of a village clinic. That friction reminds me that humility—in science, in storytelling, in policy—is not weakness; it’s the space where better questions begin.

If we want an internet worthy of our children, we’ll need models that breathe where people actually live. That means ears open to rough edges and languages with different rhythms, like the roll of Romagnolo dialect on a Saturday market. The future will thank us for training our machines to listen before they speak.

So, What Do We Do Now?

I don’t want to leave you with a sigh and a shrug. Tonight the breeze through the window smells like rain on hot pavement, and it feels like a permission slip. Start small and local: publish that recipe your grandmother mumbles in dialect, with photos and the sound of her stirring; translate the name of the plant your neighbour uses for a stubborn cough; record the “how” and the “why,” not just the “what.” When you ask an AI for help, ask it to show sources in your language and in the local language of the place you’re researching; make it sweat a little, like a cyclist pushing into a headwind.

If you write, write bilingually when you can; if you teach, assign readings beyond the usual suspects; if you code, prioritise datasets that smell like soil and sea—not just boardrooms. None of this is glamorous. It’s tacky labels on jam jars, the scratch of pen on paper, the rattle of a cheap scanner. But in ten years, that noise will sound like a choir.

A Simple Rimini Experiment

Here’s my tiny plan for the month—no hype, just action that fits in your hands. I’ll sit with three elders from different neighbourhoods, in cafés where the espresso steams and the chairs wobble, and I’ll ask about one practice each that kept their families afloat—a way of storing water, a trick for cooling rooms without power, a garden remedy that kept colds at bay. I’ll publish their words in Italian and English, with notes a curious teenager could follow, and I’ll link the audio so their voices—grainy, warm, playful—stay findable. Then I’ll prompt an AI with those pages and see what changes.

If this works in Rimini, it can work in Nairobi, Kochi, or Tirana. The future isn’t a monolith; it’s a patchwork quilt that smells like many kitchens. Let’s stitch louder.

The Takeaway, If You Remember One Thing

The web is not the world—and AI learns from the web. When 44% of the digital fuel is English and vast languages sit at 0.2% or 0.04%, the engine hums with a bias you can hear. So our job is to change the fuel: create, translate, preserve, and publish the knowledge that breathes in under-documented languages, then demand systems that ingest it with care. This piece simplified thorny AI ideas on purpose; if you want the deeper dive that inspired me, read the Aeon essay and notice how the author frames missing voices, skewed datasets, and the risk of “knowledge collapse.” The room grows quieter when we stop listening; our task is to turn the volume back up, kindly.

Closing—From Streetlight To Sunrise

I picture a dawn over the sea, a pale gold that smells like salt and fresh bread, and I imagine our tools finally learning to see beyond the circle of easy light. That’s the future I want: machines trained not only on what’s bright, but on what’s true. If we start today—softly, locally, stubbornly—the map of knowledge our children inherit will feel less like a mall and more like a forest after rain.
