Burning Life is around until October 5th. So many people have written about what it means for the Second Life® virtual world (the idea of a virtual world built from user-generated content allegedly comes from Philip’s fascination with the real-life “Burning Man” event) that I can only draw attention to this true festival of pure creativity, which has been growing over the years to a whopping 22 sims this year. People from all over the world put up their creative builds and show off the best of the best, in a way that we don’t often see in SL.
But flipping the coin of creativity and innovation, we also have Orange Island’s Innovation Week starting this Monday. This is pretty much a mix of evangelists, enthusiasts, visionaries, techies, and creative people who talk about Second Life and how its technology has changed over time and led people to come up with fantastic new concepts that were never possible before. Just imagine, three years ago nobody knew what a “mixed-media” event was! These days, we have them all the time, thanks to the intriguing uses that people have found for SL’s built-in voice, video, and streaming capabilities.
And finally we have this intriguing notion that SL is truly global: an environment for creativity and innovation that skips all barriers of language and geographic distance. Very recently I was trying to hire another project manager for my company, a person living in the middle of the Atlantic. He asked me if being 1600 km away from the offices in Lisbon (and 4000 km from the offices in New York) would be a problem. I told him, to his great relief and delight, that he would actually be one of the “closest” people working with the two-person team in Lisbon (since almost everybody else lives further away). Yes, for SL residents, the (real) world just became tiny and interconnected.
The other day I was browsing some products on SL Exchange and I saw an interesting advert: a device of some sort that allowed you to translate notecards into other languages. Nothing complex, since it obviously uses the Google Translator Services or a similar service. What was interesting was the claim that “if you have your product information only in one language, you’re losing 75% of your customer base”. The numbers are obviously wrong: even if the creator of this device was thinking about English, more than half of all residents in SL speak English at least as a second language. Native speakers of English are probably 30 or 35% or so. Granted, native Portuguese or Japanese speakers are almost that numerous as well, so there is a point to be made: business in SL is truly international, and people can no longer afford to target just one language market.
One major reason (it’s not the only one) for several merchants to complain about a “loss in sales” is that the number of non-English speakers is growing far faster than the number of English speakers. And these have their own communities, their own content creators, their own shops, their own blogosphere, and everything there is written in a language that is not English. Content creators limited to just English (or any other single language, really) will be unable to target the whole of Second Life. And there is also an interesting phenomenon: most (and the best!) content creators in SL are fluent English speakers. This means that an average-quality content creator who targets the Urdu-speaking community will enjoy a lot of sales from there, even if their content is nothing special. Nobody else in the world speaks Urdu, so they have in fact a niche market to explore.
This is sort of “globalisation in reverse”. Instead of targeting a single market and hiring labour where it’s cheaper, clever content creators target the whole market, exploring small niches of restricted content in a specific language, because the competition will not go there. Yet. People are heavily investing in translations, from very low-quality ones (provided by automatic translation services) to professional jobs that are costly, but might very well open a window of opportunity to markets where SL still grows significantly, and which, more importantly, are little explored.
Thinking about this, I looked at my portfolio of Utterly Useless Devices being offered for sale in-world and from online SL webshops. It seems that the most recent item I still sell is from late 2005 or 2006 or so. I have a lot of semi-abandoned personal projects — there is simply no more time for me to turn them into finished products — and most of them are even more useless than my usual gadgets 🙂
It’s hardly worth beginning the Innovation Week without being innovative myself!
Full of guilt, I spent the wee hours of my insanely busy weekend (there is no time left for anything unless I cut into my sleeping hours, which I did 🙁 ) trying to quickly cook up two simple items. One was inspired by Talk Like a Pirate Day: an English-to-pirate translator, with two “dialects” to pick from. There are dozens of websites doing these kinds of translations, but sadly none of them offer an API, so it means using the old method of grabbing a web page in the background and parsing it for the translation, digging through layers of JavaScript to figure out where the webmaster is making those nice calls to their translation application.
Obviously, LSL is not up to this task: it’s simply too slow and hasn’t got enough free memory to do it. So I left the processing to an external PHP script, running from this very same website, and let LSL call my script instead. It works like a charm, and the beauty of it is that I can tweak the script at will without needing to hand out updated versions of the English-to-Pirate translator. Thus, the GUUD Simple Pirate Translator was born 🙂 It’s not even HUD-based: you can attach it anywhere you like, since it’s a single invisible prim.
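To make the split between LSL and the web server a bit more concrete, here is a minimal sketch of what the server side might look like. It is written in Python rather than PHP, purely for illustration, and it is not the actual script: the names `handle_request`, `to_pirate` and `PIRATE_WORDS` are made up, the tiny word table is a stand-in for the page-scraping described above, and the 2048-byte reply cap is an assumption about how much text an LSL script can comfortably receive.

```python
import re
from urllib.parse import parse_qs

# Hypothetical word table. The real service scraped existing
# pirate-translator websites instead of using a local dictionary.
PIRATE_WORDS = {
    "hello": "ahoy",
    "my": "me",
    "friend": "matey",
    "yes": "aye",
    "you": "ye",
}

# Assumed cap on the reply size, so the in-world script never
# receives more text than it has memory for.
MAX_REPLY_BYTES = 2048


def to_pirate(text):
    """Replace known words, keeping a leading capital letter."""
    def swap(match):
        word = match.group(0)
        replacement = PIRATE_WORDS.get(word.lower())
        if replacement is None:
            return word
        return replacement.capitalize() if word[0].isupper() else replacement

    return re.sub(r"[A-Za-z']+", swap, text)


def handle_request(query_string):
    """Turn a query string like 'text=Hello+my+friend' into a short plain-text reply."""
    params = parse_qs(query_string)
    text = params.get("text", [""])[0]
    reply = to_pirate(text)
    # Truncate on a byte budget; drop any half-cut UTF-8 character at the end.
    return reply.encode("utf-8")[:MAX_REPLY_BYTES].decode("utf-8", errors="ignore")


print(handle_request("text=Hello+my+friend"))
```

The point of the design is that the in-world object only ever sends a short string and receives a short string back; all the slow, memory-hungry work stays on the web server, where it can be changed without touching the object.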
Well, as a proof of concept, something that was done in a few hours (most of them spent making a nice picture to go with it for the vendors and online shops!) ought to be freely given away. Alas (or should I say “Avast”?), every time someone uses the translator it requires an external web call, which takes the tiniest little bit of bandwidth from my available monthly traffic, so I have to take that into account.
I’m aware that there are far better English-To-Pirate translators available! Worse than that: I totally missed the Talk Like a Pirate Day again, so there went my opportunity to get rich! Arrrgh, shiver me timbers, and another bottle o’ rum for me to forget my woes… at least I’ll be ready for 2009!
It was time to do something bolder. Hamlet Au wrote over two years ago about the first generation of in-world personal translators. At that time I did warn him that no matter how cool those devices are, the results are really quite poor. In fact, you have to be bilingual to fully understand how hilarious those things are — much more appropriate for having a good round of fun (at the expense of the clueless person using the translator!) than using them seriously.
Well, how do they work? Obviously enough, nobody sells a cheap L$500 or L$1000 device in Second Life having developed a whole automated translation technology of their own. Translation software is insanely expensive because it takes ages for a bunch of linguists to work together with AI experts to produce something that works more or less, most of the time (sometimes good enough to do specialised translations, like in the legal world: the European Union uses automated translation systems to have most of the European legislation available in several languages). Obviously, nobody would build multi-million-dollar software to sell L$500 HUDs!
No, what almost all these people neglect to tell their customers is that they use publicly available web-based translation services (Babelfish was very popular on the first-generation translation HUDs), and just create an LSL interface (from a HUD) to integrate with these. “Babblers”, as those first-generation translation HUDs were known in SL, are quite crude, and rude too, because they spam everybody’s chat all the time. Nothing is more nerve-racking than standing surrounded by Mentors trying desperately to talk with their “Babblers” to a handful of newbies, and watching (in despair!) how each “Babbler” picks up the text of every other “Babbler” and provides a hilarious translation (especially if several different languages are being used in the same area), in an endless loop of chat spam. It can drive you nuts, and it’s much worse if you can actually understand the different languages!
Second-generation translation HUDs are a bit better. They still use popular web-based services, but they keep the chat channel clean. You usually have a selection of “translation pairs” to pick from, and you use a “hidden” channel to type in your mother language and let the HUD “speak on your behalf”. Yes, the text will come out green, but at least it will be written just once. And you can talk naturally to any other person around you in your native language without the translator picking up on your text and adding to the chat spam.
On the reverse side (picking up on what people are saying), nobody is really interested in reading how funny the translations are, especially in very busy areas. So the second-generation translators just send the translation to the HUD wearer and to nobody else (let the others buy their own translator HUDs!… that’s a market opportunity for the HUD creators 🙂 ). Some use regular chat that only the HUD wearer can hear (e.g. using llOwnerSay()), which is the preferred (and easiest) option. You might also get IMs from the HUD (especially on older HUDs, created before llOwnerSay() was available in LSL), which come with a slight delay. And the more creative HUDs display the text in the HUD itself, using clever hovertext techniques, and even allow you to pick from a list of nearby avatars which ones you wish to translate, using different languages for them. Quite good for a L$500-1000 device!
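The routing rules of such a second-generation HUD can be sketched in a few lines. This is a Python illustration and not LSL, and not taken from any actual product: the function `route`, the channel number 7 and the two translator callbacks are all hypothetical. The only thing borrowed from SL itself is that channel 0 is public chat.

```python
PUBLIC_CHANNEL = 0   # public chat in SL
HIDDEN_CHANNEL = 7   # hypothetical private channel the wearer types on


def route(channel, speaker, owner, text, translate_out, translate_in):
    """Decide what the HUD does with one chat message.

    Returns ("say", text) for text spoken publicly on the wearer's behalf,
    ("owner_only", text) for text shown only to the wearer (the role
    llOwnerSay() plays in-world), or None when the message is ignored.
    """
    if channel == HIDDEN_CHANNEL and speaker == owner:
        # The wearer typed in their own language: speak the translation once.
        return ("say", translate_out(text))
    if channel == PUBLIC_CHANNEL and speaker != owner:
        # Somebody else spoke: translate privately, adding nothing to public chat.
        return ("owner_only", translate_in(text))
    # The wearer's own public chat is left alone, so they can talk
    # normally to people who share their language.
    return None


# Stand-in translators, just to show the flow.
to_english = lambda t: "[en] " + t
to_portuguese = lambda t: "[pt] " + t

print(route(HIDDEN_CHANNEL, "wearer", "wearer", "bom dia", to_english, to_portuguese))
print(route(PUBLIC_CHANNEL, "visitor", "wearer", "good morning", to_english, to_portuguese))
print(route(PUBLIC_CHANNEL, "wearer", "wearer", "olá a todos", to_english, to_portuguese))
```

Because incoming translations never reach public chat, two HUDs standing next to each other have nothing of each other’s to re-translate, which is what breaks the “Babbler” loop.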
The latest generation of HUDs is cleverer still. Google recently launched a “detection” system, where their translation services will attempt to figure out on their own in what language a web page is written, and translate accordingly. Why is that important? For someone used to, say, Romance languages, it’s hard to distinguish Ukrainian from Russian (both are written in Cyrillic), or even Japanese from Chinese (both can use Chinese ideograms). So if you get the language wrong, you end up with pure gibberish. Or, worse, you might even be very offensive and rude, since similarly spelled words (using the same characters) might mean completely different things, even if for you they look the same! Automated translators will never endow you with the cultural background to tell you if you’re speaking correctly. And although Japanese are used to the rudeness of us gaijin, and Chinese live peacefully with us foreign demons, other cultures might not be so tolerant (especially if the resident is not aware that you’re using a translator!).
Well, Google is by far the most feature-rich service (they also rely on human beings to provide them with more accurate translations; Google found out long ago that there is a limit to what you can code, and that humans are much better at tagging things correctly), and, best of all, they have a lovely Translation API. Yes, it has webpages in mind; but the awesome Google developers figured out that things like Flash-based sites will not be happy with HTML (or JavaScript), so they have a way to send you the raw translation in plain UTF-8-encoded text.
As it should, the API encodes its replies in JSON, not only to comply with standards, but also because almost all modern languages fully support JSON parsing natively. Almost all… LSL, as always, being the exception! (vote on JIRA to get this feature!) Since string parsing is painfully slow in LSL, you need to do some clever programming tricks to skip over what’s irrelevant.
Yes, of course this method has limitations (mostly due to the small footprint of LSL scripting: you have to measure your available memory in bytes, not kilobytes or megabytes…), but it sort of works for short sentences. Anyone trying to translate a 5,000-word notecard is far better off going to Google’s translation page and using it from there!
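One such trick is not to parse the JSON at all, but to jump straight to the one field you care about. The sketch below shows the idea in Python (not LSL), under two assumptions that are not stated in this post: that the reply carries the translation in a field named `translatedText`, and that there is no whitespace between that key and its value. The function name `extract_translation` is made up, and only the `\"` and `\\` escapes are handled; `\uXXXX` sequences would need extra work.

```python
# Assumed field name and exact spacing; adjust to whatever the service really sends.
MARKER = '"translatedText":"'


def extract_translation(reply):
    """Pull one string value out of a JSON reply without parsing the rest."""
    start = reply.find(MARKER)
    if start == -1:
        return ""  # field not present: nothing to show
    i = start + len(MARKER)
    out = []
    while i < len(reply):
        ch = reply[i]
        if ch == "\\" and i + 1 < len(reply):
            # Keep the escaped character literally (covers \" and \\ only).
            out.append(reply[i + 1])
            i += 2
            continue
        if ch == '"':
            break  # unescaped quote: end of the value
        out.append(ch)
        i += 1
    return "".join(out)


sample = '{"responseData": {"translatedText":"Hola mundo"}, "responseStatus": 200}'
print(extract_translation(sample))
```

In LSL the same search-and-slice approach would lean on string functions such as llSubStringIndex() and llGetSubString(); everything before and after the marker is simply never looked at, which saves both time and memory.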
Incorporating the above ideas into a quick-and-dirty HUD (I’m not good at designing anything, much less HUDs…) allowed me to launch the first version of the GUUD Universal Translator, boasting over 70 possible languages (that’s 5000 translation pairs, although obviously Google doesn’t support all of them! If you absolutely wish to translate Cherokee into Bengali, you’re not going to be able to do that). Oh, it does have some minor quirks (none of my Utterly Useless Devices are free of bugs!), and there is no provision for you to translate just some of the people instead of everybody, or have conversations in multiple languages simultaneously. Also, I haven’t used the “language detection” facility of Google yet (I need to keep some features for the next version!), so you’ll have to figure out first in what language people are speaking. And, as said, the results go from hilarious to offensive: that’s what you get from cheap, automated translation services 🙂
It’s definitely a first step if you’re addressing an international audience and wish to understand whether your Filipino customer, who only speaks Tagalog, has returned to your shop to buy something else or to demand a refund!
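As a quick sanity check on the “5000 translation pairs” figure: assuming each direction counts as its own pair (Portuguese to English and English to Portuguese being two pairs), n languages give n × (n − 1) pairs. The helper names below are made up for the illustration.

```python
def translation_pairs(languages):
    """Ordered source/target pairs, excluding a language paired with itself."""
    return languages * (languages - 1)


def languages_needed(pairs):
    """Smallest number of languages giving at least this many pairs."""
    n = 2
    while translation_pairs(n) < pairs:
        n += 1
    return n


print(translation_pairs(70))   # 4830
print(languages_needed(5000))  # 72
```

So exactly 70 languages fall a little short of 5000 pairs, and “over 70” (72 or more) gets you past it.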