Today I write about the Bing/Microsoft Translator and give you a nice C++ example of how to use the API (using the HTTP interface), since there is none to be found elsewhere. You will not find a single C++ example in any of the MS documentation (nor a current, complete one anywhere else on the web). All of the examples are C# and PHP only. That’s understandable, since it’s easier in those languages thanks to their native components, but I needed and wanted C/C++.
English “Hello World!” to German translation:
One of the main features of a string extraction data mining tool I’m working on is automatic language translation, ideally done in C++ since the project is built on it.
I knew this was possible and probably fairly trivial having looked into the Google Translate API in the past.
I wanted free if possible but unfortunately the Google service went paid last year (2011); apparently for reasons discussed here “Why Larry Page killed Google Translate..”.
It should be possible to just hijack the free Google site, http://translate.google.com, coding whatever it takes to do the web transactions (HTML pages et al.). But that is probably not so trivial, and there might be unapparent restrictions (like a limit on words/characters per day). At any rate it’s hardly ideal, especially for a C++ implementation.
There are some free open source alternatives (like Apertium), but I haven’t taken the time to evaluate them. It would be nice to see some accuracy/fitness comparisons between the free and paid offerings.
In comes the challenger Bing/Microsoft Translator!
Now, Bing Translate has a nice free offering of 2M characters per month, with reasonable paid offerings beyond that. The paid option apparently also lifts a limit of 400K characters per hour.
The simple class layout is partially based on the MS C# samples, grouped in a “BingTranslate” namespace.
I chose the WinINet API for the first attempt since it’s such a nice high-level layer. It’s not ideal for every case, though, since it’s relatively bulky and tied into Internet Explorer. It could be replaced with WinHTTP or some other HTTP library.
Pure sockets are not out of the question either, although the authentication part is a little more complicated, requiring SSL libs for the HTTPS connection et al.
My authentication section uses simple string functions (mainly strstr() and strchr()) to parse out the “temporal access token” in lieu of real JSON parsing. It works, but a real production application could use something like json-cpp, plus proper HTTP header result code parsing.
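To illustrate the strstr()/strchr() approach, here is a minimal sketch, not the code from the project. It assumes the authentication response is a JSON object carrying the token as a string field named "access_token"; the helper name extractAccessToken is made up for this example. It does not handle escaped quotes inside the value, which is one more reason to prefer a real JSON parser in production.

```cpp
#include <cstring>
#include <string>

// Hypothetical helper: pulls the value of the "access_token" field out of a
// JSON response body using only strstr()/strchr(). Returns an empty string
// if the field is missing or malformed.
// Assumption: the value is a plain JSON string without escaped quotes.
std::string extractAccessToken(const char *json)
{
    static const char key[] = "\"access_token\"";

    // Locate the key itself.
    const char *p = std::strstr(json, key);
    if (!p) return std::string();

    // Skip past the key and find the ':' separator.
    p = std::strchr(p + sizeof(key) - 1, ':');
    if (!p) return std::string();

    // The value starts after the next opening quote...
    const char *start = std::strchr(p, '"');
    if (!start) return std::string();
    ++start;

    // ...and runs up to the following closing quote.
    const char *end = std::strchr(start, '"');
    if (!end) return std::string();

    return std::string(start, end);
}
```

An empty result should be treated as an authentication failure rather than passed along to the translate call.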
The translation operations need the same kind of result checking, to catch possible problem results like “AppId is over the quota” or “Quota Exceeded”, as stated in the Microsoft Translator API FAQ and reported in forum posts.
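A rough sketch of what that checking could look like follows. The enum and the function classifyResponse are hypothetical names, and the sketch assumes (I have not verified this) that the quota messages arrive with a non-200 HTTP status, so the marker strings are only searched for in failed responses. That also avoids flagging a perfectly good translation that happens to contain the words “Quota Exceeded”.

```cpp
#include <string>

// Hypothetical result categories for a translate request.
enum class TranslateStatus { Ok, QuotaExceeded, HttpError };

// Classifies a response from its HTTP status code and body text.
// Assumption: quota failures come back with a non-200 status and one of
// the two message fragments quoted in the FAQ/forum posts.
TranslateStatus classifyResponse(int httpStatus, const std::string &body)
{
    if (httpStatus == 200)
        return TranslateStatus::Ok;

    static const char *const quotaMarkers[] = {
        "AppId is over the quota",
        "Quota Exceeded"
    };
    for (const char *marker : quotaMarkers) {
        if (body.find(marker) != std::string::npos)
            return TranslateStatus::QuotaExceeded;
    }
    return TranslateStatus::HttpError;
}
```

A caller could then back off and retry later on QuotaExceeded, and simply report the status code on HttpError.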
Only the one key Microsoft.Translator.Translate service method is currently implemented. It might be useful to also have Microsoft.Translator.TranslateArray for batch translations (all sharing the same “from” and “to” languages). That would just require adding some XML encoding and decoding.
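The XML side of that is small. As a sketch under assumptions (these helpers are not part of the project, and the names xmlEscape/xmlUnescape are made up), escaping text before placing it in a request body and unescaping it on the way back could look like this. Only the five predefined XML entities are handled; numeric character references such as &#228; are left untouched.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Replaces the five XML special characters with their predefined entities.
std::string xmlEscape(const std::string &in)
{
    std::string out;
    out.reserve(in.size());
    for (char c : in) {
        switch (c) {
        case '&':  out += "&amp;";  break;
        case '<':  out += "&lt;";   break;
        case '>':  out += "&gt;";   break;
        case '"':  out += "&quot;"; break;
        case '\'': out += "&apos;"; break;
        default:   out += c;        break;
        }
    }
    return out;
}

// Reverses xmlEscape(). Unknown entities are copied through unchanged.
std::string xmlUnescape(const std::string &in)
{
    static const struct { const char *entity; char ch; } table[] = {
        {"&amp;", '&'}, {"&lt;", '<'}, {"&gt;", '>'},
        {"&quot;", '"'}, {"&apos;", '\''}
    };
    std::string out;
    out.reserve(in.size());
    for (std::size_t i = 0; i < in.size(); ) {
        bool matched = false;
        if (in[i] == '&') {
            for (const auto &e : table) {
                const std::size_t n = std::strlen(e.entity);
                if (in.compare(i, n, e.entity) == 0) {
                    out += e.ch;
                    i += n;
                    matched = true;
                    break;
                }
            }
        }
        if (!matched)
            out += in[i++];
    }
    return out;
}
```

A real implementation would still need to build and parse the surrounding request and response elements, which this sketch leaves out.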
Note that, per the spec, input and output text is UTF-8 encoded only.
With the example Main.cpp it so happens that “Hello World!” translates into German as “Hallo Welt!”, which is plain ASCII. Any other characters (non-ASCII glyphs, Asian characters, etc.) might look like garbage on the console. It’s possible to show UTF-8 in a Windows console, but one has to go through several machinations to set it up (like setting the console font, using undocumented functions for Windows XP, calling SetConsoleOutputCP(CP_UTF8), etc.).
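One practical consequence: if the UTF-8 text is sent as part of a URL query string (an assumption here, since the exact request layout is not shown in this post), every byte outside the RFC 3986 “unreserved” set has to be percent-encoded. A minimal sketch, with urlEncodeUtf8 being a made-up helper name:

```cpp
#include <string>

// True for the RFC 3986 "unreserved" characters, which may appear in a
// URL as-is. Explicit ranges are used to stay independent of the locale.
static bool isUnreserved(unsigned char c)
{
    return (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z') ||
           (c >= '0' && c <= '9') ||
           c == '-' || c == '_' || c == '.' || c == '~';
}

// Percent-encodes a UTF-8 string byte by byte for use as a query value.
// Multi-byte UTF-8 sequences simply become several %XX groups.
std::string urlEncodeUtf8(const std::string &utf8)
{
    static const char hex[] = "0123456789ABCDEF";
    std::string out;
    out.reserve(utf8.size() * 3);
    for (unsigned char c : utf8) {
        if (isUnreserved(c)) {
            out += static_cast<char>(c);
        } else {
            out += '%';
            out += hex[c >> 4];
            out += hex[c & 0x0F];
        }
    }
    return out;
}
```

With this, “Hello World!” becomes Hello%20World%21, and the two-byte UTF-8 sequence for “ü” becomes %C3%BC.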
You must have a Bing Translator account ID and secret/key, which you can sign up for here. You will probably need to sign up for a Hotmail email address in the process too.
Visual Studio project: >>Download<<