You can also use Wiktionary. The MediaWiki API (Wiktionary runs on MediaWiki) lets you query a list of article titles. In Wiktionary, article titles are (among other things) the words in a dictionary. The only catch is that foreign words are also in the dictionary, so you may sometimes get "wrong" matches. Of course, your user will also need Internet access. You can get help and information about the API at: http://en.wiktionary.org/w/api.php
Here is an example request URL:
http:
This returns the following XML:

    <?xml version="1.0"?>
    <api>
      <query>
        <pages>
          <page ns="0" title="ogd" missing=""/>
          <page ns="0" title="odg" missing=""/>
          <page ns="0" title="gdo" missing=""/>
          <page pageid="24" ns="0" title="dog"/>
          <page pageid="5015" ns="0" title="god"/>
        </pages>
      </query>
    </api>
In C#, you can use System.Xml.XPath to pick out the parts you need (the pages that have a pageid). Those are the "real words."
I wrote an implementation and tested it (using the simple "dog" example above). It returned only "dog" and "god". You should test it more thoroughly.
    // Place this method inside a class; it needs these using directives:
    // System.Collections.Generic, System.IO, System.Linq, System.Net,
    // System.Text, System.Xml.XPath
    public static IEnumerable<string> FilterRealWords(IEnumerable<string> testWords)
    {
        string baseUrl = "http://en.wiktionary.org/w/api.php?action=query&format=xml&titles=";
        // Multiple titles are separated by "|" in the query string.
        string queryUrl = baseUrl + string.Join("|", testWords.ToArray());

        string rawXml;
        using (WebClient client = new WebClient())
        {
            client.Encoding = Encoding.UTF8; // this is very important or the text will be junk
            rawXml = client.DownloadString(queryUrl);
        }

        XPathDocument doc = new XPathDocument(new StringReader(rawXml));
        XPathNavigator nav = doc.CreateNavigator();
        XPathNodeIterator iter = nav.Select("//page");

        List<string> realWords = new List<string>();
        while (iter.MoveNext())
        {
            // If the pageid attribute has a value, the article exists:
            // add its title to the list.
            if (!string.IsNullOrEmpty(iter.Current.GetAttribute("pageid", "")))
            {
                realWords.Add(iter.Current.GetAttribute("title", ""));
            }
        }
        return realWords;
    }
Call it like this:

    IEnumerable<string> input = new string[] { "dog", "god", "ogd", "odg", "gdo" };
    IEnumerable<string> output = FilterRealWords(input);
I tried using LINQ to XML, but I am not familiar with it, so it was a pain, and I abandoned it.
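For anyone who does want the LINQ to XML route, here is a minimal sketch of just the parsing step. It is untested against the live API and assumes the response has the same shape as the sample XML above. The class name `WiktionaryXml` and the method name `ExtractRealWords` are made up for this illustration. It takes the already downloaded XML string, so the download code stays as it is.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class WiktionaryXml
{
    // Hypothetical helper (not part of the implementation above):
    // returns the title of every <page> element whose pageid
    // attribute is present and non-empty, in document order.
    public static IEnumerable<string> ExtractRealWords(string rawXml)
    {
        XDocument doc = XDocument.Parse(rawXml);
        return doc.Descendants("page")
                  // Casting a missing XAttribute to string yields null,
                  // so pages without a pageid are filtered out here.
                  .Where(page => !string.IsNullOrEmpty((string)page.Attribute("pageid")))
                  .Select(page => (string)page.Attribute("title"))
                  .ToList();
    }
}
```

With the sample response above as input, this should yield "dog" and "god", the same as the XPath version. To use it, replace everything after `DownloadString` in `FilterRealWords` with `return WiktionaryXml.ExtractRealWords(rawXml);`.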