I have a text file which contains a list of about 150000 words. I loaded the the words into a dictionary and word lookup works fine for me. Now i want to search the dictionary to see if the dictionary contains a word starting from a particular alphabet.
I am using
foreach(KeyValuePair pair in dict ){
}
for this purpose. This seems to work fine for smaller word lists. But it doesnt work for 150000 wordlist. Could anyone help please.
void WordAvailable (char sLetter) {
notAvail = true;
int count = 0;
do {
foreach (KeyValuePair<string,string> pair in dict) {
randWord = pair.Value;
count++;
if (randWord [0] == sLetter && !ListTest.usedWordsList.Contains (randWord)) {
notAvail = false;
startingLetter = char.ToString (sLetter);
break;
}
if (count >= dict.Count) {
ChooseRandomAlpha ();
sLetter = alpha;
count = 0;
}
}
} while(notAvail);
}
Now I want to search the dictionary to see if the dictionary contains a word starting from a particular alphabet.
That sounds like you want a SortedSet
rather than a Dictionary
. You can use GetViewBetween
to find all the entries in the set that lie between two bounds. The lower bound would probably be "the string you're starting with" and for the upper bound, you could either work out "the last possible string starting with those characters" or use an exclusive upper bound by manually ignoring anything that doesn't start with your prefix, and "incrementing" the last character of your prefix. So for example, to find all words beginning with "tim" you can use GetViewBetween("tim", "tin")
and ignore tin
if it's in the dictionary.
Note that ordering can be an "interesting" exercise when it comes to multiple cultures etc. If this is just an academic exercise and you'll only have ASCII characters, you might want to lower case each word as you add it to the set, and then use an ordinal comparison. If you do need a culture-sensitive ordering, you could make that case-insensitive easily... but working out the upper bound for the prefix will be trickier.
Example of using GetViewBetween
:
using System;
using System.Collections.Generic;
class Test
{
static void Main()
{
var words = new SortedSet<string>(StringComparer.Ordinal)
{
"cat", "dog", "banana", "laptop", "mug",
"coffee", "microphone", "water", "stairs", "phone"
};
foreach (var word in words.GetViewBetween("d", "n"))
{
Console.WriteLine(word);
}
}
}
Output:
dog
laptop
microphone
mug
An alternative would be to build your own trie implementation (or find an existing one) but I don't know of one in the BCL.
See more on this question at Stackoverflow