B.1 Supported
Aspell 0.60 should be able to support the following languages:
Code  Language Name              Script                   Dictionary Available  Gettext Translation
aa    Afar                       Latin                    -                     -
af    Afrikaans                  Latin                    0.50                  -
ak    Akan                       Latin                    Planned               -
am    Amharic                    Ethiopic                 0.60                  -
ar    Arabic                     Arabic                   Planned               -
as    Assamese                   Bengali                  -                     -
ast   Asturian / Bable           Latin                    Maybe                 -
av    Avar                       Cyrillic                 -                     -
ay    Aymara                     Latin                    -                     -
az    Azerbaijani                Latin                    0.60                  -
ba    Bashkir                    Cyrillic                 -                     -
ban   Balinese                   Latin                    Maybe                 -
be    Belarusian                 Cyrillic                 0.50                  Incomplete
bem   Bemba                      Latin                    Maybe                 -
bg    Bulgarian                  Cyrillic                 0.50                  -
bh    Bihari                     Devanagari               -                     -
bi    Bislama                    Latin                    Maybe                 -
bm    Bambara                    Latin                    -                     -
bn    Bengali                    Bengali                  0.60                  -
bo    Tibetan                    Tibetan                  -                     -
br    Breton                     Latin                    0.50                  -
bs    Bosnian                    Latin                    Maybe                 -
ca    Catalan / Valencian        Latin                    0.50                  -
ce    Chechen                    Cyrillic                 -                     -
ceb   Cebuano                    Latin                    Maybe                 -
ch    Chamorro                   Latin                    Maybe                 -
chk   Chuukese                   Latin                    Maybe                 -
co    Corsican                   Latin                    Maybe                 -
cs    Czech                      Latin                    0.50                  Yes
csb   Kashubian                  Latin                    0.60                  -
cv    Chuvash                    Cyrillic                 -                     -
cy    Welsh                      Latin                    0.50                  -
da    Danish                     Latin                    0.50                  Incomplete
de    German                     Latin                    0.50                  Yes
ee    Ewe                        Latin                    -                     -
el    Greek                      Greek                    0.50                  -
en    English                    Latin                    0.50                  Yes
eo    Esperanto                  Latin                    0.50                  -
es    Spanish                    Latin                    0.50                  Incomplete
et    Estonian                   Latin                    0.60                  -
eu    Basque                     Latin                    Maybe                 -
fa    Persian                    Arabic                   0.60                  -
ff    Fulah                      Latin                    -                     -
fi    Finnish                    Latin                    0.60                  -
fj    Fijian                     Latin                    Maybe                 -
fo    Faroese                    Latin                    0.50                  -
fr    French                     Latin                    0.50                  Yes
fur   Friulian                   Latin                    Maybe                 -
fy    Frisian                    Latin                    Maybe                 -
ga    Irish                      Latin                    0.50                  Yes
gd    Scottish Gaelic            Latin                    0.50                  -
gl    Gallegan                   Latin                    0.50                  -
gn    Guarani                    Latin                    Maybe                 -
gu    Gujarati                   Gujarati                 Maybe                 -
gv    Manx Gaelic                Latin                    0.50                  -
ha    Hausa                      Latin                    Maybe                 -
haw   Hawaiian                   Latin                    Maybe                 -
he    Hebrew                     Hebrew                   0.60                  -
hi    Hindi                      Devanagari               0.60                  -
hil   Hiligaynon                 Latin                    0.50                  -
ho    Hiri Motu                  Latin                    -                     -
hr    Croatian                   Latin                    0.50                  -
hsb   Upper Sorbian              Latin                    0.60                  -
ht    Haitian Creole             Latin                    Maybe                 -
hu    Hungarian                  Latin                    0.60                  -
hy    Armenian                   Armenian                 -                     -
hz    Herero                     Latin                    -                     -
ia    Interlingua (IALA)         Latin                    0.50                  -
iba   Iban                       Latin                    Maybe                 -
id    Indonesian                 Latin                    0.50                  -
ig    Igbo                       Latin                    Maybe                 -
ii    Sichuan Yi                 Yi                       -                     -
ilo   Iloko                      Latin                    Maybe                 -
io    Ido                        Latin                    -                     -
is    Icelandic                  Latin                    0.50                  -
it    Italian                    Latin                    0.50                  -
jv    Javanese                   Javanese, Latin          Maybe                 -
ka    Georgian                   Georgian                 -                     -
kac   Kachin                     Latin                    Maybe                 -
kg    Kongo                      Latin                    Maybe                 -
kha   Khasi                      Bengali, Latin, Latin    Maybe                 -
ki    Kikuyu / Gikuyu            Latin                    -                     -
kj    Kwanyama                   Latin                    -                     -
kk    Kazakh                     Cyrillic                 -                     -
kl    Kalaallisut / Greenlandic  Latin                    Maybe                 -
kn    Kannada                    Kannada                  -                     -
kok   Konkani                    Latin                    Maybe                 -
kr    Kanuri                     Latin                    -                     -
ks    Kashmiri                   Arabic                   -                     -
ku    Kurdish                    Arabic, Cyrillic, Latin  0.50                  -
kv    Komi                       Cyrillic                 -                     -
kw    Cornish                    Latin                    Maybe                 -
ky    Kirghiz                    Cyrillic                 -                     -
la    Latin                      Latin                    0.60                  -
lb    Luxembourgish              Latin                    Maybe                 -
lg    Ganda                      Latin                    Maybe                 -
li    Limburgian                 Latin                    Maybe                 -
ln    Lingala                    Latin                    Maybe                 -
loz   Lozi                       Latin                    Maybe                 -
lt    Lithuanian                 Latin                    0.60                  -
lu    Luba-Katanga               Latin                    -                     -
luo   Luo (Kenya and Tanzania)   Latin                    Maybe                 -
lv    Latvian                    Latin                    0.60                  -
mg    Malagasy                   Latin                    0.50                  -
mh    Marshallese                Latin                    Maybe                 -
mi    Maori                      Latin                    0.50                  -
min   Minangkabau                Latin                    Maybe                 -
mk    Macedonian                 Cyrillic                 0.50                  -
ml    Malayalam                  Malayalam                Maybe                 -
mn    Mongolian                  Cyrillic                 0.60                  Yes
mo    Moldavian                  Cyrillic                 -                     -
mr    Marathi                    Devanagari               0.60                  -
ms    Malay                      Latin                    0.50                  -
mt    Maltese                    Latin                    0.50                  -
my    Burmese                    Myanmar                  -                     -
nb    Norwegian Bokmal           Latin                    0.50                  -
nd    North Ndebele              Latin                    Maybe                 -
nds   Low Saxon                  Latin                    0.60                  -
ne    Nepali                     Devanagari               Maybe                 -
ng    Ndonga                     Latin                    Maybe                 -
niu   Niuean                     Latin                    Maybe                 -
nl    Dutch                      Latin                    0.50                  Yes
nn    Norwegian Nynorsk          Latin                    0.50                  -
nr    South Ndebele              Latin                    -                     -
nso   Northern Sotho             Latin                    Maybe                 -
nv    Navajo                     Latin                    -                     -
ny    Nyanja                     Latin                    0.50                  -
oc    Occitan / Provencal        Latin                    Maybe                 -
om    Oromo                      Latin                    -                     -
or    Oriya                      Oriya                    0.60                  -
os    Ossetic                    Cyrillic                 -                     -
pa    Punjabi                    Gurmukhi                 0.60                  -
pam   Pampanga                   Latin                    Maybe                 -
pap   Papiamento                 Latin                    Maybe                 -
pl    Polish                     Latin                    0.50                  -
ps    Pushto                     Arabic                   -                     -
pt    Portuguese                 Latin                    0.50                  Incomplete
qu    Quechua                    Latin                    0.60                  -
rar   Rarotongan                 Latin                    Maybe                 -
rn    Rundi                      Latin                    Maybe                 -
ro    Romanian                   Latin                    0.50                  Incomplete
ru    Russian                    Cyrillic                 0.50                  Yes
rw    Kinyarwanda                Latin                    0.50                  -
sc    Sardinian                  Latin                    0.50                  -
sd    Sindhi                     Arabic                   -                     -
se    Northern Sami              Latin                    Maybe                 -
sg    Sango                      Latin                    Maybe                 -
si    Sinhalese                  Sinhala                  -                     -
sk    Slovak                     Latin                    0.50                  -
sl    Slovenian                  Latin                    0.50                  -
sm    Samoan                     Latin                    Maybe                 -
sn    Shona                      Latin                    Maybe                 -
so    Somali                     Latin                    Maybe                 -
sq    Albanian                   Latin                    Maybe                 -
sr    Serbian                    Cyrillic, Latin          Maybe                 Incomplete
ss    Swati                      Latin                    Maybe                 -
st    Southern Sotho             Latin                    Maybe                 -
su    Sundanese                  Latin                    Maybe                 -
sv    Swedish                    Latin                    0.50                  -
sw    Swahili                    Latin                    0.50                  -
ta    Tamil                      Tamil                    0.60                  -
te    Telugu                     Telugu                   Maybe                 -
tet   Tetum                      Latin                    0.50                  -
tg    Tajik                      Cyrillic                 Maybe                 Incomplete
ti    Tigrinya                   Ethiopic                 Maybe                 -
tk    Turkmen                    Latin                    Planned               -
tkl   Tokelau                    Latin                    Maybe                 -
tl    Tagalog                    Latin                    0.50                  -
tn    Tswana                     Latin                    0.50                  -
to    Tonga                      Latin                    Maybe                 -
tpi   Tok Pisin                  Latin                    Maybe                 -
tr    Turkish                    Latin                    0.50                  -
ts    Tsonga                     Latin                    Maybe                 -
tt    Tatar                      Cyrillic                 -                     -
tw    Twi                        Latin                    -                     -
ty    Tahitian                   Latin                    Maybe                 -
ug    Uighur                     Arabic                   -                     -
uk    Ukrainian                  Cyrillic                 0.50                  Yes
ur    Urdu                       Arabic                   Maybe                 -
uz    Uzbek                      Cyrillic                 0.60                  -
ve    Venda                      Latin                    Maybe                 -
vi    Vietnamese                 Latin                    0.60                  -
wa    Walloon                    Latin                    0.50                  Incomplete
wo    Wolof                      Latin                    Maybe                 -
xh    Xhosa                      Latin                    Maybe                 -
yi    Yiddish                    Hebrew                   0.60                  -
yo    Yoruba                     Latin                    Maybe                 -
za    Zhuang                     Latin                    -                     -
zu    Zulu                       Latin                    0.50                  -
Dictionaries marked as 0.50 are available for Aspell 0.50.  Ones
marked as 0.60 are available for Aspell 0.60 only.  Ones marked as
Planned should eventually be available.  Ones marked as Maybe might be
available in the future.  See Planned Dictionaries for more info.
B.1.1 Notes on Latin Languages
Any word that can be written using one of the Latin ISO-8859 character
sets (ISO-8859-1,2,3,4,9,10,13,14,15,16) can be written, in decomposed
form, using the ASCII characters, the 23 additional letters:
U+00C6 LATIN CAPITAL LETTER AE
U+00D0 LATIN CAPITAL LETTER ETH
U+00D8 LATIN CAPITAL LETTER O WITH STROKE
U+00DE LATIN CAPITAL LETTER THORN
U+00FE LATIN SMALL LETTER THORN
U+00DF LATIN SMALL LETTER SHARP S
U+00E6 LATIN SMALL LETTER AE
U+00F0 LATIN SMALL LETTER ETH
U+00F8 LATIN SMALL LETTER O WITH STROKE
U+0110 LATIN CAPITAL LETTER D WITH STROKE
U+0111 LATIN SMALL LETTER D WITH STROKE
U+0126 LATIN CAPITAL LETTER H WITH STROKE
U+0127 LATIN SMALL LETTER H WITH STROKE
U+0131 LATIN SMALL LETTER DOTLESS I
U+0138 LATIN SMALL LETTER KRA
U+0141 LATIN CAPITAL LETTER L WITH STROKE
U+0142 LATIN SMALL LETTER L WITH STROKE
U+014A LATIN CAPITAL LETTER ENG
U+014B LATIN SMALL LETTER ENG
U+0152 LATIN CAPITAL LIGATURE OE
U+0153 LATIN SMALL LIGATURE OE
U+0166 LATIN CAPITAL LETTER T WITH STROKE
U+0167 LATIN SMALL LETTER T WITH STROKE
and the 14 modifiers:
U+0300 COMBINING GRAVE ACCENT
U+0301 COMBINING ACUTE ACCENT
U+0302 COMBINING CIRCUMFLEX ACCENT
U+0303 COMBINING TILDE
U+0304 COMBINING MACRON
U+0306 COMBINING BREVE
U+0307 COMBINING DOT ABOVE
U+0308 COMBINING DIAERESIS
U+030A COMBINING RING ABOVE
U+030B COMBINING DOUBLE ACUTE ACCENT
U+030C COMBINING CARON
U+0326 COMBINING COMMA BELOW
U+0327 COMBINING CEDILLA
U+0328 COMBINING OGONEK
This gives a total of 37 additional Unicode code points.
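The decomposition described above can be checked mechanically.  The
following Python sketch is illustrative only and is not part of Aspell.
It assumes that the standard library's NFD normalization is an
acceptable stand-in for the decomposition meant here, and the names
EXTRA_LETTERS, MODIFIERS, decompose and fits_latin_set are hypothetical.

```python
import unicodedata

# The 23 additional letters listed above (small thorn is U+00FE).
EXTRA_LETTERS = {chr(cp) for cp in (
    0x00C6, 0x00D0, 0x00D8, 0x00DE, 0x00FE, 0x00DF, 0x00E6, 0x00F0,
    0x00F8, 0x0110, 0x0111, 0x0126, 0x0127, 0x0131, 0x0138, 0x0141,
    0x0142, 0x014A, 0x014B, 0x0152, 0x0153, 0x0166, 0x0167)}

# The 14 combining modifiers listed above.
MODIFIERS = {chr(cp) for cp in (
    0x0300, 0x0301, 0x0302, 0x0303, 0x0304, 0x0306, 0x0307,
    0x0308, 0x030A, 0x030B, 0x030C, 0x0326, 0x0327, 0x0328)}


def decompose(word):
    """Return the canonically decomposed (NFD) form of word."""
    return unicodedata.normalize("NFD", word)


def fits_latin_set(word):
    """True if the decomposed word needs only ASCII plus the 37 code points."""
    return all(ch.isascii() or ch in EXTRA_LETTERS or ch in MODIFIERS
               for ch in decompose(word))
```

For instance, fits_latin_set("Dvo\u0159\u00e1k") is true because the
accented letters decompose into ASCII letters plus a caron and an acute
accent.  A Vietnamese letter such as U+1ED5 fails the check, since its
decomposition contains U+0309 COMBINING HOOK ABOVE, which is not one of
the 14 modifiers.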
All ISO-8859 character sets leave the characters 0x00 - 0x1F and 0x80 -
0x9F unmapped, as they are generally used as control characters.  Of
those, 0x01 - 0x0F, 0x11 - 0x1F and 0x80 - 0x9F may be mapped to
anything in Aspell.  This is a total of 62 characters which can be
remapped in any ISO-8859 character set.  Thus, by remapping 37 of the 62
characters to the previously specified Unicode code points, any modified
ISO-8859 character set can be used for any Latin language covered by
ISO-8859.  Of course, decomposing every single accented character wastes
a lot of space, so only characters that cannot be represented in the
precomposed form should be broken up.  By using this trick it is
possible to store foreign words in the correctly accented form in the
dictionary even if the precomposed character is not in the current
character set.
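The remapping trick can be sketched in Python as follows.  This is a
minimal illustration under stated assumptions, not Aspell's actual
character-set code: the order in which the 37 code points are assigned
to free slots is arbitrary, NFD from the standard library stands in for
the decomposition step, ISO-8859-2 is picked only as an example base
set, and every name in the block is hypothetical.

```python
import unicodedata

# Bytes the text says may be remapped: 0x01-0x0F, 0x11-0x1F, 0x80-0x9F.
FREE_SLOTS = [*range(0x01, 0x10), *range(0x11, 0x20), *range(0x80, 0xA0)]

# The 23 additional letters followed by the 14 modifiers.
EXTRA_CODE_POINTS = [chr(cp) for cp in (
    0x00C6, 0x00D0, 0x00D8, 0x00DE, 0x00FE, 0x00DF, 0x00E6, 0x00F0,
    0x00F8, 0x0110, 0x0111, 0x0126, 0x0127, 0x0131, 0x0138, 0x0141,
    0x0142, 0x014A, 0x014B, 0x0152, 0x0153, 0x0166, 0x0167,
    0x0300, 0x0301, 0x0302, 0x0303, 0x0304, 0x0306, 0x0307,
    0x0308, 0x030A, 0x030B, 0x030C, 0x0326, 0x0327, 0x0328)]

# Assumed assignment: the n-th code point goes into the n-th free slot.
SLOT_OF = dict(zip(EXTRA_CODE_POINTS, FREE_SLOTS))
CODE_POINT_OF = {slot: ch for ch, slot in SLOT_OF.items()}


def encode(word, base="iso8859_2"):
    """Encode word in the modified set, decomposing only when needed."""
    out = bytearray()
    for ch in word:
        try:
            out += ch.encode(base)  # a precomposed form exists in base
        except UnicodeEncodeError:
            for part in unicodedata.normalize("NFD", ch):
                if part in SLOT_OF:
                    out.append(SLOT_OF[part])
                else:
                    # Raises UnicodeEncodeError if part cannot be stored.
                    out += part.encode(base)
    return bytes(out)


def decode(data, base="iso8859_2"):
    """Inverse of encode; the result is recomposed to NFC."""
    text = "".join(CODE_POINT_OF.get(b) or bytes([b]).decode(base)
                   for b in data)
    return unicodedata.normalize("NFC", text)
```

With these definitions a word containing U+00F1 (n with tilde), which
ISO-8859-2 lacks, is stored as the letter n followed by the slot
assigned to U+0303, while letters that ISO-8859-2 already provides in
precomposed form still occupy a single byte.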
Any letter in the Unicode ranges U+0000 - U+0249 and U+1E00 - U+1EFF
(Basic Latin, Latin-1 Supplement, Latin Extended-A, Latin Extended-B,
and Latin Extended Additional) can be represented using around 175
basic letters and 25 modifiers, which together come to fewer than 210
symbols and can thus fit in an Aspell 8-bit character set.  Since these
Unicode ranges cover any possible Latin language, this special
character set can be used to represent any word written using the Latin
script if so desired.
B.1.2 Syllabic
Syllabic languages use a separate symbol for each syllable of the
language.  Even though most of them have more than 210 distinct
symbols, Aspell can still support them by breaking them up.
B.1.2.1 The Ethiopic Syllabary
Even though the Ethiopic script has more than 210 distinct characters,
Aspell can still handle it.  The idea is to split each character into
two parts based on its consonant and vowel parts.  This encoding of the
syllabary is far more useful to Aspell than if it were stored in UTF-8
or UTF-16.  In fact, the existing suggestion strategy of Aspell will
work well with this encoding without any additional modifications.
However, additional improvements may be possible by taking advantage of
the consonant-vowel structure of this encoding.
In fact, the split consonant-vowel representation may prove to be so
useful that it may be beneficial to encode other syllabaries in this
fashion, even if they have fewer than 210 symbols.
The code to break up a syllabary into its consonant and vowel parts is
part of the Unicode normalization process.
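As a rough illustration of the consonant-vowel split, the Python sketch
below relies on the layout of the Unicode Ethiopic block, where the
syllables from U+1200 to U+135A are arranged in rows of eight code
points per consonant series.  It is an assumption-laden sketch and not
Aspell's normalization code: treating the row as the consonant and the
column as the vowel is a simplification (some columns and rows hold
labialized forms, and some code points in the range are unassigned),
and all names are hypothetical.

```python
ETHIOPIC_FIRST = 0x1200  # ETHIOPIC SYLLABLE HA
ETHIOPIC_LAST = 0x135A   # ETHIOPIC SYLLABLE FYA
ROW = 8                  # code points per consonant series

# Number of consonant rows covered by the syllable range.
N_CONSONANTS = (ETHIOPIC_LAST - ETHIOPIC_FIRST) // ROW + 1


def split_syllable(ch):
    """Return (consonant index, vowel index) for one Ethiopic syllable."""
    cp = ord(ch)
    if not ETHIOPIC_FIRST <= cp <= ETHIOPIC_LAST:
        raise ValueError(f"not an Ethiopic syllable: U+{cp:04X}")
    return divmod(cp - ETHIOPIC_FIRST, ROW)


def join_syllable(consonant, vowel):
    """Inverse of split_syllable."""
    return chr(ETHIOPIC_FIRST + consonant * ROW + vowel)


def encode(word):
    """Encode each syllable as two small symbols: consonant, then vowel."""
    out = []
    for ch in word:
        consonant, vowel = split_syllable(ch)
        out += [consonant, N_CONSONANTS + vowel]
    return out


def decode(symbols):
    """Rebuild the word from alternating consonant and vowel symbols."""
    pairs = zip(symbols[0::2], symbols[1::2])
    return "".join(join_syllable(c, v - N_CONSONANTS) for c, v in pairs)
```

Under this scheme the whole syllable range needs only N_CONSONANTS + ROW
distinct symbols, which is well under the 210 available, at the cost of
two symbols per written character.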
B.1.2.2 The Yi Syllabary
A very large syllabary with 819 distinct symbols.  However, like
Ethiopic, it should be possible to support this script by breaking it
up.
B.1.2.3 The Ojibwe Syllabary
With only 120 distinct symbols, Aspell can actually support this one as
is.  However, as previously mentioned, it may be beneficial to break it
up into the consonant-vowel representation anyway.
