
Chinese Text Problems in WAP

Date: [2005-11-23 15:26:36]     Author: unknown

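Four ways to fix Chinese display problems
WML is an application of XML, and XML's default encoding is UTF-8, the 8-bit encoding form of Unicode. Unless a document declares otherwise, an XML parser assumes it is encoded in UTF-8. This causes a problem, because almost all existing Chinese content, including what is stored in databases, is encoded in GB2312. GB2312 and Unicode assign completely different values to the same characters, and that mismatch is the main cause of garbled text.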
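The mismatch described above can be made concrete with a small Python sketch (Python is used here only for illustration; it is not part of the original article). It encodes one character both ways and then reads the GB2312 bytes as if they were UTF-8, which is what a parser does when no encoding is declared.

```python
# The same character has unrelated byte sequences in the two encodings.
ch = "饱"
gb_bytes = ch.encode("gb2312")    # two bytes: b1 a5
utf8_bytes = ch.encode("utf-8")   # three bytes: e9 a5 b1

# A UTF-8 parser that receives GB2312 bytes cannot decode them;
# errors="replace" substitutes U+FFFD for every undecodable byte.
misread = gb_bytes.decode("utf-8", errors="replace")

print(gb_bytes.hex(), utf8_bytes.hex(), repr(misread))
```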
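Text in any encoding, including Japanese, Korean, Greek, and Arabic, can be converted to Unicode easily, and with Unicode a single document can mix all of these languages. Although few existing applications use Unicode, the Windows NT kernel handles characters in Unicode. Unicode has two attractive traits: it is independent and it is inclusive.
There are four common ways to solve the problem.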

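Using UTF-8 directly
This method needs little explanation. If the content can easily be converted to UTF-8, nothing more is required. The drawbacks are that all existing content has to be converted and that UTF-8 is incompatible with most existing applications.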
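As a rough sketch of what "converting the content" means in practice, the Python function below re-encodes GB2312 bytes as UTF-8. The function name `gb2312_to_utf8` is hypothetical, and real content would also need its XML encoding declaration updated.

```python
def gb2312_to_utf8(data: bytes) -> bytes:
    """Re-encode GB2312 bytes as UTF-8.

    Raises UnicodeDecodeError if the input is not valid GB2312.
    """
    return data.decode("gb2312").encode("utf-8")


legacy = "中文 WAP".encode("gb2312")   # stand-in for existing GB2312 content
converted = gb2312_to_utf8(legacy)
print(converted.decode("utf-8"))
```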

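Using GB2312 directly
This method is also simple: state GB2312 in the encoding declaration, for example <?xml version="1.0" encoding="GB2312"?>. However, not every terminal supports GB2312, so garbled text can still appear.
The author used this approach to handle Chinese during development. The company's gateway at the time was from Motorola, and testing was done with the Motorola L2000www, Nokia 7110, and Siemens 3568i. The results were all very satisfactory.
If such a page is tested directly in the Nokia WAP Toolkit, however, the toolkit re-encodes the Chinese text automatically, but it appears to process the text half a character (one byte) at a time, so the output comes out garbled.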
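A minimal Python sketch of this approach follows. It builds a WML deck whose XML declaration names GB2312 and encodes the bytes to match. The helper name `build_gb2312_deck` and the choice to also send a charset in the Content-Type value are assumptions for illustration, not something the article prescribes.

```python
WML_TEMPLATE = (
    '<?xml version="1.0" encoding="GB2312"?>\n'
    '<!DOCTYPE wml PUBLIC "-//WAPFORUM//DTD WML 1.1//EN" '
    '"http://www.wapforum.org/DTD/wml_1.1.xml">\n'
    '<wml>\n<card id="main" title="Chinese">\n<p>{body}</p>\n</card>\n</wml>\n'
)


def build_gb2312_deck(body):
    """Return (content_type, payload) with the payload encoded as declared."""
    payload = WML_TEMPLATE.format(body=body).encode("gb2312")
    return "text/vnd.wap.wml; charset=GB2312", payload


content_type, payload = build_gb2312_deck("饱")
print(content_type, len(payload))
```

The declaration and the actual byte encoding must agree; declaring GB2312 while sending bytes in another encoding reproduces the garbling.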

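Using character conversion
The idea is to represent characters from a larger character set using only ASCII characters. Take the Greek lowercase alpha: its Unicode code is 945, or 3B1 in hexadecimal, so writing “&#945;” or “&#x3B1;” displays a lowercase “α”. All you need is the Unicode code of each Chinese character, written in the form “&#xXXXX;”. Because the result is plain ASCII, any platform can process it, and HTML supports it as well. The cost is a longer file.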
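The conversion can be sketched in Python as follows. The function names are hypothetical; the decimal variant relies on Python's built-in `xmlcharrefreplace` error handler, and the hexadecimal variant is written out by hand.

```python
def to_hex_refs(text: str) -> str:
    """Replace every non-ASCII character with a hexadecimal reference (&#xXXXX;)."""
    return "".join(c if ord(c) < 128 else f"&#x{ord(c):X};" for c in text)


def to_decimal_refs(text: str) -> str:
    """Replace every non-ASCII character with a decimal reference (&#NNN;)."""
    return text.encode("ascii", "xmlcharrefreplace").decode("ascii")


print(to_hex_refs("alpha: α"))
print(to_decimal_refs("alpha: α"))
```

The size cost is easy to see: a Chinese character that takes 2 bytes in GB2312 becomes 8 ASCII bytes in the form “&#xXXXX;”.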

Configuring the application server to output UTF-8 directly (IIS only)
The IIS Response object has a Charset property. According to Microsoft, this is all that is needed:
<% Response.Charset("UTF-8") %>
This method works only for ASP programming on IIS under Windows NT. Other platforms and web servers offer no equally simple option.
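For comparison only, here is a hedged Python sketch of the same idea outside IIS: a minimal WSGI application that encodes its output as UTF-8 and says so in the Content-Type header. It is an analogue, not an equivalent of the ASP property; the names `app` and `fake_start_response` are illustrative.

```python
def app(environ, start_response):
    """Minimal WSGI app that serves a WML deck encoded as UTF-8."""
    body = (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        '<wml><card id="main" title="Chinese"><p>饱</p></card></wml>'
    )
    payload = body.encode("utf-8")
    start_response("200 OK", [
        ("Content-Type", "text/vnd.wap.wml; charset=UTF-8"),
        ("Content-Length", str(len(payload))),
    ])
    return [payload]


# Exercise the app without starting a server.
captured = {}


def fake_start_response(status, headers):
    captured["status"] = status
    captured["headers"] = dict(headers)


chunks = app({}, fake_start_response)
print(captured["headers"]["Content-Type"])
```

To serve it for real, the standard library's `wsgiref.simple_server.make_server("", 8000, app).serve_forever()` would be enough for a local test.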

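Converting GB2312 to Unicode
The principle behind the Unicode notation is also simple: write “&#x”, then the character's Unicode code as hexadecimal text, then a closing “;”. Note that this value is not the raw value held in memory (that is, the GB2312 code). For example, the character “饱” is 0xb1a5 in memory but 0x9971 in the Unicode table. Consider the short example below, a WML page that mixes English and Chinese, with the Chinese written as Unicode references.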
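The difference between the in-memory GB2312 value and the Unicode value can be checked with a short Python sketch. The helper name `describe` is hypothetical.

```python
def describe(ch: str) -> str:
    """Show the GB2312 byte value next to the Unicode code for one character."""
    gb_value = int.from_bytes(ch.encode("gb2312"), "big")
    return (
        f"{ch}: GB2312 0x{gb_value:04x}, "
        f"Unicode 0x{ord(ch):04x}, reference &#x{ord(ch):x};"
    )


print(describe("饱"))
```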

<?xml version="1.0"?>
<!DOCTYPE wml PUBLIC "-//WAPFORUM//DTD WML 1.1//EN" "http://www.wapforum.org/DTD/wml_1.1.xml">
<wml>
<card id="main" title="Chinese">

<p>T &#x9971;</p>

</card>
</wml>

The result in the Nokia WAP Toolkit is shown in Figure 2-35:

Figure 2-35: A Chinese WML page that uses Unicode

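As you can see, “&#x9971;” is displayed as “饱”. If you write “&#xb1a5;” instead, or use the GB2312 encoding directly (which amounts to typing the Chinese text as is), the result may look like Figure 2-36. You will surely ask yourself how that happened. The reason is that the character at that Unicode position is an entirely different, strange-looking character.

Figure 2-36: A Chinese WML page that does not use Unicode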
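To see which character each reference actually selects, the following Python sketch resolves both references with the standard library's `html.unescape` and looks up their Unicode names. It only illustrates the lookup; it says nothing about how a particular WAP browser renders the result.

```python
import html
import unicodedata

right = html.unescape("&#x9971;")   # the Unicode code of the intended character
wrong = html.unescape("&#xb1a5;")   # the GB2312 value misused as a Unicode code

print(right, unicodedata.name(right))
print(wrong, unicodedata.name(wrong))
```

The second reference lands in the Hangul Syllables block, which is why an unrelated character shows up instead of “饱”.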

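English characters may be written as Unicode references or left as they are. So how do you obtain the Unicode codes?
By hand, you can open Character Map under Accessories, choose the SimSun font and the “CJK Unified Ideograph” subset, and look up the Unicode codes of many Chinese characters. Copy any of them into a WML file and it works every time. It is just not very convenient.
If the server-side program is written in Visual Basic script (for example, ASP), things are much simpler: the AscW function takes care of the encoding. The encoding functions below work this way.
If the server-side program is written in Visual C/C++, you need the Windows API function MultiByteToWideChar, which performs the conversion to Unicode.
Other server-side scripting languages should have similar functions. If not, you have to build your own conversion tool that turns the Chinese text in WML into Unicode. The MSDN samples include a small conversion tool, Uconvert. It does not seem very easy to use, but its source code is worth studying. Many small utilities of this kind can also be found online.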

A conversion program for ASP
<%=replace(Server.HTMLEncode(request.form("text")),chr(13)+chr(10)," ")%>
Here is your Unicode output:
<textarea name="text" cols="40" rows="10">
<%=Server.HTMLEncode(unicode(request.form("text")))%>
</textarea>

--------------------------------------------------------------------------------

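First Basic program
' Convert every character of str to a hexadecimal character reference (&#xXXXX;).
function unicode(str)

dim i, c

unicode = ""
for i = 1 to Len(str)
c = Mid(str, i, 1)
unicode = unicode & "&#x" & Hex(AscW(c)) & ";"
next

end function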

--------------------------------------------------------------------------------

Second Basic program
' Leave ASCII characters unchanged and convert all others to decimal character references (&#NNNNN;).
function unicode(str)

dim i,j,c

unicode=""
for i=1 to len(str)
c=mid(str,i,1)
j=ascw(c)
' A negative result is mapped back into the range 0..65535.
if j<0 then
j=j+65536
end if
if j<128 then
unicode=unicode&c
else
unicode=unicode&"&#"&j&";"
end if
next

end function
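The logic of the second program can be sketched in Python for readers who want to try it outside ASP. The `j<0` correction in the listing suggests that AscW can return a signed 16-bit value, so the sketch mimics that with `struct`. The names `ascw_signed` and `unicode_refs` are hypothetical, and only characters in the Basic Multilingual Plane (up to U+FFFF) are assumed.

```python
import struct


def ascw_signed(ch: str) -> int:
    """Mimic a signed 16-bit result: codes at or above 0x8000 come back negative.

    Assumption: ch is a single character no higher than U+FFFF.
    """
    return struct.unpack("<h", struct.pack("<H", ord(ch)))[0]


def unicode_refs(text: str) -> str:
    """Keep ASCII as is; write everything else as a decimal reference."""
    out = []
    for ch in text:
        j = ascw_signed(ch)
        if j < 0:
            j += 65536          # same correction as in the Basic listing
        out.append(ch if j < 128 else f"&#{j};")
    return "".join(out)


print(ascw_signed("饱"), unicode_refs("a饱"))
```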

Complete conversion program
<%
' Leave ASCII characters unchanged and convert all others to decimal character references (&#NNNNN;).
function unicode(str)

dim i,j,c

unicode=""
for i=1 to len(str)
c=mid(str,i,1)
j=ascw(c)
' A negative result is mapped back into the range 0..65535.
if j<0 then
j=j+65536
end if
if j<128 then
unicode=unicode&c
else
unicode=unicode&"&#"&j&";"
end if
next

end function

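' Break str into lines no wider than linelen*2 units; a double-byte character counts as 2 units, a printable ASCII character as 1.
function cutline(str,linelen)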

dim i,j,c,k

cutline=""
j=0
for i=1 to len(str)
c=mid(str,i,1)
if asc(c)<0 or asc(c)>127 then
k=2
else
if asc(c)<32 then
k=0
if asc(c)=13 then
j=0
cutline=cutline+" "+c
c=""
end if
else
k=1
end if
end if
j=j+k
if j>linelen*2 then
cutline=cutline+" "+vbCrlf+c
j=k
else
cutline=cutline+c
end if
next

end function

' Escape characters that have special meaning in WML; "$" is doubled because WML uses it to introduce variables.
function convertsymbol(sStr)

dim i,c

convertsymbol=""
for i=1 to len(sStr)
c=mid(sStr,i,1)
if c=">" then
convertsymbol=convertsymbol & "&gt;"
elseif c="<" then
convertsymbol=convertsymbol & "&lt;"
elseif c="'" then
convertsymbol=convertsymbol & "&apos;"
elseif c="""" then
convertsymbol=convertsymbol & "&quot;"
elseif c="&" then
convertsymbol=convertsymbol & "&amp;"
elseif c="$" then
convertsymbol=convertsymbol & "$$"
else
convertsymbol=convertsymbol & c
end if
next

end function

' Keep only ASCII digits and letters from sStr and return them in lowercase.
function convertstring(sStr)

dim i,strtemp,asctemp,c

strtemp=""
for i=1 to len(sStr)
c=mid(sStr,i,1)
asctemp=ascw(c)
if (asctemp>47 and asctemp<58) or (asctemp>64 and asctemp<91) or (asctemp>96 and asctemp<123) then
strtemp=strtemp & c
end if
next

convertstring=Lcase(strtemp)

end function
%>
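Two of the helper functions in the listing above, convertsymbol and convertstring, can be sketched in Python as follows. The sketch assumes the replacement strings are the standard WML entities and that “$” is doubled to escape WML variable syntax; the names `convert_symbol` and `convert_string` are illustrative.

```python
# Characters with special meaning in WML and their escaped forms.
WML_ESCAPES = {
    ">": "&gt;",
    "<": "&lt;",
    "'": "&apos;",
    '"': "&quot;",
    "&": "&amp;",
    "$": "$$",
}


def convert_symbol(text: str) -> str:
    """Escape markup characters and the WML variable marker, one character at a time."""
    return "".join(WML_ESCAPES.get(c, c) for c in text)


def convert_string(text: str) -> str:
    """Keep only ASCII letters and digits, lowercased."""
    return "".join(c for c in text if c.isascii() and c.isalnum()).lower()


print(convert_symbol("a<b & $c"))
print(convert_string("Ab-9 饱z"))
```

Escaping character by character avoids the double-escaping that can happen when several whole-string replacements are chained in the wrong order.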
