UTF-8, a transformation format of ISO 10646 (Part 4)


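data. The label should be applied only to text data in which Hangul
syllables are encoded in UTF-8 without taking into account Amendment 5 of
ISO/IEC 10646 (that is, using the code point allocation prior to
Amendment 5). Other UTF-8 data should not use this label, in particular
data that contains no Hangul syllables. It is strongly recommended that no
new data containing Hangul be created without taking Amendment 5 of
ISO/IEC 10646 into account.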
6. Security Considerations
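Implementors of UTF-8 need to consider the security aspects of how they
handle illegal UTF-8 sequences. It is conceivable that in some
circumstances an attacker could exploit an incautious UTF-8 parser by
sending it an octet sequence that is not permitted by the UTF-8 syntax.

A particularly subtle form of this attack targets a parser that performs
security-critical validity checks on the UTF-8 encoded form of its input,
yet interprets certain illegal octet sequences as characters. For example,
a parser might prohibit the NUL character when it is encoded as the
single-octet sequence 00, but allow the illegal two-octet sequence C0 80
and interpret it as a NUL character. Another example is a parser that
prohibits the octet sequence 2F 2E 2E 2F ("/../"), yet permits the illegal
octet sequence 2F C0 AE 2E 2F.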
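
The two examples above can be reproduced in a few lines. The following is a minimal sketch, not part of the memo: `lenient_decode` is a hypothetical, deliberately flawed decoder that handles only one- and two-octet sequences and never rejects overlong forms, while `is_well_formed_utf8` relies on Python's built-in strict UTF-8 codec, which refuses the octets C0 and C1 as lead bytes.

```python
def lenient_decode(data: bytes) -> str:
    """Hypothetical flawed decoder: accepts overlong two-octet forms."""
    out = []
    i = 0
    while i < len(data):
        b = data[i]
        if b < 0x80:
            # Single-octet sequence (US-ASCII range).
            out.append(chr(b))
            i += 1
        elif 0xC0 <= b <= 0xDF and i + 1 < len(data) and data[i + 1] & 0xC0 == 0x80:
            # Two-octet sequence. The flaw: no check that the decoded value
            # is at least 0x80, so C0 80 becomes NUL and C0 AE becomes ".".
            out.append(chr(((b & 0x1F) << 6) | (data[i + 1] & 0x3F)))
            i += 2
        else:
            raise ValueError(f"unsupported octet at offset {i}")
    return "".join(out)


def is_well_formed_utf8(data: bytes) -> bool:
    """Return True only if the strict built-in codec accepts the input."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


if __name__ == "__main__":
    attack = bytes.fromhex("2FC0AE2E2F")
    # A byte-level filter for "/../" does not see anything suspicious...
    print(b"/../" in attack)
    # ...but the flawed decoder turns the input into "/../" afterwards.
    print(repr(lenient_decode(attack)))
    # Rejecting ill-formed input before any check closes the gap.
    print(is_well_formed_utf8(attack))
```

One reasonable design, under these assumptions, is to validate the octets first and to run security checks only on the decoded characters, so that the check and the interpretation can never disagree.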
Acknowledgments
The following people participated in the drafting and discussion of this
memo:
James E. Agenbroad    Andries Brouwer
Martin J. Dürst       Ned Freed
David Goldsmith       Edwin F. Hart
Kent Karlsson         Markus Kuhn
Michael Kung          Alain LaBonte
John Gardiner Myers   Murray Sargent
Keld Simonsen         Arnold Winkler
References
[CHARSET-REG]  Freed, N., and J. Postel, "IANA Charset Registration
               Procedures", BCP 19, RFC 2278, January 1998.
[FSS_UTF]      X/Open CAE Specification C501 ISBN 1-85912-082-2 28cm.
               22p. pbk. 172g. 4/95, X/Open Company Ltd., "File
               System Safe UCS Transformation Format (FSS_UTF)",
               X/Open Preliminary Specification, Document Number
               P316. Also published in Unicode Technical Report #4.
[ISO-10646]    ISO/IEC 10646-1:1993. International Standard --
               Information technology -- Universal Multiple-Octet
               Coded Character Set (UCS) -- Part 1: Architecture and
               Basic Multilingual Plane. Five amendments and a
               technical corrigendum have been published up to now.
               UTF-8 is described in Annex R, published as Amendment
               2. UTF-16 is described in Annex Q, published as
               Amendment 1. 17 other amendments are currently at
               various stages of standardization.
[MIME]         Freed, N., and N. Borenstein, "Multipurpose Internet
               Mail Extensions (MIME) Part One: Format of Internet
               Message Bodies", RFC 2045. N. Freed, N. Borenstein,
               "Multipurpose Internet Mail Extensions (MIME) Part
               Two: Media Types", RFC 2046. K. Moore, "MIME
               (Multipurpose Internet Mail Extensions) Part Three:
               Message Header Extensions for Non-ASCII Text", RFC
               2047. N. Freed, J. Klensin, J. Postel, "Multipurpose
               Internet Mail Extensions (MIME) Part Four:
               Registration Procedures", RFC 2048. N. Freed, N.
               Borenstein, "Multipurpose Internet Mail Extensions
               (MIME) Part Five: Conformance Criteria and Examples",
               RFC 2049. All November 1996.
[RFC2152]      Goldsmith, D., and M. Davis, "UTF-7: A Mail-safe
               Transformation Format of Unicode", RFC 2152, Taligent
               inc., May 1997. (Obsoletes RFC 1642)
[UNICODE]      The Unicode Consortium, "The Unicode Standard --
               Version 2.0", Addison-Wesley, 1996.
[US-ASCII]     Coded Character Set -- 7-bit American Standard Code for
               Information Interchange, ANSI X3.4-1986.
Author's Address
Francois Yergeau
Alis Technologies
100, boul. Alexis-Nihon
Suite 600
Montreal QC H4M 2P2
Canada
Phone: +1 (514) 747-2547
Fax: +1 (514) 747-2561
EMail: fyergeau@alis.com
Full Copyright Statement
Copyright (C) The Internet Society (1998). All Rights Reserved.
This document and translations of it may be copied and furnished to
others, and derivative works that comment on or otherwise explain it
or assist in its implementation may be prepared, copied, published
and distributed, in whole or in part, without restriction of any
kind, provided that the above copyright notice and this paragraph are
included on all such copies and derivative works. However, this
