#####################
# UTF-8 Text Viewer #
#####################

This is a UTF-8 text viewer.
It displays text files in UTF-8 format (http://ja.wikipedia.org/wiki/UTF-8) on the P/ECE.

============
Installation
============

1. Transfer "utf8view.pex" to the P/ECE.
2. Transfer UTF-8 format txt files to the P/ECE.

=====
Usage
=====

---------------------
File selection screen
---------------------

Up/Down/Left/Right buttons    Select a file.
A or B button                 Go to the text display screen.

-------------------
Text display screen
-------------------

Up/Down/Left/Right buttons    Scroll the screen.
A or B button                 Return to the file selection screen.

-----------
At any time
-----------

SELECT button                 Exit the program.

=========
Afterword
=========

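These days, data files and configuration files are often saved in UTF-8 format.
The internal character code of the P/ECE is Shift JIS (http://ja.wikipedia.org/wiki/Shift_JIS), so it cannot handle UTF-8 text directly.
To handle UTF-8 text, the character codes must be converted between UTF-8 and Shift JIS whenever a file is read or written.
The UTF-8/Shift JIS character code conversion cannot be done by calculation alone; a conversion table is required. (http://unicode.org/Public/MAPPINGS/OBSOLETE/EASTASIA/JIS/SHIFTJIS.TXT)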
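
As an illustration of why a table is needed, here is a minimal sketch in Python (not the language of the P/ECE program). Turning UTF-8 bytes into a Unicode code point is pure bit arithmetic, but getting from the code point to a Shift JIS code requires a lookup. The function name `decode_utf8_char` and the one-entry `SAMPLE_TABLE` are hypothetical and exist only for this example; the sketch does not reject overlong encodings.

```python
def decode_utf8_char(data, pos=0):
    """Return (code_point, length) for the UTF-8 sequence starting at pos."""
    b0 = data[pos]
    if b0 < 0x80:                      # 1-byte sequence (ASCII)
        return b0, 1
    if b0 >> 5 == 0b110:               # 2-byte sequence
        length, cp = 2, b0 & 0x1F
    elif b0 >> 4 == 0b1110:            # 3-byte sequence
        length, cp = 3, b0 & 0x0F
    elif b0 >> 3 == 0b11110:           # 4-byte sequence
        length, cp = 4, b0 & 0x07
    else:
        raise ValueError("invalid lead byte")
    if pos + length > len(data):
        raise ValueError("truncated sequence")
    for b in data[pos + 1:pos + length]:
        if b >> 6 != 0b10:
            raise ValueError("invalid continuation byte")
        cp = (cp << 6) | (b & 0x3F)    # append 6 payload bits
    return cp, length


# Hypothetical one-entry table: the code point alone does not reveal
# the Shift JIS code, so it has to be looked up.
SAMPLE_TABLE = {0x3042: 0x82A0}        # HIRAGANA LETTER A

code_point, length = decode_utf8_char("あ".encode("utf-8"))
sjis_code = SAMPLE_TABLE[code_point]
```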

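Implemented naively, the UTF-8/Shift JIS conversion table comes to about 28 kilobytes.
The P/ECE has little memory, so giving up 28 kilobytes just for character code conversion is a waste.
So I decided to try to see how much space could be saved while still performing the UTF-8/Shift JIS character code conversion.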
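
A rough way to see where a figure of this order can come from is sketched below. It rests on two assumptions that are not stated in this document: each table entry stores a 2-byte Unicode value plus a 2-byte Shift JIS code, and Python's built-in `shift_jis` codec is used as a stand-in for SHIFTJIS.TXT, so the exact count may differ from the author's table.

```python
def count_shift_jis_entries():
    """Count BMP code points that Python's shift_jis codec can encode."""
    count = 0
    for cp in range(0x10000):
        try:
            chr(cp).encode("shift_jis")
        except UnicodeEncodeError:
            continue                   # not representable in Shift JIS
        count += 1
    return count


BYTES_PER_ENTRY = 4  # assumption: 2 bytes Unicode + 2 bytes Shift JIS
entries = count_shift_jis_entries()
table_bytes = entries * BYTES_PER_ENTRY
print(entries, "entries,", table_bytes, "bytes")
```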

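The approach I tried disregards speed entirely.
Specifically, the UTF-8/Shift JIS conversion table is kept compressed, and for every single character converted, the table is decompressed from the beginning while being searched linearly.
I tried the following three compression methods:
- "CHC (Canonical Huffman Code)" (http://codezine.jp/article/detail/376)
- "RC (Range Coder)" (http://codezine.jp/article/detail/443)
- "BPE (Byte Pair Encoding)" (http://ja.wikipedia.org/wiki/BPE)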
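
The idea of searching a table while decompressing it from the start can be sketched as follows. This is a Python illustration under stated assumptions, not the author's P/ECE code. It uses a simple byte pair encoding and assumes 4-byte entries (2-byte Unicode value, 2-byte Shift JIS code); the names `build_table`, `bpe_compress`, `bpe_expand_iter` and `lookup` are hypothetical.

```python
import itertools
import struct
from collections import Counter


def build_table(chars):
    """Pack (Unicode, Shift JIS) pairs as big-endian 4-byte entries."""
    out = bytearray()
    for ch in chars:
        sjis = int.from_bytes(ch.encode("shift_jis"), "big")
        out += struct.pack(">HH", ord(ch), sjis)
    return bytes(out)


def bpe_compress(data):
    """Repeatedly replace the most frequent byte pair with an unused byte."""
    buf = list(data)
    literals = set(buf)                # byte values present in the input
    rules = {}                         # substitute byte -> (left, right)
    while True:
        free = [b for b in range(256) if b not in literals and b not in rules]
        pairs = Counter(zip(buf, buf[1:]))
        if not free or not pairs:
            break
        pair, count = pairs.most_common(1)[0]
        if count < 2:
            break                      # nothing left worth replacing
        sym = free[0]
        rules[sym] = pair
        out, i = [], 0
        while i < len(buf):
            if i + 1 < len(buf) and (buf[i], buf[i + 1]) == pair:
                out.append(sym)
                i += 2
            else:
                out.append(buf[i])
                i += 1
        buf = out
    return bytes(buf), rules


def bpe_expand_iter(packed, rules):
    """Yield the original bytes one at a time, expanding rules via a stack."""
    stack = []
    for b in packed:
        stack.append(b)
        while stack:
            s = stack.pop()
            if s in rules:
                left, right = rules[s]
                stack.extend((right, left))   # left is popped first
            else:
                yield s


def lookup(packed, rules, code_point):
    """Expand the table from its start and scan entries linearly."""
    stream = bpe_expand_iter(packed, rules)
    while True:
        entry = bytes(itertools.islice(stream, 4))
        if len(entry) < 4:
            return None                # end of table, no match
        ucs, sjis = struct.unpack(">HH", entry)
        if ucs == code_point:
            return sjis


table = build_table("ABCあいう漢字")
packed, rules = bpe_compress(table)
```

Calling `lookup(packed, rules, ord("あ"))` walks the expanded stream entry by entry until the code point matches, so the full table never has to exist in memory at once. The price is that every lookup restarts the expansion from the beginning.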

The results were as follows:

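Compression method                                    CHC     RC    BPE

(1) Compressed SJIS/UTF8 conversion table [bytes]   12580  12846  14832
(2) Decompression routine [bytes]                     284    164    144
(3) Code conversion routine (common) [bytes]          784    784    784
    Total of (1)+(2)+(3) [bytes]                    13648  13794  15760
    sample1.txt UTF8/SJIS conversion time [s]        26.7   62.4   14.6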


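CHC gives the best compression ratio.
In theory RC should compress better than CHC, but RC tends to have a larger header, so it seems the ranking can end up reversed.
BPE is the fastest.
BPE is about twice as fast as CHC, and CHC is about twice as fast as RC, which makes for an easy-to-grasp speed ratio.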
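
The totals and speed ratios can be rechecked from the measured figures with a few lines of Python. The numbers are copied from the results above; the variable names are made up for this example.

```python
# (compressed table, decompression routine, conversion routine) in bytes
sizes = {
    "CHC": (12580, 284, 784),
    "RC": (12846, 164, 784),
    "BPE": (14832, 144, 784),
}
# Conversion time for sample1.txt in seconds
seconds = {"CHC": 26.7, "RC": 62.4, "BPE": 14.6}

totals = {name: sum(parts) for name, parts in sizes.items()}
chc_vs_bpe = seconds["CHC"] / seconds["BPE"]   # how much slower CHC is than BPE
rc_vs_chc = seconds["RC"] / seconds["CHC"]     # how much slower RC is than CHC

for name in sizes:
    print(f"{name}: {totals[name]} bytes, {seconds[name]} s")
print(f"CHC/BPE time ratio: {chc_vs_bpe:.2f}")
print(f"RC/CHC time ratio: {rc_vs_chc:.2f}")
```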

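Weighing compression ratio against speed, I think CHC offers the best balance, so this executable was built with CHC.
Compared with implementing the conversion table naively, this saves about half the memory.
As for speed... it is not at all practical for large text, but it should be fine for something like reading a small configuration file... (^^;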
Well, this being the P/ECE, I have decided to settle for that.

======================
Sun Mar 08 2015
Naoyuki Sawa
nsawa@north.hokkai.net
======================
