PHP截取中文字符串的問題
添加時間:2014-7-28 1:34:31
添加:
思海網(wǎng)絡
以下代碼試用于GB2312編碼,截取中文字符串是PHP中一個頭疼的問題,解決方法是根據(jù)值是否大于等于128來判斷是否是雙字節(jié)字符,以避免出現(xiàn)亂碼的情況。但中英文混合、特殊符號等問題總是存在,現(xiàn)在寫一個比較全面的,僅供參考:
程序說明:
1. len 參數(shù)以中文字符為標準,1len等于2個英文字符,為了形式上好看些
2. 如果將magic參數(shù)設為false,則中文和英文同等看待,取絕對的字符數(shù)
3. 特別適用于用htmlspecialchars()進行過編碼的字符串
4. 能正確處理GB2312中實體字符模式()
程序代碼:
function FSubstr($title,$start,$len="",$magic=true)
{
/**
* powered by Smartpig
* mailto:d.einstein@263.net
*/
$length = 0;
if($len == "") $len = strlen($title);
//判斷起始為不正確位置
if($start > 0)
{
$cnum = 0;
for($i=0;$i<$start;$i++)
{
if(ord(substr($title,$i,1)) >= 128) $cnum ++;
}
if($cnum%2 != 0) $start--;
unset($cnum);
}
if(strlen($title)<=$len) return substr($title,$start,$len);
$alen = 0;
$blen = 0;
$realnum = 0;
for($i=$start;$i<strlen($title);$i++)
{
$ctype = 0;
$cstep = 0;
$cur = substr($title,$i,1);
if($cur == "&")
{
if(substr($title,$i,4) == "<")
{
$cstep = 4;
$length += 4;
$i += 3;
$realnum ++;
if($magic)
{
$alen ++;
}
}
else if(substr($title,$i,4) == ">")
{
$cstep = 4;
$length += 4;
$i += 3;
$realnum ++;
if($magic)
{
$alen ++;
}
}
else if(substr($title,$i,5) == "&")
{
$cstep = 5;
$length += 5;
$i += 4;
$realnum ++;
if($magic)
{
$alen ++;
}
}
else if(substr($title,$i,6) == """)
{
$cstep = 6;
$length += 6;
$i += 5;
$realnum ++;
if($magic)
{
$alen ++;
}
}
else if(substr($title,$i,6) == "'")
{
$cstep = 6;
$length += 6;
$i += 5;
$realnum ++;
if($magic)
{
$alen ++;
}
}
else if(preg_match("/&#(\d+);/i",substr($title,$i,8),$match))
{
$cstep = strlen($match[0]);
$length += strlen($match[0]);
$i += strlen($match[0])-1;
$realnum ++;
if($magic)
{
$blen ++;
$ctype = 1;
}
}
}else{
if(ord($cur)>=128)
{
$cstep = 2;
$length += 2;
$i += 1;
$realnum ++;
if($magic)
{
$blen ++;
$ctype = 1;
}
}else{
$cstep = 1;
$length +=1;
$realnum ++;
if($magic)
{
if(ord($cur) >= 65 && ord($cur) <= 90)
{
$blen++;
}else{
$alen++;
}
}
}
}
if($magic)
{
if(($blen*2+$alen) == ($len*2)) break;
if(($blen*2+$alen) == ($len*2+1))
{
if($ctype == 1)
{
$length -= $cstep;
break;
}else{
break;
}
}
}else{
if($realnum == $len) break;
}
}
unset($cur);
unset($alen);
unset($blen);
unset($realnum);
unset($ctype);
unset($cstep);
return substr($title,$start,$length);
}
關鍵字:PHP、字符串、編碼
程序說明:
1. len 參數(shù)以中文字符為標準,1len等于2個英文字符,為了形式上好看些
2. 如果將magic參數(shù)設為false,則中文和英文同等看待,取絕對的字符數(shù)
3. 特別適用于用htmlspecialchars()進行過編碼的字符串
4. 能正確處理GB2312中實體字符模式()
程序代碼:
function FSubstr($title,$start,$len="",$magic=true)
{
/**
* powered by Smartpig
* mailto:d.einstein@263.net
*/
$length = 0;
if($len == "") $len = strlen($title);
//判斷起始為不正確位置
if($start > 0)
{
$cnum = 0;
for($i=0;$i<$start;$i++)
{
if(ord(substr($title,$i,1)) >= 128) $cnum ++;
}
if($cnum%2 != 0) $start--;
unset($cnum);
}
if(strlen($title)<=$len) return substr($title,$start,$len);
$alen = 0;
$blen = 0;
$realnum = 0;
for($i=$start;$i<strlen($title);$i++)
{
$ctype = 0;
$cstep = 0;
$cur = substr($title,$i,1);
if($cur == "&")
{
if(substr($title,$i,4) == "<")
{
$cstep = 4;
$length += 4;
$i += 3;
$realnum ++;
if($magic)
{
$alen ++;
}
}
else if(substr($title,$i,4) == ">")
{
$cstep = 4;
$length += 4;
$i += 3;
$realnum ++;
if($magic)
{
$alen ++;
}
}
else if(substr($title,$i,5) == "&")
{
$cstep = 5;
$length += 5;
$i += 4;
$realnum ++;
if($magic)
{
$alen ++;
}
}
else if(substr($title,$i,6) == """)
{
$cstep = 6;
$length += 6;
$i += 5;
$realnum ++;
if($magic)
{
$alen ++;
}
}
else if(substr($title,$i,6) == "'")
{
$cstep = 6;
$length += 6;
$i += 5;
$realnum ++;
if($magic)
{
$alen ++;
}
}
else if(preg_match("/&#(\d+);/i",substr($title,$i,8),$match))
{
$cstep = strlen($match[0]);
$length += strlen($match[0]);
$i += strlen($match[0])-1;
$realnum ++;
if($magic)
{
$blen ++;
$ctype = 1;
}
}
}else{
if(ord($cur)>=128)
{
$cstep = 2;
$length += 2;
$i += 1;
$realnum ++;
if($magic)
{
$blen ++;
$ctype = 1;
}
}else{
$cstep = 1;
$length +=1;
$realnum ++;
if($magic)
{
if(ord($cur) >= 65 && ord($cur) <= 90)
{
$blen++;
}else{
$alen++;
}
}
}
}
if($magic)
{
if(($blen*2+$alen) == ($len*2)) break;
if(($blen*2+$alen) == ($len*2+1))
{
if($ctype == 1)
{
$length -= $cstep;
break;
}else{
break;
}
}
}else{
if($realnum == $len) break;
}
}
unset($cur);
unset($alen);
unset($blen);
unset($realnum);
unset($ctype);
unset($cstep);
return substr($title,$start,$length);
}
關鍵字:PHP、字符串、編碼
新文章:
- CentOS7下圖形配置網(wǎng)絡的方法
- CentOS 7如何添加刪除用戶
- 如何解決centos7雙系統(tǒng)后丟失windows啟動項
- CentOS單網(wǎng)卡如何批量添加不同IP段
- CentOS下iconv命令的介紹
- Centos7 SSH密鑰登陸及密碼密鑰雙重驗證詳解
- CentOS 7.1添加刪除用戶的方法
- CentOS查找/掃描局域網(wǎng)打印機IP講解
- CentOS7使用hostapd實現(xiàn)無AP模式的詳解
- su命令不能切換root的解決方法
- 解決VMware下CentOS7網(wǎng)絡重啟出錯
- 解決Centos7雙系統(tǒng)后丟失windows啟動項
- CentOS下如何避免文件覆蓋
- CentOS7和CentOS6系統(tǒng)有什么不同呢
- Centos 6.6默認iptable規(guī)則詳解