iconv_strlen

(PHP 5)

iconv_strlen — Returns the character count of a string

Description

int iconv_strlen ( string $str [, string $charset] )

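Unlike strlen(), iconv_strlen() counts the characters that appear in the given byte sequence str according to the specified character set. The result is therefore not necessarily equal to the number of bytes in the string.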

Parameters

str: The string.
charset: If the charset parameter is omitted, str is assumed to be encoded in iconv.internal_encoding.

Return Values

Returns the character count of str.
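
The difference between byte count and character count can be illustrated with a minimal sketch. The sample string is an assumption chosen for illustration: the word "naïve" written as raw UTF-8 bytes, where "ï" occupies two bytes. Passing a different charset for the same bytes changes the result, because each byte is then treated as one character.

```php
<?php
// "naïve" as UTF-8 bytes: "ï" is the two-byte sequence 0xC3 0xAF.
$str = "na\xC3\xAFve";

// strlen() counts bytes.
echo strlen($str), "\n";

// iconv_strlen() counts characters in the charset it is told to assume.
echo iconv_strlen($str, 'UTF-8'), "\n";

// The same bytes read as ISO-8859-1 are one character per byte.
echo iconv_strlen($str, 'ISO-8859-1'), "\n";
?>
```

Passing the charset explicitly, as here, avoids depending on the configured iconv.internal_encoding.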

See Also

strlen()
mb_strlen()

User Contributed Notes

hfuecks @ nospam org
25-Feb-2006 09:58


If iconv_strlen is passed a UTF-8 string containing badly formed sequences, it will return FALSE. This is in contrast to mb_strlen or the behaviour of utf8_decode, which strip out any bad sequences:



<?php
# UTF-8 string containing bad sequence: \xe9
$str = "Iñtërnâtiôn\xe9àlizætiøn";

print "mb_strlen: ".mb_strlen($str,'UTF-8')."\n";
print "strlen/utf8_decode: ".strlen(utf8_decode($str))."\n";
print "iconv_strlen: ".iconv_strlen($str,'UTF-8')."\n";
?>



Displays:

mb_strlen: 20
strlen/utf8_decode: 20
iconv_strlen:

(PHP 5.0.5)



As such, iconv_strlen is "stricter" than mb_strlen, which may mean you need to check for invalid sequences first. A quick way to check is to exploit the behaviour of the PCRE extension (see the notes on pattern modifiers):



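<?php
# With the /u modifier PCRE validates the whole subject as UTF-8 first;
# the empty pattern then matches any valid string, including "".
if (preg_match('//u', $str) !== 1) {
    die("string contains invalid UTF-8");
}
?>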



A slower but stricter check (regex) can be found at: http://www.w3.org/International/questions/qa-forms-utf-8
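
An alternative to validating beforehand is to test the return value of iconv_strlen itself. The following is only a sketch: strict_strlen is a hypothetical helper name, not part of PHP, and returning null for invalid input is an assumption made for illustration.

```php
<?php
// Hypothetical helper (not part of PHP): returns the character count,
// or null when $str contains a sequence that is illegal in $charset.
function strict_strlen($str, $charset = 'UTF-8')
{
    // @ suppresses the notice iconv raises on an illegal sequence.
    $len = @iconv_strlen($str, $charset);

    // Compare with === so a legitimate length of 0 is not taken for failure.
    return ($len === false) ? null : $len;
}

var_dump(strict_strlen("na\xC3\xAFve")); // well-formed UTF-8
var_dump(strict_strlen("na\xE9ve"));     // lone 0xE9 is not well-formed UTF-8
var_dump(strict_strlen(""));             // the empty string is valid
?>
```

The strict comparison matters because a loose test such as `if (!$len)` would also treat the empty string as an error.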



The same applies to iconv_substr, iconv_strpos and iconv_strrpos.


Last updated: Thu, 31 May 2007