
🉑 Portable UTF-8

Description

It is written in PHP (PHP 7+) and can work without "mbstring", "iconv" or any other extra encoding php-extension on your server.

The benefit of Portable UTF-8 is that it is easy to use and easy to bundle. The library also auto-detects your server environment and uses the installed php-extensions when they are available, so you get the best possible performance.

As a fallback we will use Symfony Polyfills, if needed. (https://github.com/symfony/polyfill)
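
The auto-detection idea can be illustrated with a minimal sketch. It is not the library's implementation: `countUtf8Chars()` is a hypothetical helper that only shows how code can prefer an installed extension and still fall back to plain PHP. The fallback assumes the input is valid UTF-8.

```php
<?php
declare(strict_types=1);

/**
 * Count the characters of a UTF-8 string with the fastest available backend.
 * Hypothetical helper for illustration; not part of the library's API.
 */
function countUtf8Chars(string $str): int
{
    if (extension_loaded('mbstring')) {
        return mb_strlen($str, 'UTF-8');
    }

    if (extension_loaded('iconv')) {
        $length = iconv_strlen($str, 'UTF-8');
        if ($length !== false) {
            return $length;
        }
    }

    // Pure-PHP fallback: every character contains exactly one byte that is
    // not a continuation byte (0x80-0xBF), so counting those bytes counts
    // the characters. This assumes the input is valid UTF-8.
    return (int) preg_match_all('/[^\x80-\xBF]/', $str);
}

// "fòôbàř", written with escapes so the file encoding does not matter.
echo countUtf8Chars("f\u{F2}\u{F4}b\u{E0}\u{159}"), "\n"; // prints 6
```

All three branches return the same count for valid input; only the speed differs.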

The project is based on ...

Demo

Here you can test some basic functions from this library and you can compare some results with the native php function results.

Index

Alternative

If you prefer a more object-oriented way to edit strings, take a look at voku/Stringy. It is a fork of "danielstjules/Stringy" that uses the "Portable UTF-8" class and adds some extra methods.

// Standard library
strtoupper('fòôbàř');       // 'FòôBàř'
strlen('fòôbàř');           // 10 (bytes, not characters)

// mbstring
// WARNING: if you don't use a polyfill like "Portable UTF-8", you need to install the php-extension "mbstring" on your server
mb_strtoupper('fòôbàř');    // 'FÒÔBÀŘ'
mb_strlen('fòôbàř');        // 6

// Portable UTF-8
use voku\helper\UTF8;
UTF8::strtoupper('fòôbàř');    // 'FÒÔBÀŘ'
UTF8::strlen('fòôbàř');        // 6

// voku/Stringy
use Stringy\Stringy as S;
$stringy = S::create('fòôbàř');
$stringy->toUpperCase();    // 'FÒÔBÀŘ'
$stringy->length();         // 6

Install "Portable UTF-8" via "composer require"

composer require voku/portable-utf8

If your project does not need some of the Symfony polyfills, use the "replace" section of your composer.json. This removes any overhead from these polyfills, as they are then no longer part of your project, e.g.:

{
  "replace": {
    "symfony/polyfill-php72": "1.99",
    "symfony/polyfill-iconv": "1.99",
    "symfony/polyfill-intl-grapheme": "1.99",
    "symfony/polyfill-intl-normalizer": "1.99",
    "symfony/polyfill-mbstring": "1.99"
  }
}

Why Portable UTF-8?

PHP 5 and earlier versions have no native Unicode support. To bridge the gap, there exist several extensions like "mbstring", "iconv" and "intl".

The problem with "mbstring" and the others is that most of the time you cannot ensure that a specific one is present on a server. If you rely on one of them, your application is no longer portable. This problem becomes even more severe for open source applications that have to run on different servers with different configurations. Considering this, I decided to write a library.

Requirements and Recommendations

  • No extensions are required to run this library. Portable UTF-8 only needs the PCRE library, which is available by default since PHP 4.2.0 and cannot be disabled since PHP 5.3.0. Support for the "/u" modifier in PCRE (UTF-8 handling) is not required.
  • PHP 5.3 is the minimum requirement for Portable UTF-8 versions before 4.0.
  • PHP 7.0 is the minimum requirement since version 4.0 of Portable UTF-8; on older PHP versions composer will install an older release.
  • PHP 8.0 is also supported, and the library adapts to the behaviour of the native functions.
  • To speed up string handling, it is recommended that you have "mbstring" or "iconv" available on your server, as well as the latest version of the PCRE library.
  • Although Portable UTF-8 is easy to use, moving from the native API to Portable UTF-8 may not be straightforward for everyone. It is highly recommended that you do not update your scripts to include Portable UTF-8, or replace or change anything, before you know the reason and the consequences. Most of the time, a native function may be all you need.
  • There is also a shim for "mbstring", "iconv" and "intl", so you can use it on shared webspace as well.
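
To see which of the optional pieces listed above a given server offers, a small report like the following can help. This is a sketch using only native PHP functions; `describeUtf8Environment()` is a hypothetical name and does not reproduce the library's own support checks.

```php
<?php
declare(strict_types=1);

/** Hypothetical helper: report which optional backends this PHP build offers. */
function describeUtf8Environment(): array
{
    return [
        'mbstring'  => extension_loaded('mbstring'),
        'iconv'     => extension_loaded('iconv'),
        'intl'      => extension_loaded('intl'),
        // preg_match() returns 1 only if the pattern compiles and matches,
        // so this is true when PCRE accepts the "/u" (UTF-8) modifier.
        'pcre_utf8' => @preg_match('/^.$/u', "\u{F1}") === 1,
    ];
}

var_export(describeUtf8Environment());
```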

Usage

Example 1: UTF8::cleanup()

  echo UTF8::cleanup('�Düsseldorf�');

  // will output:
  // Düsseldorf

Example 2: UTF8::strlen()

  $string = 'string <strong>with utf-8 chars åèä</strong> - doo-bee doo-bee dooh';

  echo strlen($string) . "\n<br />";
  echo UTF8::strlen($string) . "\n<br />";

  // will output:
  // 70
  // 67

  $string_test1 = strip_tags($string);
  $string_test2 = UTF8::strip_tags($string);

  echo strlen($string_test1) . "\n<br />";
  echo UTF8::strlen($string_test2) . "\n<br />";

  // will output:
  // 53
  // 50

Example 3: UTF8::fix_utf8()


  echo UTF8::fix_utf8('DÃ¼sseldorf');
  echo UTF8::fix_utf8('Ã¤');

  // will output:
  // Düsseldorf
  // ä
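
How such broken input arises can be shown without the library. The sketch below is an illustration only, not the algorithm used by `UTF8::fix_utf8()`: the helper names are hypothetical, and `undoOneLatin1Layer()` blindly reverses one layer, so it must only be applied to strings known to be double-encoded.

```php
<?php
declare(strict_types=1);

/** Re-encode every byte as if it were an ISO-8859-1 character (pure PHP). */
function latin1ToUtf8(string $bytes): string
{
    $out = '';
    $len = strlen($bytes);
    for ($i = 0; $i < $len; $i++) {
        $byte = ord($bytes[$i]);
        $out .= $byte < 0x80
            ? chr($byte)
            : chr(0xC0 | ($byte >> 6)) . chr(0x80 | ($byte & 0x3F));
    }
    return $out;
}

/** Undo one such layer: map each U+0080..U+00FF sequence back to one byte. */
function undoOneLatin1Layer(string $str): string
{
    return preg_replace_callback(
        '/[\xC2\xC3][\x80-\xBF]/',
        static function (array $m): string {
            return chr(((ord($m[0][0]) & 0x1F) << 6) | (ord($m[0][1]) & 0x3F));
        },
        $str
    ) ?? $str;
}

$good   = "D\u{FC}sseldorf";    // "Düsseldorf" as valid UTF-8
$broken = latin1ToUtf8($good);  // double-encoded: renders as "DÃ¼sseldorf"

var_dump($broken === "D\u{C3}\u{BC}sseldorf");   // bool(true)
var_dump(undoOneLatin1Layer($broken) === $good); // bool(true)
```

A real fixer also has to decide whether a string is double-encoded at all; that detection step is deliberately left out here.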

Portable UTF-8 | API

The API from the "UTF8"-Class is written as small static methods that will match the default PHP-API.

Class methods

access add_bom_to_string array_change_key_case between
binary_to_str bom callback char_at
chars checkForSupport chr chr_map
chr_size_list chr_to_decimal chr_to_hex chunk_split
clean cleanup codepoints collapse_whitespace
count_chars css_identifier css_stripe_media_queries ctype_loaded
decimal_to_chr decode_mimeheader emoji_decode emoji_encode
emoji_from_country_code encode encode_mimeheader extract_text
file_get_contents file_has_bom filter filter_input
filter_input_array filter_var filter_var_array finfo_loaded
first_char fits_inside fix_simple_utf8 fix_utf8
getCharDirection getSupportInfo get_file_type get_random_string
get_unique_string has_lowercase has_uppercase has_whitespace
hex_to_chr hex_to_int html_encode html_entity_decode
html_escape html_stripe_empty_tags htmlentities htmlspecialchars
iconv_loaded int_to_hex intlChar_loaded intl_loaded
is_alpha is_alphanumeric is_ascii is_base64
is_binary is_binary_file is_blank is_bom
is_empty is_hexadecimal is_html is_json
is_lowercase is_printable is_punctuation is_serialized
is_uppercase is_url is_utf8 is_utf16
is_utf32 json_decode json_encode json_loaded
lcfirst lcwords ltrim max
max_chr_width mbstring_loaded min normalize_encoding
normalize_line_ending normalize_msword normalize_whitespace ord
parse_str pcre_utf8_support range rawurldecode
regex_replace remove_bom remove_duplicates remove_html
remove_html_breaks remove_invisible_characters remove_left remove_right
replace replace_all replace_diamond_question_mark rtrim
showSupport single_chr_html_encode spaces_to_tabs str_camelize
str_capitalize_name str_contains str_contains_all str_contains_any
str_dasherize str_delimit str_detect_encoding str_ends_with
str_ends_with_any str_ensure_left str_ensure_right str_humanize
str_iends_with str_iends_with_any str_insert str_ireplace
str_ireplace_beginning str_ireplace_ending str_istarts_with str_istarts_with_any
str_isubstr_after_first_separator str_isubstr_after_last_separator str_isubstr_before_first_separator str_isubstr_before_last_separator
str_isubstr_first str_isubstr_last str_last_char str_limit
str_limit_after_word str_longest_common_prefix str_longest_common_substring str_longest_common_suffix
str_matches_pattern str_obfuscate str_offset_exists str_offset_get
str_pad str_pad_both str_pad_left str_pad_right
str_repeat str_replace_beginning str_replace_ending str_replace_first
str_replace_last str_shuffle str_slice str_snakeize
str_sort str_split str_split_array str_split_pattern
str_starts_with str_starts_with_any str_substr_after_first_separator str_substr_after_last_separator
str_substr_before_first_separator str_substr_before_last_separator str_substr_first str_substr_last
str_surround str_titleize str_titleize_for_humans str_to_binary
str_to_lines str_to_words str_truncate str_truncate_safe
str_underscored str_upper_camelize str_word_count strcasecmp
strcmp strcspn string string_has_bom
strip_tags strip_whitespace stripos stripos_in_byte
stristr strlen strlen_in_byte strnatcasecmp
strnatcmp strncasecmp strncmp strpbrk
strpos strpos_in_byte strrchr strrev
strrichr strripos strripos_in_byte strrpos
strrpos_in_byte strspn strstr strstr_in_byte
strtocasefold strtolower strtoupper strtr
strwidth substr substr_compare substr_count
substr_count_in_byte substr_count_simple substr_ileft substr_in_byte
substr_iright substr_left substr_replace substr_right
swapCase symfony_polyfill_used tabs_to_spaces titlecase
to_ascii to_boolean to_filename to_int
to_iso8859 to_string to_utf8 to_utf8_string
trim ucfirst ucwords urldecode
utf8_decode utf8_encode whitespace_table words_limit
wordwrap wordwrap_per_line ws
## access(string $str, int $pos, string $encoding): string

Return the character at the specified position: $str[1] like functionality.

EXAMPLE: UTF8::access('fòô', 1); // 'ò'

**Parameters:**
- `string $str`: A UTF-8 string.
- `int $pos`: The position of character to return.
- `string $encoding [optional]`: Set the charset for e.g. "mb_" function

**Return:**
- `string`: Single multi-byte character.

--------

## add_bom_to_string(string $str): string

Prepends UTF-8 BOM character to the string and returns the whole string.

INFO: If BOM already existed there, the Input string is returned.

EXAMPLE: UTF8::add_bom_to_string('fòô'); // "\xEF\xBB\xBF" . 'fòô'

**Parameters:**
- `string $str`: The input string.

**Return:**
- `string`: The output string that contains BOM.

--------

## array_change_key_case(array $array, int $case, string $encoding): string[]

Changes all keys in an array.

**Parameters:**
- `array $array`: The array to work on
- `int $case [optional]`: Either CASE_UPPER or CASE_LOWER (default)
- `string $encoding [optional]`: Set the charset for e.g. "mb_" function

**Return:**
- `string[]`: An array with its keys lower- or uppercased.

--------

## between(string $str, string $start, string $end, int $offset, string $encoding): string

Returns the substring between $start and $end, if found, or an empty string. An optional offset may be supplied from which to begin the search for the start string.

**Parameters:**
- `string $str`
- `string $start`: Delimiter marking the start of the substring.
- `string $end`: Delimiter marking the end of the substring.
- `int $offset [optional]`: Index from which to begin the search. Default: 0
- `string $encoding [optional]`: Set the charset for e.g. "mb_" function

**Return:**
- `string`

--------

## binary_to_str(string $bin): string

Convert binary into a string.

INFO: opposite to UTF8::str_to_binary()

EXAMPLE: UTF8::binary_to_str('11110000100111111001100010000011'); // '😃'

**Parameters:**
- `string $bin`: 1|0

**Return:**
- `string`

--------

## bom(): string

Returns the UTF-8 Byte Order Mark Character.

INFO: take a look at UTF8::$bom for e.g. UTF-16 and UTF-32 BOM values

EXAMPLE: UTF8::bom(); // "\xEF\xBB\xBF"

**Parameters:** __nothing__

**Return:**
- `string`: UTF-8 Byte Order Mark.
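
The BOM behaviour described for `add_bom_to_string()` and `bom()` comes down to three bytes at the start of a string. A minimal sketch with native functions only follows; the constant and function names are hypothetical and are not the library's own.

```php
<?php
declare(strict_types=1);

// The UTF-8 Byte Order Mark, as given in the bom() example above.
const UTF8_BOM = "\xEF\xBB\xBF";

function hasUtf8Bom(string $str): bool
{
    // Compare only the first three bytes.
    return strncmp($str, UTF8_BOM, 3) === 0;
}

function addUtf8Bom(string $str): string
{
    // If the BOM is already there, return the input unchanged.
    return hasUtf8Bom($str) ? $str : UTF8_BOM . $str;
}

function stripUtf8Bom(string $str): string
{
    return hasUtf8Bom($str) ? (string) substr($str, 3) : $str;
}

var_dump(addUtf8Bom('abc') === "\xEF\xBB\xBFabc"); // bool(true)
```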

--------

## callback(callable $callback, string $str): string[]

**Parameters:**
- `callable $callback`
- `string $str`

**Return:**
- `string[]`

--------

## char_at(string $str, int $index, string $encoding): string

Returns the character at $index, with indexes starting at 0.

**Parameters:**
- `string $str`: The input string.
- `int $index`: Position of the character.
- `string $encoding [optional]`: Default is UTF-8

**Return:**
- `string`: The character at $index.

--------

## chars(string $str): string[]

Returns an array consisting of the characters in the string.

**Parameters:**
- `string $str`: The input string.

**Return:**
- `string[]`: An array of chars.

--------

## checkForSupport(): true|null

This method will auto-detect your server environment for UTF-8 support.

**Parameters:** __nothing__

**Return:**
- `true|null`

--------

## chr(int $code_point, string $encoding): string|null

Generates a UTF-8 encoded character from the given code point.

INFO: opposite to UTF8::ord()

EXAMPLE: UTF8::chr(0x2603); // '☃'

**Parameters:**
- `int $code_point`: The code point for which to generate a character.
- `string $encoding [optional]`: Default is UTF-8

**Return:**
- `string|null`: Multi-byte character, returns null on failure or empty input.

--------

## chr_map(callable $callback, string $str): string[]

Applies callback to all characters of a string.

EXAMPLE: UTF8::chr_map([UTF8::class, 'strtolower'], 'Κόσμε'); // ['κ','ό', 'σ', 'μ', 'ε']

**Parameters:**
- `callable $callback`: The callback function.
- `string $str`: UTF-8 string to run callback on.

**Return:**
- `string[]`: The outcome of the callback, as array.

--------

## chr_size_list(string $str): int[]

Generates an array of byte length of each character of a Unicode string.

- 1 byte => U+0000 - U+007F
- 2 byte => U+0080 - U+07FF
- 3 byte => U+0800 - U+FFFF
- 4 byte => U+10000 - U+10FFFF

EXAMPLE: UTF8::chr_size_list('中文空白-test'); // [3, 3, 3, 3, 1, 1, 1, 1, 1]

**Parameters:**
- `string $str`: The original unicode string.

**Return:**
- `int[]`: An array of byte lengths of each character.
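
The byte ranges listed for `chr_size_list()` follow directly from the lead byte of each UTF-8 sequence. The sketch below shows that relationship in plain PHP; `utf8ByteLengths()` is a hypothetical helper, not the library's implementation, and it assumes valid UTF-8 input.

```php
<?php
declare(strict_types=1);

/**
 * Byte length of each character, derived from the lead byte alone.
 * Hypothetical helper; assumes $str is valid UTF-8.
 */
function utf8ByteLengths(string $str): array
{
    $sizes = [];
    $len = strlen($str);
    for ($i = 0; $i < $len; $i += $size) {
        $lead = ord($str[$i]);
        if ($lead < 0x80) {
            $size = 1; // 0xxxxxxx: U+0000 - U+007F
        } elseif ($lead < 0xE0) {
            $size = 2; // 110xxxxx: U+0080 - U+07FF
        } elseif ($lead < 0xF0) {
            $size = 3; // 1110xxxx: U+0800 - U+FFFF
        } else {
            $size = 4; // 11110xxx: U+10000 - U+10FFFF
        }
        $sizes[] = $size;
    }
    return $sizes;
}

// '中文空白-test', written with escapes so the file encoding does not matter.
print_r(utf8ByteLengths("\u{4E2D}\u{6587}\u{7A7A}\u{767D}-test"));
```

The call returns `[3, 3, 3, 3, 1, 1, 1, 1, 1]`, matching the example above.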

--------

## chr_to_decimal(string $char): int

Get a decimal code representation of a specific character.

INFO: opposite to UTF8::decimal_to_chr()

EXAMPLE: UTF8::chr_to_decimal('§'); // 0xa7

**Parameters:**
- `string $char`: The input character.

**Return:**
- `int`

--------

## chr_to_hex(int|string $char, string $prefix): string

Get hexadecimal code point (U+xxxx) of a UTF-8 encoded character.

EXAMPLE: UTF8::chr_to_hex('§'); // U+00a7

**Parameters:**
- `int|string $char`: The input character
- `string $prefix [optional]`

**Return:**
- `string`: The code point encoded as U+xxxx.

--------

## chunk_split(string $body, int $chunk_length, string $end): string

Splits a string into smaller chunks and multiple lines, using the specified line ending character.

EXAMPLE: UTF8::chunk_split('ABC-ÖÄÜ-中文空白-κόσμε', 3); // "ABC\r\n-ÖÄ\r\nÜ-中\r\n文空白\r\n-κό\r\nσμε"

**Parameters:**
- `string $body`: The original string to be split.
- `int $chunk_length [optional]`: The maximum character length of a chunk.
- `string $end [optional]`: The character(s) to be inserted at the end of each chunk.

**Return:**
- `string`: The chunked string.

--------

## clean(string $str, bool $remove_bom, bool $normalize_whitespace, bool $normalize_msword, bool $keep_non_breaking_space, bool $replace_diamond_question_mark, bool $remove_invisible_characters, bool $remove_invisible_characters_url_encoded): string

Accepts a string and removes all non-UTF-8 characters from it + extras if needed.

EXAMPLE: UTF8::clean("\xEF\xBB\xBF„Abcdef\xc2\xa0\x20…” — 😃 - Düsseldorf", true, true); // '„Abcdef  …” — 😃 - Düsseldorf'

**Parameters:**
- `string $str`: The string to be sanitized.
- `bool $remove_bom [optional]`: Set to true if you need to remove the UTF BOM.
- `bool $normalize_whitespace [optional]`: Set to true if you need to normalize the whitespace.
- `bool $normalize_msword [optional]`: Set to true if you need to normalize MS Word chars, e.g.: "…" => "..."
- `bool $keep_non_breaking_space [optional]`: Set to true to keep non-breaking spaces, in combination with $normalize_whitespace
- `bool $replace_diamond_question_mark [optional]`: Set to true if you need to remove the diamond question mark, e.g.: "�"
- `bool $remove_invisible_characters [optional]`: Set to false if you do not want to remove invisible characters, e.g.: "\0"
- `bool $remove_invisible_characters_url_encoded [optional]`: Set to true if you also want to remove invisible URL-encoded characters, e.g.: "%0B". WARNING: this may produce false positives, e.g. aa%0Baa -> aaaa.

**Return:**
- `string`: A clean UTF-8 encoded string.

--------

## cleanup(string $str): string

Clean-up a string and show only printable UTF-8 chars at the end + fix UTF-8 encoding.

EXAMPLE: UTF8::cleanup("\xEF\xBB\xBF„Abcdef\xc2\xa0\x20…” — 😃 - Düsseldorf", true, true); // '„Abcdef  …” — 😃 - Düsseldorf'

**Parameters:**
- `string $str`: The input string.

**Return:**
- `string`

--------

## codepoints(string|string[] $arg, bool $use_u_style): int[]|string[]

Accepts a string or an array of strings and returns an array of Unicode code points.

INFO: opposite to UTF8::string()

EXAMPLE: UTF8::codepoints('κöñ'); // array(954, 246, 241)

// ... OR ...

UTF8::codepoints('κöñ', true); // array('U+03ba', 'U+00f6', 'U+00f1')

**Parameters:**
- `string|string[] $arg`: A UTF-8 encoded string or an array of such strings.
- `bool $use_u_style`: If true, code points are returned in U+xxxx format; by default they are returned as integers.

**Return:**
- `int[]|string[]`: The array of code points: int[] for $use_u_style === false, string[] for $use_u_style === true
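
What "returns an array of Unicode code points" means at the byte level can be sketched by decoding UTF-8 by hand. This is an illustration under the assumption of valid UTF-8 input, not the library's code; `utf8CodePoints()` and its `$useUStyle` parameter are hypothetical names chosen to mirror the description above.

```php
<?php
declare(strict_types=1);

/**
 * Decode a valid UTF-8 string into code points (ints or 'U+xxxx' strings).
 * Hypothetical helper for illustration only.
 */
function utf8CodePoints(string $str, bool $useUStyle = false): array
{
    $out = [];
    $len = strlen($str);
    for ($i = 0; $i < $len; $i += $extra + 1) {
        $lead = ord($str[$i]);
        // The lead byte tells how many continuation bytes follow and
        // carries the highest payload bits of the code point.
        if ($lead < 0x80) {
            $cp = $lead;
            $extra = 0;
        } elseif ($lead < 0xE0) {
            $cp = $lead & 0x1F;
            $extra = 1;
        } elseif ($lead < 0xF0) {
            $cp = $lead & 0x0F;
            $extra = 2;
        } else {
            $cp = $lead & 0x07;
            $extra = 3;
        }
        // Each continuation byte contributes its lower six bits.
        for ($j = 1; $j <= $extra; $j++) {
            $cp = ($cp << 6) | (ord($str[$i + $j]) & 0x3F);
        }
        $out[] = $useUStyle ? sprintf('U+%04x', $cp) : $cp;
    }
    return $out;
}

// 'κöñ', written with escapes so the file encoding does not matter.
print_r(utf8CodePoints("\u{3BA}\u{F6}\u{F1}"));       // 954, 246, 241
print_r(utf8CodePoints("\u{3BA}\u{F6}\u{F1}", true)); // U+03ba, U+00f6, U+00f1
```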

--------

## collapse_whitespace(string $str): string

Trims the string and replaces consecutive whitespace characters with a single space. This includes tabs and newline characters, as well as multibyte whitespace such as the thin space and ideographic space.

**Parameters:**
- `string $str`: The input string.

**Return:**
- `string`: A string with trimmed $str and condensed whitespace.

--------

## count_chars(string $str, bool $clean_utf8, bool $try_to_use_mb_functions): int[]

Returns count of characters used in a string.

EXAMPLE: UTF8::count_chars('κaκbκc'); // array('κ' => 3, 'a' => 1, 'b' => 1, 'c' => 1)

**Parameters:**
- `string $str`: The input string.
- `bool $clean_utf8 [optional]`: Remove non UTF-8 chars from the string.
- `bool $try_to_use_mb_functions [optional]`: Set to false, if you don't want to use

**Return:**
- `int[]`: An associative array with characters as keys and their counts as values.

--------

## css_identifier(string $str, string[] $filter, bool $strip_tags, bool $strtolower): string

Create a valid CSS identifier for e.g. "class"- or "id"-attributes.

EXAMPLE: UTF8::css_identifier('123foo/bar!!!'); // _23foo-bar

Copied from https://github.com/drupal/core/blob/8.8.x/lib/Drupal/Component/Utility/Html.php#L95

**Parameters:**
- `string $str`: INFO: if no identifier is given e.g. " " or "", we will create a unique string automatically
- `array $filter`
- `bool $strip_tags`
- `bool $strtolower`

**Return:**
- `string`

--------

## css_stripe_media_queries(string $str): string

Remove css media-queries.

**Parameters:**
- `string $str`

**Return:**
- `string`

--------

## ctype_loaded(): bool

Checks whether ctype is available on the server.

**Parameters:** __nothing__

**Return:**
- `bool`: true if available, false otherwise

--------

## decimal_to_chr(int|string $int): string

Converts an int value into a UTF-8 character.

INFO: opposite to UTF8::string()

EXAMPLE: UTF8::decimal_to_chr(931); // 'Σ'

**Parameters:**
- `int|string $int`

**Return:**
- `string`

--------

## decode_mimeheader(string $str, string $encoding): false|string

Decodes a MIME header field

**Parameters:**
- `string $str`
- `string $encoding [optional]`: Set the charset for e.g. "mb_" function

**Return:**
- `false|string`: A decoded MIME field on success, or false if an error occurs during the decoding.

--------

## emoji_decode(string $str, bool $use_reversible_string_mappings): string

Decodes a string which was encoded by "UTF8::emoji_encode()".

INFO: opposite to UTF8::emoji_encode()

EXAMPLE: UTF8::emoji_decode('foo CHARACTER_OGRE', false); // 'foo 👹'

// UTF8::emoji_decode('foo _-_PORTABLE_UTF8_-_308095726_-_627590803_-_8FTU_ELBATROP_-_', true); // 'foo 👹'

**Parameters:**
- `string $str`: The input string.
- `bool $use_reversible_string_mappings [optional]`: When TRUE, we use a reversible string mapping between "emoji_encode" and "emoji_decode".

**Return:**
- `string`

--------

## emoji_encode(string $str, bool $use_reversible_string_mappings): string

Encode a string with emoji chars into a non-emoji string.

INFO: opposite to UTF8::emoji_decode()

EXAMPLE: UTF8::emoji_encode('foo 👹', false); // 'foo CHARACTER_OGRE'

// UTF8::emoji_encode('foo 👹', true); // 'foo _-_PORTABLE_UTF8_-_308095726_-_627590803_-_8FTU_ELBATROP_-_'

**Parameters:**
- `string $str`: The input string
- `bool $use_reversible_string_mappings [optional]`: When TRUE, we use a reversible string mapping between "emoji_encode" and "emoji_decode".

**Return:**
- `string`

--------

## emoji_from_country_code(string $country_code_iso_3166_1): string

Convert any two-letter country code (ISO 3166-1) to the corresponding Emoji.

**Parameters:**
- `string $country_code_iso_3166_1`: e.g. DE

**Return:**
- `string`: Emoji or empty string on error.
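
Flag emoji are pairs of Unicode "regional indicator" symbols (U+1F1E6 for A through U+1F1FF for Z), which makes the conversion described above a simple mapping. The following sketch shows one way it could be done in plain PHP; `countryCodeToFlag()` is a hypothetical helper, not the library's implementation, and it only checks the format of the code, not whether the code is actually assigned in ISO 3166-1.

```php
<?php
declare(strict_types=1);

/**
 * Map a two-letter country code to its flag emoji.
 * Hypothetical helper; returns '' if the input is not two ASCII letters.
 */
function countryCodeToFlag(string $countryCode): string
{
    $code = strtoupper($countryCode);
    if (preg_match('/^[A-Z]{2}\z/', $code) !== 1) {
        return '';
    }

    $flag = '';
    foreach (str_split($code) as $letter) {
        // U+1F1E6 (regional indicator A) is F0 9F 87 A6 in UTF-8; the other
        // 25 letters follow consecutively, so only the last byte changes.
        $flag .= "\xF0\x9F\x87" . chr(0xA6 + ord($letter) - ord('A'));
    }
    return $flag;
}

var_dump(countryCodeToFlag('DE') === "\u{1F1E9}\u{1F1EA}"); // bool(true)
```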

--------

## encode(string $to_encoding, string $str, bool $auto_detect_the_from_encoding, string $from_encoding): string

Encode a string with a new charset-encoding.

INFO: This function will also try to fix broken / double encoding, so you can also call it on a UTF-8 string without messing up the string.

EXAMPLE: UTF8::encode('ISO-8859-1', '-ABC-中文空白-'); // '-ABC-????-'

// UTF8::encode('UTF-8', '-ABC-中文空白-'); // '-ABC-中文空白-'

// UTF8::encode('HTML', '-ABC-中文空白-'); // '-ABC-&#20013;&#25991;&#31354;&#30333;-'

// UTF8::encode('BASE64', '-ABC-中文空白-'); // 'LUFCQy3kuK3mlofnqbrnmb0t'

**Parameters:**
- `string $to_encoding`: e.g. 'UTF-16', 'UTF-8', 'ISO-8859-1', etc.
- `string $str`: The input string
- `bool $auto_detect_the_from_encoding [optional]`: Force the new encoding (we try to fix broken / double encoding for UTF-8), otherwise we auto-detect the current string-encoding
- `string $from_encoding [optional]`: e.g. 'UTF-16', 'UTF-8', 'ISO-8859-1', etc. An empty string will trigger the autodetect anyway.

**Return:**
- `string`

--------

## encode_mimeheader(string $str, string $from_charset, string $to_charset, string $transfer_encoding, string $linefeed, int $indent): false|string

**Parameters:**
- `string $str`
- `string $from_charset [optional]`: Set the input charset.
- `string $to_charset [optional]`: Set the output charset.
- `string $transfer_encoding [optional]`: Set the transfer encoding.
- `string $linefeed [optional]`: Set the used linefeed.
- `int $indent [optional]`: Set the max length indent.

**Return:**
- `false|string`: An encoded MIME field on success, or false if an error occurs during the encoding.

--------

## extract_text(string $str, string $search, int|null $length, string $replacer_for_skipped_text, string $encoding): string

Create an extract from a sentence; if the search string is found, the method tries to center it in the output.

**Parameters:**
- `string $str`: The input string.
- `string $search`: The searched string.
- `int|null $length [optional]`: Default: null === text->length / 2
- `string $replacer_for_skipped_text [optional]`: Default: …
- `string $encoding [optional]`: Set the charset for e.g. "mb_" function

**Return:**
- `string`

--------

## file_get_contents(string $filename, bool $use_include_path, resource|null $context, int|null $offset, int|null $max_length, int $timeout, bool $convert_to_utf8, string $from_encoding): false|string

Reads entire file into a string.

EXAMPLE: UTF8::file_get_contents('utf16le.txt'); // ...

WARNING: Do not use UTF-8 Option ($convert_to_utf8) for binary files (e.g.: images) !!!

**Parameters:**
- `string $filename`: Name of the file to read.
- `bool $use_include_path [optional]`: Prior to PHP 5, this parameter is called use_include_path and is a bool. As of PHP 5 the FILE_USE_INCLUDE_PATH can be used to trigger include path search.
- `resource|null $context [optional]`: A valid context resource created with stream_context_create. If you don't need to use a custom context, you can skip this parameter by passing null.
- `int|null $offset [optional]`: The offset where the reading starts.
- `int|null $max_length [optional]`: Maximum length of data read. The default is to read until end of file is reached.
- `int $timeout`: The time in seconds for the timeout.
- `bool $convert_to_utf8`: WARNING!!! Maybe you can't use this option for some files, because they used non default utf-8 chars. Binary files like images or pdf will not be converted.
- `string $from_encoding [optional]`: e.g. 'UTF-16', 'UTF-8', 'ISO-8859-1', etc. An empty string will trigger the autodetect anyway.

**Return:**
- `false|string`: The function returns the read data as string or false on failure.

--------

## file_has_bom(string $file_path): bool

Checks if a file starts with BOM (Byte Order Mark) character.

EXAMPLE: UTF8::file_has_bom('utf8_with_bom.txt'); // true

**Parameters:**
- `string $file_path`: Path to a valid file.

**Return:**
- `bool`: true if the file has BOM at the start, false otherwise

--------

## filter(array|object|string $var, int $normalization_form, string $leading_combining): mixed

Normalizes to UTF-8 NFC, converting from WINDOWS-1252 when needed.

EXAMPLE: UTF8::filter(array("\xE9", 'à', 'a')); // array('é', 'à', 'a')

**Parameters:**
- `TFilter $var`
- `int $normalization_form`
- `string $leading_combining`

**Return:**
- `mixed`

--------

## filter_input(int $type, string $variable_name, int $filter, int|int[]|null $options): mixed

"filter_input()"-wrapper that normalizes to UTF-8 NFC, converting from WINDOWS-1252 when needed.

Gets a specific external variable by name and optionally filters it.

EXAMPLE: // _GET['foo'] = 'bar'; UTF8::filter_input(INPUT_GET, 'foo', FILTER_UNSAFE_RAW); // 'bar'

**Parameters:**
- `int $type`: One of INPUT_GET, INPUT_POST, INPUT_COOKIE, INPUT_SERVER, or INPUT_ENV.
- `string $variable_name`: Name of a variable to get.
- `int $filter [optional]`: The ID of the filter to apply. The manual page lists the available filters.
- `int|int[]|null $options [optional]`: Associative array of options or bitwise disjunction of flags. If filter accepts options, flags can be provided in "flags" field of array.

**Return:**
- `mixed`: Value of the requested variable on success, FALSE if the filter fails, or NULL if the variable_name variable is not set. If the flag FILTER_NULL_ON_FAILURE is used, it returns FALSE if the variable is not set and NULL if the filter fails.

--------

## filter_input_array(int $type, array|null $definition, bool $add_empty): mixed

"filter_input_array()"-wrapper that normalizes to UTF-8 NFC, converting from WINDOWS-1252 when needed.

Gets external variables and optionally filters them.

EXAMPLE: // _GET['foo'] = 'bar'; UTF8::filter_input_array(INPUT_GET, array('foo' => 'FILTER_UNSAFE_RAW')); // array('bar')

**Parameters:**
- `int $type`: One of INPUT_GET, INPUT_POST, INPUT_COOKIE, INPUT_SERVER, or INPUT_ENV.
- `array|null $definition [optional]`: An array defining the arguments. A valid key is a string containing a variable name and a valid value is either a filter type, or an array optionally specifying the filter, flags and options. If the value is an array, valid keys are filter which specifies the filter type, flags which specifies any flags that apply to the filter, and options which specifies any options that apply to the filter. See the example below for a better understanding.

  This parameter can be also an integer holding a filter constant. Then all values in the input array are filtered by this filter.
- `bool $add_empty [optional]`: Add missing keys as NULL to the return value.

**Return:**
- `mixed`: An array containing the values of the requested variables on success, or FALSE on failure. An array value will be FALSE if the filter fails, or NULL if the variable is not set. Or if the flag FILTER_NULL_ON_FAILURE is used, it returns FALSE if the variable is not set and NULL if the filter fails.

--------

## filter_var(float|int|string|null $variable, int $filter, int|int[]|null $options): mixed

"filter_var()"-wrapper that normalizes to UTF-8 NFC, converting from WINDOWS-1252 when needed.

Filters a variable with a specified filter.

EXAMPLE: UTF8::filter_var('-ABC-中文空白-', FILTER_VALIDATE_URL); // false

**Parameters:**
- `float|int|string|null $variable`: Value to filter.
- `int $filter [optional]`: The ID of the filter to apply. The manual page lists the available filters.
- `int|int[]|null $options [optional]`: Associative array of options or bitwise disjunction of flags. If filter accepts options, flags can be provided in "flags" field of array. For the "callback" filter, callable type should be passed. The callback must accept one argument, the value to be filtered, and return the value after filtering/sanitizing it.

      // for filters that accept options, use this format
      $options = array(
          'options' => array(
              'default' => 3, // value to return if the filter fails
              // other options here
              'min_range' => 0
          ),
          'flags' => FILTER_FLAG_ALLOW_OCTAL,
      );
      $var = filter_var('0755', FILTER_VALIDATE_INT, $options);

      // for filter that only accept flags, you can pass them directly
      $var = filter_var('oops', FILTER_VALIDATE_BOOLEAN, FILTER_NULL_ON_FAILURE);

      // for filter that only accept flags, you can also pass as an array
      $var = filter_var('oops', FILTER_VALIDATE_BOOLEAN, array('flags' => FILTER_NULL_ON_FAILURE));

      // callback validate filter
      function foo($value)
      {
          // Expected format: Surname, GivenNames
          if (strpos($value, ", ") === false) return false;
          list($surname, $givennames) = explode(", ", $value, 2);
          $empty = (empty($surname) || empty($givennames));
          $notstrings = (!is_string($surname) || !is_string($givennames));
          if ($empty || $notstrings) {
              return false;
          } else {
              return $value;
          }
      }
      $var = filter_var('Doe, Jane Sue', FILTER_CALLBACK, array('options' => 'foo'));

**Return:**
- `mixed`: The filtered data, or FALSE if the filter fails.

--------

## filter_var_array(array $data, array|int|null $definition, bool $add_empty): mixed

"filter_var_array()"-wrapper that normalizes to UTF-8 NFC, converting from WINDOWS-1252 when needed.

Gets multiple variables and optionally filters them.

EXAMPLE:

    $filters = [
        'name'  => ['filter' => FILTER_CALLBACK, 'options' => [UTF8::class, 'ucwords']],
        'age'   => ['filter' => FILTER_VALIDATE_INT, 'options' => ['min_range' => 1, 'max_range' => 120]],
        'email' => FILTER_VALIDATE_EMAIL,
    ];
    $data = [
        'name'  => 'κόσμε',
        'age'   => '18',
        'email' => 'foo@bar.de'
    ];
    UTF8::filter_var_array($data, $filters, true); // ['name' => 'Κόσμε', 'age' => 18, 'email' => 'foo@bar.de']

**Parameters:**
- `array $data`: An array with string keys containing the data to filter.
- `array|int|null $definition [optional]`: An array defining the arguments. A valid key is a string containing a variable name and a valid value is either a filter type, or an array optionally specifying the filter, flags and options. If the value is an array, valid keys are filter which specifies the filter type, flags which specifies any flags that apply to the filter, and options which specifies any options that apply to the filter. See the example below for a better understanding.

  This parameter can be also an integer holding a filter constant. Then all values in the input array are filtered by this filter.
- `bool $add_empty [optional]`: Add missing keys as NULL to the return value.

**Return:**
- `mixed`: An array containing the values of the requested variables on success, or FALSE on failure. An array value will be FALSE if the filter fails, or NULL if the variable is not set.

` -------- ## finfo_loaded(): bool Checks whether finfo is available on the server. **Parameters:** __nothing__ **Return:** - `bool

true if available, false otherwise

` -------- ## first_char(string $str, int $n, string $encoding): string Returns the first $n characters of the string. **Parameters:** - `string $str

The input string.

` - `int $n

Number of characters to retrieve from the start.

` - `string $encoding [optional]

Set the charset for e.g. "mb_" function

` **Return:** - `string` -------- ## fits_inside(string $str, int $box_size): bool Check if the number of Unicode characters isn't greater than the specified integer. EXAMPLE: UTF8::fits_inside('κόσμε', 6); // false **Parameters:** - `string $str the original string to be checked` - `int $box_size the size in number of chars to be checked against string` **Return:** - `bool

TRUE if string is less than or equal to $box_size, FALSE otherwise.

` -------- ## fix_simple_utf8(string $str): string Try to fix simple broken UTF-8 strings. INFO: Take a look at "UTF8::fix_utf8()" if you need a more advanced fix for broken UTF-8 strings. EXAMPLE: UTF8::fix_simple_utf8('Düsseldorf'); // 'Düsseldorf' If you received an UTF-8 string that was converted from Windows-1252 as it was ISO-8859-1 (ignoring Windows-1252 chars from 80 to 9F) use this function to fix it. See: http://en.wikipedia.org/wiki/Windows-1252 **Parameters:** - `string $str

The input string

` **Return:** - `string`

--------

## fix_utf8(string|string[] $str): string|string[]

Fix a double (or multiple) encoded UTF8 string.

EXAMPLE: `UTF8::fix_utf8('Fédération'); // 'Fédération'`

**Parameters:**
- `TFixUtf8 $str you can use a string or an array of strings`

**Return:**
- `string|string[]

Will return the fixed input-"array" or the fixed input-"string".

`

--------

## getCharDirection(string $char): string

Get the text direction of a specific character.

EXAMPLE: `UTF8::getCharDirection('ا'); // 'RTL'`

**Parameters:**
- `string $char`

**Return:**
- `string

'RTL' or 'LTR'.

`

--------

## getSupportInfo(string|null $key): mixed

Check for php-support.

**Parameters:**
- `string|null $key`

**Return:**
- `mixed Return the full support-"array", if $key === null
return bool-value, if $key is used and available
otherwise return null`

--------

## get_file_type(string $str, array $fallback): null[]|string[]

Warning: this method only works for some file types (png, jpg). If you need more supported types, please use e.g. "finfo".

**Parameters:**
- `string $str`
- `array{ext: (null|string), mime: (null|string), type: (null|string)} $fallback

with these keys: 'ext', 'mime', 'type'`

**Return:**
- `null[]|string[]

with these keys: 'ext', 'mime', 'type'

`

--------

## get_random_string(int $length, string $possible_chars, string $encoding): string

**Parameters:**
- `int $length

Length of the random string.

` - `string $possible_chars [optional]

Characters string for the random selection.

` - `string $encoding [optional]

Set the charset for e.g. "mb_" function

` **Return:** - `string`

--------

## get_unique_string(int|string $extra_entropy, bool $use_md5): string

**Parameters:**
- `int|string $extra_entropy [optional]

Extra entropy via a string or int value.

` - `bool $use_md5 [optional]

Return the unique identifier as md5-hash? Default: true

` **Return:** - `string`

--------

## has_lowercase(string $str): bool

Returns true if the string contains a lower case char, false otherwise.

**Parameters:**
- `string $str

The input string.

` **Return:** - `bool

Whether or not the string contains a lower case character.

`

--------

## has_uppercase(string $str): bool

Returns true if the string contains an upper case char, false otherwise.

**Parameters:**
- `string $str

The input string.

` **Return:** - `bool

Whether or not the string contains an upper case character.

`

--------

## has_whitespace(string $str): bool

Returns true if the string contains whitespace, false otherwise.

**Parameters:**
- `string $str

The input string.

` **Return:** - `bool

Whether or not the string contains whitespace.

`

--------

## hex_to_chr(string $hexdec): false|string

Converts a hexadecimal value into a UTF-8 character.

INFO: opposite to UTF8::chr_to_hex()

EXAMPLE: `UTF8::hex_to_chr('U+00a7'); // '§'`

**Parameters:**
- `string $hexdec

The hexadecimal value.

` **Return:** - `false|string one single UTF-8 character`

--------

## hex_to_int(string $hexdec): false|int

Converts hexadecimal U+xxxx code point representation to integer.

INFO: opposite to UTF8::int_to_hex()

EXAMPLE: `UTF8::hex_to_int('U+00f1'); // 241`

**Parameters:**
- `string $hexdec

The hexadecimal code point representation.

` **Return:** - `false|int

The code point, or false on failure.
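The three code point helpers documented in this reference (`hex_to_int()`, `int_to_hex()` and `hex_to_chr()`) can be combined into a round trip. The following is a minimal usage sketch, assuming the package was installed via Composer and that `vendor/autoload.php` sits next to the script (the path is an assumption). The results in the comments are the ones given in the EXAMPLE lines of this document.

```php
<?php

// Assumption: Composer's autoloader lives next to this script.
require __DIR__ . '/vendor/autoload.php';

use voku\helper\UTF8;

// "U+xxxx" notation -> integer code point.
$codePoint = UTF8::hex_to_int('U+00f1');
var_dump($codePoint); // documented result: int(241)

// Integer code point -> "U+xxxx" notation (the opposite direction).
var_dump(UTF8::int_to_hex(241)); // documented result: 'U+00f1'

// "U+xxxx" notation -> the UTF-8 character itself.
var_dump(UTF8::hex_to_chr('U+00a7')); // documented result: '§'

// hex_to_int() is declared as false|int, so compare strictly against false
// before using the value. The empty string is only a hypothetical bad input.
$maybe = UTF8::hex_to_int('');
if ($maybe === false) {
    echo "not a valid code point notation\n";
}
```

Because the return types are `false|int` and `false|string`, a strict `=== false` check is safer than a loose truthiness test, which would also reject the code point `0`.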

`

--------

## html_encode(string $str, bool $keep_ascii_chars, string $encoding): string

Converts a UTF-8 string to a series of HTML numbered entities.

INFO: opposite to UTF8::html_decode()

EXAMPLE: `UTF8::html_encode('中文空白'); // '&#20013;&#25991;&#31354;&#30333;'`

**Parameters:**
- `string $str

The Unicode string to be encoded as numbered entities.

` - `bool $keep_ascii_chars [optional]

Keep ASCII chars.

` - `string $encoding [optional]

Set the charset for e.g. "mb_" function

` **Return:** - `string HTML numbered entities`

--------

## html_entity_decode(string $str, int|null $flags, string $encoding): string

UTF-8 version of html_entity_decode()

The reason we are not using html_entity_decode() by itself is that, while it is not technically correct to leave out the semicolon at the end of an entity, most browsers will still interpret the entity correctly. html_entity_decode() does not convert entities without semicolons, so we are left with our own little solution here. Bummer.

Convert all HTML entities to their applicable characters.

INFO: opposite to UTF8::html_encode()

EXAMPLE: `UTF8::html_entity_decode('&#20013;&#25991;&#31354;&#30333;'); // '中文空白'`

**Parameters:**
- `string $str

The input string.

` - `int|null $flags [optional]

A bitmask of one or more of the following flags, which specify how to handle quotes and which document type to use. The default is ENT_COMPAT | ENT_HTML401.

Available flags constants
Constant Name Description
ENT_COMPAT Will convert double-quotes and leave single-quotes alone.
ENT_QUOTES Will convert both double and single quotes.
ENT_NOQUOTES Will leave both double and single quotes unconverted.
ENT_HTML401 Handle code as HTML 4.01.
ENT_XML1 Handle code as XML 1.
ENT_XHTML Handle code as XHTML.
ENT_HTML5 Handle code as HTML 5.
` - `string $encoding [optional]

Set the charset for e.g. "mb_" function

` **Return:** - `string the decoded string`

--------

## html_escape(string $str, string $encoding): string

Create an escaped HTML version of the string via "UTF8::htmlspecialchars()".

**Parameters:**
- `string $str`
- `string $encoding [optional]

Set the charset for e.g. "mb_" function

` **Return:** - `string`

--------

## html_stripe_empty_tags(string $str): string

Remove empty html-tag.

e.g.:


**Parameters:**
- `string $str`

**Return:**
- `string`

--------

## htmlentities(string $str, int $flags, string $encoding, bool $double_encode): string

Convert all applicable characters to HTML entities: UTF-8 version of htmlentities().

EXAMPLE: `UTF8::htmlentities('<白-öäü>'); // '&lt;&#30333;-&ouml;&auml;&uuml;&gt;'`

**Parameters:**
- `string $str 

The input string.

` - `int $flags [optional]

A bitmask of one or more of the following flags, which specify how to handle quotes, invalid code unit sequences and the used document type. The default is ENT_COMPAT | ENT_HTML401.

Available flags constants
Constant Name Description
ENT_COMPAT Will convert double-quotes and leave single-quotes alone.
ENT_QUOTES Will convert both double and single quotes.
ENT_NOQUOTES Will leave both double and single quotes unconverted.
ENT_IGNORE Silently discard invalid code unit sequences instead of returning an empty string. Using this flag is discouraged as it may have security implications.
ENT_SUBSTITUTE Replace invalid code unit sequences with a Unicode Replacement Character U+FFFD (UTF-8) or &#xFFFD; (otherwise) instead of returning an empty string.
ENT_DISALLOWED Replace invalid code points for the given document type with a Unicode Replacement Character U+FFFD (UTF-8) or &#xFFFD; (otherwise) instead of leaving them as is. This may be useful, for instance, to ensure the well-formedness of XML documents with embedded external content.
ENT_HTML401 Handle code as HTML 4.01.
ENT_XML1 Handle code as XML 1.
ENT_XHTML Handle code as XHTML.
ENT_HTML5 Handle code as HTML 5.
` - `string $encoding [optional]

Like htmlspecialchars, htmlentities takes an optional third argument encoding which defines encoding used in conversion. Although this argument is technically optional, you are highly encouraged to specify the correct value for your code.

` - `bool $double_encode [optional]

When double_encode is turned off PHP will not encode existing html entities. The default is to convert everything.

` **Return:** - `string

The encoded string.

If the input string contains an invalid code unit sequence within the given encoding an empty string will be returned, unless either the ENT_IGNORE or ENT_SUBSTITUTE flags are set.
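The flag descriptions above are those of PHP's native functions, so their effect can be checked without the library. The following is a minimal sketch using native `htmlspecialchars()`. It assumes, without verifying it here, that the `UTF8::` wrappers hand `$flags` and `$double_encode` through unchanged. The sample strings are made up for illustration.

```php
<?php

$input = "Tom's \"quote\" & <b>";

// ENT_COMPAT: double quotes are converted, single quotes are left alone.
echo htmlspecialchars($input, ENT_COMPAT | ENT_HTML401, 'UTF-8'), "\n";
// Tom's &quot;quote&quot; &amp; &lt;b&gt;

// ENT_QUOTES: both double and single quotes are converted.
echo htmlspecialchars($input, ENT_QUOTES | ENT_HTML401, 'UTF-8'), "\n";
// Tom&#039;s &quot;quote&quot; &amp; &lt;b&gt;

// ENT_NOQUOTES: quotes are left unconverted.
echo htmlspecialchars($input, ENT_NOQUOTES | ENT_HTML401, 'UTF-8'), "\n";
// Tom's "quote" &amp; &lt;b&gt;

// $double_encode = false: existing entities are not encoded a second time.
echo htmlspecialchars('&amp; &', ENT_QUOTES | ENT_HTML401, 'UTF-8', false), "\n";
// &amp; &amp;

// A lone 0xFF byte is never valid UTF-8. Without ENT_IGNORE or
// ENT_SUBSTITUTE the whole result is an empty string.
$broken = "a\xFFb";
var_dump(htmlspecialchars($broken, ENT_COMPAT | ENT_HTML401, 'UTF-8') === '');
// bool(true)

// With ENT_SUBSTITUTE the invalid byte becomes U+FFFD instead.
var_dump(htmlspecialchars($broken, ENT_COMPAT | ENT_SUBSTITUTE, 'UTF-8') === "a\u{FFFD}b");
// bool(true)
```

Passing the flags explicitly keeps the output independent of the PHP version, since the default value of `$flags` differs between PHP releases.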

`

--------

## htmlspecialchars(string $str, int $flags, string $encoding, bool $double_encode): string

Convert only special characters to HTML entities: UTF-8 version of htmlspecialchars()

INFO: Take a look at "UTF8::htmlentities()"

EXAMPLE: `UTF8::htmlspecialchars('<白-öäü>'); // '&lt;白-öäü&gt;'`

**Parameters:**
- `string $str

The string being converted.

` - `int $flags [optional]

A bitmask of one or more of the following flags, which specify how to handle quotes, invalid code unit sequences and the used document type. The default is ENT_COMPAT | ENT_HTML401.

Available flags constants
Constant Name Description
ENT_COMPAT Will convert double-quotes and leave single-quotes alone.
ENT_QUOTES Will convert both double and single quotes.
ENT_NOQUOTES Will leave both double and single quotes unconverted.
ENT_IGNORE Silently discard invalid code unit sequences instead of returning an empty string. Using this flag is discouraged as it may have security implications.
ENT_SUBSTITUTE Replace invalid code unit sequences with a Unicode Replacement Character U+FFFD (UTF-8) or &#xFFFD; (otherwise) instead of returning an empty string.
ENT_DISALLOWED Replace invalid code points for the given document type with a Unicode Replacement Character U+FFFD (UTF-8) or &#xFFFD; (otherwise) instead of leaving them as is. This may be useful, for instance, to ensure the well-formedness of XML documents with embedded external content.
ENT_HTML401 Handle code as HTML 4.01.
ENT_XML1 Handle code as XML 1.
ENT_XHTML Handle code as XHTML.
ENT_HTML5 Handle code as HTML 5.
` - `string $encoding [optional]

Defines encoding used in conversion.

For the purposes of this function, the encodings ISO-8859-1, ISO-8859-15, UTF-8, cp866, cp1251, cp1252, and KOI8-R are effectively equivalent, provided the string itself is valid for the encoding, as the characters affected by htmlspecialchars occupy the same positions in all of these encodings.

` - `bool $double_encode [optional]

When double_encode is turned off PHP will not encode existing html entities, the default is to convert everything.

` **Return:** - `string the converted string.

If the input string contains an invalid code unit sequence within the given encoding an empty string will be returned, unless either the ENT_IGNORE or ENT_SUBSTITUTE flags are set`

--------

## iconv_loaded(): bool

Checks whether iconv is available on the server.

**Parameters:**
__nothing__

**Return:**
- `bool

true if available, false otherwise

`

--------

## int_to_hex(int $int, string $prefix): string

Converts an integer to its hexadecimal U+xxxx code point representation.

INFO: opposite to UTF8::hex_to_int()

EXAMPLE: `UTF8::int_to_hex(241); // 'U+00f1'`

**Parameters:**
- `int $int

The integer to be converted to hexadecimal code point.

` - `string $prefix [optional]`

**Return:**
- `string the code point, or empty string on failure`

--------

## intlChar_loaded(): bool

Checks whether intl-char is available on the server.

**Parameters:**
__nothing__

**Return:**
- `bool

true if available, false otherwise

`

--------

## intl_loaded(): bool

Checks whether intl is available on the server.

**Parameters:**
__nothing__

**Return:**
- `bool

true if available, false otherwise

`

--------

## is_alpha(string $str): bool

Returns true if the string contains only alphabetic chars, false otherwise.

**Parameters:**
- `string $str

The input string.

` **Return:** - `bool

Whether or not $str contains only alphabetic chars.

`

--------

## is_alphanumeric(string $str): bool

Returns true if the string contains only alphabetic and numeric chars, false otherwise.

**Parameters:**
- `string $str

The input string.

` **Return:** - `bool

Whether or not $str contains only alphanumeric chars.

`

--------

## is_ascii(string $str): bool

Checks if a string is 7 bit ASCII.

EXAMPLE: `UTF8::is_ascii('白'); // false`

**Parameters:**
- `string $str

The string to check.

` **Return:** - `bool

true if it is ASCII
false otherwise

`

--------

## is_base64(string|null $str, bool $empty_string_is_valid): bool

Returns true if the string is base64 encoded, false otherwise.

EXAMPLE: `UTF8::is_base64('4KSu4KWL4KSo4KS/4KSa'); // true`

**Parameters:**
- `string|null $str

The input string.

` - `bool $empty_string_is_valid [optional]

Is an empty string valid base64 or not?

` **Return:** - `bool

Whether or not $str is base64 encoded.

`

--------

## is_binary(int|string $input, bool $strict): bool

Check if the input is binary (this looks like a hack).

EXAMPLE: `UTF8::is_binary(01); // true`

**Parameters:**
- `int|string $input`
- `bool $strict`

**Return:**
- `bool`

--------

## is_binary_file(string $file): bool

Check if the file is binary.

EXAMPLE: `UTF8::is_binary_file('./utf32.txt'); // true`

**Parameters:**
- `string $file`

**Return:**
- `bool`

--------

## is_blank(string $str): bool

Returns true if the string contains only whitespace chars, false otherwise.

**Parameters:**
- `string $str

The input string.

` **Return:** - `bool

Whether or not $str contains only whitespace characters.

`

--------

## is_bom(string $str): bool

Checks if the given string is equal to any "Byte Order Mark".

WARNING: Use "UTF8::string_has_bom()" if you want to check for a BOM inside a string.

EXAMPLE: `UTF8::is_bom("\xef\xbb\xbf"); // true`

**Parameters:**
- `string $str

The input string.

` **Return:** - `bool

true if $str is a Byte Order Mark, false otherwise.

`

--------

## is_empty(array|float|int|string $str): bool

Determine whether the string is considered to be empty.

A variable is considered empty if it does not exist or if its value equals FALSE. empty() does not generate a warning if the variable does not exist.

**Parameters:**
- `array|float|int|string $str`

**Return:**
- `bool

Whether or not $str is empty().

`

--------

## is_hexadecimal(string $str): bool

Returns true if the string contains only hexadecimal chars, false otherwise.

**Parameters:**
- `string $str

The input string.

` **Return:** - `bool

Whether or not $str contains only hexadecimal chars.

`

--------

## is_html(string $str): bool

Check if the string contains any HTML tags.

EXAMPLE: `UTF8::is_html('<b>lall</b>'); // true`

**Parameters:**
- `string $str

The input string.

` **Return:** - `bool

Whether or not $str contains html elements.

`

--------

## is_json(string $str, bool $only_array_or_object_results_are_valid): bool

Try to check if "$str" is a JSON-string.

EXAMPLE: `UTF8::is_json('{"array":[1,"¥","ä"]}'); // true`

**Parameters:**
- `string $str

The input string.

` - `bool $only_array_or_object_results_are_valid [optional]

Only array and objects are valid json results.

` **Return:** - `bool

Whether or not the $str is in JSON format.

`

--------

## is_lowercase(string $str): bool

**Parameters:**
- `string $str

The input string.

` **Return:** - `bool

Whether or not $str contains only lowercase chars.

`

--------

## is_printable(string $str, bool $ignore_control_characters): bool

Returns true if the string contains only printable (non-invisible) chars, false otherwise.

**Parameters:**
- `string $str

The input string.

` - `bool $ignore_control_characters [optional]

Ignore control characters like [LRM] or [LSEP].

` **Return:** - `bool

Whether or not $str contains only printable (non-invisible) chars.

`

--------

## is_punctuation(string $str): bool

Returns true if the string contains only punctuation chars, false otherwise.

**Parameters:**
- `string $str

The input string.

` **Return:** - `bool

Whether or not $str contains only punctuation chars.

`

--------

## is_serialized(string $str): bool

Returns true if the string is serialized, false otherwise.

**Parameters:**
- `string $str

The input string.

` **Return:** - `bool

Whether or not $str is serialized.

`

--------

## is_uppercase(string $str): bool

Returns true if the string contains only upper case chars, false otherwise.

**Parameters:**
- `string $str

The input string.

` **Return:** - `bool

Whether or not $str contains only upper case characters.

`

--------

## is_url(string $url, bool $disallow_localhost): bool

Check if $url is a correct URL.

**Parameters:**
- `string $url`
- `bool $disallow_localhost`

**Return:**
- `bool`

--------

## is_utf8(int|string|string[]|null $str, bool $strict): bool

Checks whether the passed input contains only byte sequences that appear to be valid UTF-8.

EXAMPLE:
`UTF8::is_utf8(['Iñtërnâtiônàlizætiøn', 'foo']); // true`
`UTF8::is_utf8(["Iñtërnâtiônàlizætiøn\xA0\xA1", 'bar']); // false`

**Parameters:**
- `int|string|string[]|null $str

The input to be checked.

` - `bool $strict

Check also if the string is not UTF-16 or UTF-32.

` **Return:** - `bool`

--------

## is_utf16(string $str, bool $check_if_string_is_binary): false|int

Check if the string is UTF-16.

EXAMPLE:
`UTF8::is_utf16(file_get_contents('utf-16-le.txt')); // 1`
`UTF8::is_utf16(file_get_contents('utf-16-be.txt')); // 2`
`UTF8::is_utf16(file_get_contents('utf-8.txt')); // false`

**Parameters:**
- `string $str

The input string.

` - `bool $check_if_string_is_binary`

**Return:**
- `false|int false if it's not UTF-16,
1 for UTF-16LE,
2 for UTF-16BE`

--------

## is_utf32(string $str, bool $check_if_string_is_binary): false|int

Check if the string is UTF-32.

EXAMPLE:
`UTF8::is_utf32(file_get_contents('utf-32-le.txt')); // 1`
`UTF8::is_utf32(file_get_contents('utf-32-be.txt')); // 2`
`UTF8::is_utf32(file_get_contents('utf-8.txt')); // false`

**Parameters:**
- `string $str

The input string.

` - `bool $check_if_string_is_binary`

**Return:**
- `false|int false if it's not UTF-32,
1 for UTF-32LE,
2 for UTF-32BE`

--------

## json_decode(string $json, bool $assoc, int $depth, int $options): mixed

(PHP 5 >= 5.2.0, PECL json >= 1.2.0)
Decodes a JSON string.

EXAMPLE: `UTF8::json_decode('[1,"\u00a5","\u00e4"]'); // array(1, '¥', 'ä')`

**Parameters:**
- `string $json

The json string being decoded.

This function only works with UTF-8 encoded strings.

PHP implements a superset of JSON - it will also encode and decode scalar types and NULL. The JSON standard only supports these values when they are nested inside an array or an object.

` - `bool $assoc [optional]

When TRUE, returned objects will be converted into associative arrays.

` - `int $depth [optional]

User specified recursion depth.

` - `int $options [optional]

Bitmask of JSON decode options. Currently only JSON_BIGINT_AS_STRING is supported (default is to cast large integers as floats)

` **Return:** - `mixed

The value encoded in json in appropriate PHP type. Values true, false and null (case-insensitive) are returned as TRUE, FALSE and NULL respectively. NULL is returned if the json cannot be decoded or if the encoded data is deeper than the recursion limit.
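The `$assoc`, `$depth` and `$options` parameters mirror PHP's native `json_decode()`. The following is a minimal sketch with the native function, assuming the `UTF8::` wrapper passes these arguments through unchanged. The JSON document is a made-up sample.

```php
<?php

// Hypothetical sample: an integer too large for a PHP int, plus an escaped "ä".
$json = '{"id": 12345678901234567890, "name": "\u00e4"}';

// $assoc = true: JSON objects become associative arrays.
$asArray = json_decode($json, true);
var_dump(is_float($asArray['id'])); // bool(true): large integers are cast to float
var_dump($asArray['name']);         // string(2) "ä"

// JSON_BIGINT_AS_STRING keeps every digit by returning a string instead.
$exact = json_decode($json, true, 512, JSON_BIGINT_AS_STRING);
var_dump($exact['id']); // string(20) "12345678901234567890"

// $assoc = false (the default): JSON objects become stdClass instances.
$asObject = json_decode($json);
var_dump($asObject instanceof stdClass); // bool(true)

// NULL is returned when the input cannot be decoded ...
var_dump(json_decode('{broken')); // NULL

// ... or when the data is nested deeper than $depth allows.
var_dump(json_decode('[[1]]', true, 1)); // NULL
```

Since a valid JSON `null` also decodes to NULL, `json_last_error()` is the usual way to tell a failure apart from a legitimate null value.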

`

--------

## json_encode(mixed $value, int $options, int $depth): false|string

(PHP 5 >= 5.2.0, PECL json >= 1.2.0)
Returns the JSON representation of a value.

EXAMPLE: `UTF8::json_encode(array(1, '¥', 'ä')); // '[1,"\u00a5","\u00e4"]'`

**Parameters:**
- `mixed $value

The value being encoded. Can be any type except a resource.

All string data must be UTF-8 encoded.

PHP implements a superset of JSON - it will also encode and decode scalar types and NULL. The JSON standard only supports these values when they are nested inside an array or an object.

` - `int $options [optional]

Bitmask consisting of JSON_HEX_QUOT, JSON_HEX_TAG, JSON_HEX_AMP, JSON_HEX_APOS, JSON_NUMERIC_CHECK, JSON_PRETTY_PRINT, JSON_UNESCAPED_SLASHES, JSON_FORCE_OBJECT, JSON_UNESCAPED_UNICODE. The behaviour of these constants is described on the JSON constants page.

` - `int $depth [optional]

Set the maximum depth. Must be greater than zero.

` **Return:** - `false|string A JSON encoded string on success or
FALSE on failure`

--------

## json_loaded(): bool

Checks whether JSON is available on the server.

**Parameters:**
__nothing__

**Return:**
- `bool

true if available, false otherwise

`

--------

## lcfirst(string $str, string $encoding, bool $clean_utf8, string|null $lang, bool $try_to_keep_the_string_length): string

Makes string's first char lowercase.

EXAMPLE: `UTF8::lcfirst('ÑTËRNÂTIÔNÀLIZÆTIØN'); // ñTËRNÂTIÔNÀLIZÆTIØN`

**Parameters:**
- `string $str

The input string

` - `string $encoding [optional]

Set the charset for e.g. "mb_" function

` - `bool $clean_utf8 [optional]

Remove non UTF-8 chars from the string.

` - `string|null $lang [optional]

Set the language for special cases: az, el, lt, tr

` - `bool $try_to_keep_the_string_length [optional]

true === try to keep the string length: e.g. ẞ -> ß

` **Return:** - `string the resulting string`

--------

## lcwords(string $str, string[] $exceptions, string $char_list, string $encoding, bool $clean_utf8, string|null $lang, bool $try_to_keep_the_string_length): string

Lowercase for all words in the string.

**Parameters:**
- `string $str

The input string.

` - `string[] $exceptions [optional]

Exclusion for some words.

` - `string $char_list [optional]

Additional chars that belong to words and do not start a new word.

` - `string $encoding [optional]

Set the charset.

` - `bool $clean_utf8 [optional]

Remove non UTF-8 chars from the string.

` - `string|null $lang [optional]

Set the language for special cases: az, el, lt, tr

` - `bool $try_to_keep_the_string_length [optional]

true === try to keep the string length: e.g. ẞ -> ß

` **Return:** - `string`

--------

## ltrim(string $str, string|null $chars): string

Strip whitespace or other characters from the beginning of a UTF-8 string.

EXAMPLE: `UTF8::ltrim(' 中文空白  '); // '中文空白  '`

**Parameters:**
- `string $str

The string to be trimmed

` - `string|null $chars

Optional characters to be stripped

` **Return:** - `string the string with unwanted characters stripped from the left`

--------

## max(string[]|string $arg): string|null

Returns the UTF-8 character with the maximum code point in the given data.

EXAMPLE: `UTF8::max('abc-äöü-中文空白'); // '空'`

**Parameters:**
- `string[]|string $arg

A UTF-8 encoded string or an array of such strings.

` **Return:** - `string|null the character with the highest code point, or null on failure or empty input`

--------

## max_chr_width(string $str): int

Calculates and returns the maximum number of bytes taken by any UTF-8 encoded character in the given string.

EXAMPLE: `UTF8::max_chr_width('Intërnâtiônàlizætiøn'); // 2`

**Parameters:**
- `string $str

The original Unicode string.

` **Return:** - `int

Max byte lengths of the given chars.
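To make "maximum number of bytes per character" concrete, the following is a minimal native sketch of the same measurement. `max_bytes_per_char()` is a hypothetical helper written for this illustration; it is not part of the library and says nothing about how `UTF8::max_chr_width()` is implemented.

```php
<?php

/**
 * Hypothetical helper (not part of the library): returns the byte length
 * of the widest UTF-8 character in $str, or 0 for empty or invalid input.
 */
function max_bytes_per_char(string $str): int
{
    // The /u modifier makes PCRE treat $str as UTF-8, so an empty pattern
    // splits between characters rather than between bytes.
    $chars = preg_split('//u', $str, -1, PREG_SPLIT_NO_EMPTY);
    if ($chars === false || $chars === []) {
        return 0;
    }

    // strlen() counts bytes, not characters.
    return max(array_map('strlen', $chars));
}

var_dump(max_bytes_per_char('Intërnâtiônàlizætiøn')); // int(2)
var_dump(max_bytes_per_char('中文空白'));              // int(3)
var_dump(max_bytes_per_char('abc'));                   // int(1)
```

Accented Latin letters take two bytes in UTF-8 and the CJK characters used here take three, which is why the first result matches the documented value of 2.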

`

--------

## mbstring_loaded(): bool

Checks whether mbstring is available on the server.

**Parameters:**
__nothing__

**Return:**
- `bool

true if available, false otherwise

`

--------

## min(string|string[] $arg): string|null

Returns the UTF-8 character with the minimum code point in the given data.

EXAMPLE: `UTF8::min('abc-äöü-中文空白'); // '-'`

**Parameters:**
- `string|string[] $arg A UTF-8 encoded string or an array of such strings.`

**Return:**
- `string|null

The character with the lowest code point, or null on failure or empty input.

`

--------

## normalize_encoding(mixed $encoding, mixed $fallback): mixed|string

Normalize the encoding-"name" input.

EXAMPLE: `UTF8::normalize_encoding('UTF8'); // 'UTF-8'`

**Parameters:**
- `mixed $encoding

e.g.: ISO, UTF8, WINDOWS-1251 etc.

` - `string|TNormalizeEncodingFallback $fallback

e.g.: UTF-8

` **Return:** - `mixed|string

e.g.: ISO-8859-1, UTF-8, WINDOWS-1251 etc.
Will return an empty string as fallback (by default)

`

--------

## normalize_line_ending(string $str, string|string[] $replacer): string

Standardize line ending to unix-like.

**Parameters:**
- `string $str

The input string.

` - `string|string[] $replacer

The replacer char e.g. "\n" (Linux) or "\r\n" (Windows). You can also use \PHP_EOL here.

` **Return:** - `string

A string with normalized line ending.

`

--------

## normalize_msword(string $str): string

Normalize some MS Word special characters.

EXAMPLE: `UTF8::normalize_msword('„Abcdef…”'); // '"Abcdef..."'`

**Parameters:**
- `string $str

The string to be normalized.

` **Return:** - `string

A string with normalized characters for commonly used chars in Word documents.

`

--------

## normalize_whitespace(string $str, bool $keep_non_breaking_space, bool $keep_bidi_unicode_controls, bool $normalize_control_characters): string

Normalize the whitespace.

EXAMPLE: `UTF8::normalize_whitespace("abc-\xc2\xa0-öäü-\xe2\x80\xaf-\xE2\x80\xAC", true); // "abc-\xc2\xa0-öäü- -"`

**Parameters:**
- `string $str

The string to be normalized.

` - `bool $keep_non_breaking_space [optional]

Set to true, to keep non-breaking-spaces.

` - `bool $keep_bidi_unicode_controls [optional]

Set to true, to keep non-printable (for the web) bidirectional text chars.

` - `bool $normalize_control_characters [optional]

Set to true, to convert e.g. LINE-, PARAGRAPH-SEPARATOR with "\n" and LINE TABULATION with "\t".

` **Return:** - `string

A string with normalized whitespace.

`

--------

## ord(string $chr, string $encoding): int

Calculates Unicode code point of the given UTF-8 encoded character.

INFO: opposite to UTF8::chr()

EXAMPLE: `UTF8::ord('☃'); // 0x2603`

**Parameters:**
- `string $chr

The character of which to calculate code point.

` - `string $encoding [optional]

Set the charset for e.g. "mb_" function

` **Return:** - `int

Unicode code point of the given character,
0 on invalid UTF-8 byte sequence

`

--------

## parse_str(string $str, array $result, bool $clean_utf8): bool

Parses the string into an array (into the second parameter).

WARNING: Unlike "parse_str()", this method does not (re-)place variables in the current scope, if the second parameter is not set!

EXAMPLE: `UTF8::parse_str('Iñtërnâtiônéàlizætiøn=測試&arr[]=foo+測試&arr[]=ການທົດສອບ', $array); echo $array['Iñtërnâtiônéàlizætiøn']; // '測試'`

**Parameters:**
- `string $str

The input string.

` - `array $result

The result will be returned into this reference parameter.

` - `bool $clean_utf8 [optional]

Remove non UTF-8 chars from the string.

` **Return:** - `bool

Will return false if PHP can't parse the string and no $result is available.

`

--------

## pcre_utf8_support(): bool

Checks whether the /u modifier, which enables Unicode support in PCRE, is available.

**Parameters:**
__nothing__

**Return:**
- `bool

true if support is available,
false otherwise

`

--------

## range(int|string $var1, int|string $var2, bool $use_ctype, string $encoding, float|int $step): string[]

Create an array containing a range of UTF-8 characters.

EXAMPLE: `UTF8::range('κ', 'ζ'); // array('κ', 'ι', 'θ', 'η', 'ζ',)`

**Parameters:**
- `int|string $var1

Numeric or hexadecimal code points, or a UTF-8 character to start from.

` - `int|string $var2

Numeric or hexadecimal code points, or a UTF-8 character to end at.

` - `bool $use_ctype

use ctype to detect numeric and hexadecimal, otherwise we will use a simple "is_numeric"

` - `string $encoding [optional]

Set the charset for e.g. "mb_" function

` - `float|int $step [optional]

If a step value is given, it will be used as the increment between elements in the sequence. step should be given as a positive number. If not specified, step will default to 1.

` **Return:** - `string[]`

--------

## rawurldecode(string $str, bool $multi_decode): string

Multi decode HTML entity + fix urlencoded-win1252-chars.

EXAMPLE: `UTF8::rawurldecode('tes%20öäü%20\u00edtest+test'); // 'tes öäü ítest+test'`

e.g.:
- `'test+test' => 'test+test'`
- `'D&#252;sseldorf' => 'Düsseldorf'`
- `'D%FCsseldorf' => 'Düsseldorf'`
- `'D&#xFC;sseldorf' => 'Düsseldorf'`
- `'D%26%23xFC%3Bsseldorf' => 'Düsseldorf'`
- `'DÃ¼sseldorf' => 'Düsseldorf'`
- `'D%C3%BCsseldorf' => 'Düsseldorf'`
- `'D%C3%83%C2%BCsseldorf' => 'Düsseldorf'`
- `'D%25C3%2583%25C2%25BCsseldorf' => 'Düsseldorf'`

**Parameters:**
- `string $str

The input string.

` - `bool $multi_decode

Decode as often as possible.

` **Return:** - `string

The decoded URL, as a string.

`

--------

## regex_replace(string $str, string $pattern, string $replacement, string $options, string $delimiter): string

Replaces all occurrences of $pattern in $str by $replacement.

**Parameters:**
- `string $str

The input string.

` - `string $pattern

The regular expression pattern.

` - `string $replacement

The string to replace with.

` - `string $options [optional]

Matching conditions to be used.

` - `string $delimiter [optional]

Delimiter for the regex. Default: '/'

` **Return:** - `string`

--------

## remove_bom(string $str): string

Remove the BOM from UTF-8 / UTF-16 / UTF-32 strings.

EXAMPLE: `UTF8::remove_bom("\xEF\xBB\xBFΜπορώ να"); // 'Μπορώ να'`

**Parameters:**
- `string $str

The input string.

` **Return:** - `string

A string without UTF-BOM.
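A BOM is invisible when printed, so a byte count is the easiest way to see what `remove_bom()` does. The following is a minimal usage sketch, assuming the package was installed via Composer and that `vendor/autoload.php` sits next to the script (the path is an assumption). The results marked as documented come from the EXAMPLE lines in this reference.

```php
<?php

// Assumption: Composer's autoloader lives next to this script.
require __DIR__ . '/vendor/autoload.php';

use voku\helper\UTF8;

// The UTF-8 BOM is the three bytes EF BB BF in front of the actual text.
$bom = "\xEF\xBB\xBF";
$raw = $bom . 'Μπορώ να';

// is_bom() tests whether a string *is* a BOM, not whether it contains one.
var_dump(UTF8::is_bom($bom)); // documented result: true

$clean = UTF8::remove_bom($raw);
var_dump($clean); // documented result: 'Μπορώ να'

// Number of bytes that were stripped; 3 if the BOM was removed as documented.
var_dump(strlen($raw) - strlen($clean));
```

Stripping the BOM right after reading a file avoids surprises later, for example when the first line is compared against an expected header.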

`

--------

## remove_duplicates(string $str, string|string[] $what): string

Removes duplicate occurrences of a string in another string.

EXAMPLE: `UTF8::remove_duplicates('öäü-κόσμεκόσμε-äöü', 'κόσμε'); // 'öäü-κόσμε-äöü'`

**Parameters:**
- `string $str

The base string.

` - `string|string[] $what

String to search for in the base string.

` **Return:** - `string

A string with removed duplicates.

`

--------

## remove_html(string $str, string $allowable_tags): string

Remove html via "strip_tags()" from the string.

**Parameters:**
- `string $str

The input string.

` - `string $allowable_tags [optional]

You can use the optional second parameter to specify tags which should not be stripped. Default: null

` **Return:** - `string

A string without HTML tags.

`

--------

## remove_html_breaks(string $str, string $replacement): string

Remove all breaks [`<br>` | `\r\n` | `\r` | `\n` | ...] from the string.

**Parameters:**
- `string $str

The input string.

` - `string $replacement [optional]

Default is an empty string.

` **Return:** - `string

A string without breaks.

`

--------

## remove_invisible_characters(string $str, bool $url_encoded, string $replacement, bool $keep_basic_control_characters): string

Remove invisible characters from a string.

e.g.: This prevents sandwiching null characters between ascii characters, like Java\0script.

EXAMPLE: `UTF8::remove_invisible_characters("κόσ\0με"); // 'κόσμε'`

Copied from https://github.com/bcit-ci/CodeIgniter/blob/develop/system/core/Common.php

**Parameters:**
- `string $str

The input string.

` - `bool $url_encoded [optional]

Try to remove url encoded control character. WARNING: maybe contains false-positives e.g. aa%0Baa -> aaaa.
Default: false

` - `string $replacement [optional]

The replacement character.

` - `bool $keep_basic_control_characters [optional]

Keep control characters like [LRM] or [LSEP].

` **Return:** - `string

A string without invisible chars.

`

--------

## remove_left(string $str, string $substring, string $encoding): string

Returns a new string with the prefix $substring removed, if present.

**Parameters:**
- `string $str

The input string.

` - `string $substring

The prefix to remove.

` - `string $encoding [optional]

Default: 'UTF-8'

` **Return:** - `string

A string without the prefix $substring.

`

--------

## remove_right(string $str, string $substring, string $encoding): string

Returns a new string with the suffix $substring removed, if present.

**Parameters:**
- `string $str`
- `string $substring

The suffix to remove.

` - `string $encoding [optional]

Default: 'UTF-8'

` **Return:** - `string

A string having a $str without the suffix $substring.

`

--------

## replace(string $str, string $search, string $replacement, bool $case_sensitive): string

Replaces all occurrences of $search in $str by $replacement.

**Parameters:**
- `string $str

The input string.

` - `string $search

The needle to search for.

` - `string $replacement

The string to replace with.

` - `bool $case_sensitive [optional]

Whether or not to enforce case-sensitivity. Default: true

` **Return:** - `string

A string with replaced parts.

`

--------

## replace_all(string $str, array $search, array|string $replacement, bool $case_sensitive): string

Replaces all occurrences of $search in $str by $replacement.

**Parameters:**
- `string $str

The input string.

` - `array $search

The elements to search for.

` - `array|string $replacement

The string to replace with.

` - `bool $case_sensitive [optional]

Whether or not to enforce case-sensitivity. Default: true

` **Return:** - `string

A string with replaced parts.

`

--------

## replace_diamond_question_mark(string $str, string $replacement_char, bool $process_invalid_utf8_chars): string

Replace the diamond question mark (�) and invalid-UTF8 chars with the replacement.

EXAMPLE: `UTF8::replace_diamond_question_mark('中文空白�', ''); // '中文空白'`

**Parameters:**
- `string $str

The input string

` - `string $replacement_char

The replacement character.

` - `bool $process_invalid_utf8_chars

Convert invalid UTF-8 chars

` **Return:** - `string

A string without diamond question marks (�).
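
As a rough illustration of the first half of this behavior, the replacement character U+FFFD is the byte sequence `EF BF BD` in UTF-8, so it can be swapped out with a plain byte replacement. The helper name `sketch_replace_diamond` is hypothetical, and the sketch deliberately ignores invalid UTF-8 bytes (the `$process_invalid_utf8_chars` part).

```php
<?php
// Hypothetical helper (not the library's code): replace U+FFFD with $replacement.
// Invalid UTF-8 byte sequences are NOT handled in this sketch.
function sketch_replace_diamond(string $str, string $replacement = ''): string
{
    return str_replace("\xEF\xBF\xBD", $replacement, $str);
}

echo sketch_replace_diamond("中文空白\xEF\xBF\xBD"), "\n"; // 中文空白
```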

` -------- ## rtrim(string $str, string|null $chars): string Strip whitespace or other characters from the end of a UTF-8 string. EXAMPLE: UTF8::rtrim('-ABC-中文空白- '); // '-ABC-中文空白-' **Parameters:** - `string $str

The string to be trimmed.

` - `string|null $chars

Optional characters to be stripped.

` **Return:** - `string

A string with unwanted characters stripped from the right.
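
A UTF-8 aware right trim can be sketched with PCRE. The helper name `sketch_rtrim` is hypothetical, and treating "whitespace" as whatever `\s` matches is an assumption of this sketch, not a statement about the library's internals.

```php
<?php
// Hypothetical helper (not the library's code): strip trailing characters with PCRE.
function sketch_rtrim(string $str, ?string $chars = null): string
{
    if ($chars === '') {
        return $str; // nothing to strip
    }

    // Assumption: with no $chars given, "whitespace" means whatever \s matches.
    $class = $chars === null ? '\s' : '[' . preg_quote($chars, '/') . ']';

    // \z anchors at the very end of the string; /u treats the input as UTF-8.
    return (string) preg_replace('/' . $class . '+\z/u', '', $str);
}

echo sketch_rtrim('-ABC-中文空白- '), "|\n"; // -ABC-中文空白-|
echo sketch_rtrim('中文空白--', '-'), "|\n"; // 中文空白|
```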

` -------- ## showSupport(bool $useEcho): string|void WARNING: Print native UTF-8 support (libs) by default, e.g. for debugging. **Parameters:** - `bool $useEcho` **Return:** - `string|void` -------- ## single_chr_html_encode(string $char, bool $keep_ascii_chars, string $encoding): string Converts a UTF-8 character to an HTML numbered entity like `&#123;`. EXAMPLE: `UTF8::single_chr_html_encode('κ'); // '&#954;'` **Parameters:** - `string $char

The Unicode character to be encoded as numbered entity.

` - `bool $keep_ascii_chars

Set to true to keep ASCII chars.

` - `string $encoding [optional]

Set the charset for e.g. "mb_" function

` **Return:** - `string

The HTML numbered entity for the given character.
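
To make the conversion concrete, here is a minimal sketch that decodes a single UTF-8 character to its code point by hand and formats it as a decimal entity. The function name `sketch_chr_html_encode` is hypothetical, the input is assumed to be one valid UTF-8 character, and the `$encoding` parameter is not modeled.

```php
<?php
// Hypothetical helper (not the library's code). Assumes $char holds one valid
// UTF-8 character; only the first character is looked at.
function sketch_chr_html_encode(string $char, bool $keepAscii = false): string
{
    if ($char === '') {
        return '';
    }

    $bytes = array_values(unpack('C*', $char));
    $first = $bytes[0];

    if ($first < 0x80) { // single-byte ASCII
        return $keepAscii ? $char[0] : '&#' . $first . ';';
    }

    // The lead byte tells us the sequence length and carries the top payload bits.
    if ($first >= 0xF0) {
        $len = 4;
        $cp = $first & 0x07;
    } elseif ($first >= 0xE0) {
        $len = 3;
        $cp = $first & 0x0F;
    } else {
        $len = 2;
        $cp = $first & 0x1F;
    }

    // Each continuation byte contributes its low 6 bits.
    for ($i = 1; $i < $len; $i++) {
        $cp = ($cp << 6) | (($bytes[$i] ?? 0) & 0x3F);
    }

    return '&#' . $cp . ';';
}

echo sketch_chr_html_encode('κ'), "\n"; // &#954;
```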

` -------- ## spaces_to_tabs(string $str, int $tab_length): string **Parameters:** - `string $str` - `int $tab_length` **Return:** - `string` -------- ## str_camelize(string $str, string $encoding, bool $clean_utf8, string|null $lang, bool $try_to_keep_the_string_length): string Returns a camelCase version of the string. Trims surrounding spaces, capitalizes letters following digits, spaces, dashes and underscores, and removes spaces, dashes, as well as underscores. **Parameters:** - `string $str

The input string.

` - `string $encoding [optional]

Default: 'UTF-8'

` - `bool $clean_utf8 [optional]

Remove non UTF-8 chars from the string.

` - `string|null $lang [optional]

Set the language for special cases: az, el, lt, tr

` - `bool $try_to_keep_the_string_length [optional]

true === try to keep the string length: e.g. ẞ -> ß

` **Return:** - `string` -------- ## str_capitalize_name(string $str): string Returns the string with the first letter of each word capitalized, except for when the word is a name which shouldn't be capitalized. **Parameters:** - `string $str` **Return:** - `string

A string with $str capitalized.

` -------- ## str_contains(string $haystack, string $needle, bool $case_sensitive): bool Returns true if the string contains $needle, false otherwise. By default the comparison is case-sensitive, but can be made insensitive by setting $case_sensitive to false. **Parameters:** - `string $haystack

The input string.

` - `string $needle

Substring to look for.

` - `bool $case_sensitive [optional]

Whether or not to enforce case-sensitivity. Default: true

` **Return:** - `bool

Whether or not $haystack contains $needle.

` -------- ## str_contains_all(string $haystack, array $needles, bool $case_sensitive): bool Returns true if the string contains all $needles, false otherwise. By default the comparison is case-sensitive, but can be made insensitive by setting $case_sensitive to false. **Parameters:** - `string $haystack

The input string.

` - `array $needles

Substrings to look for.

` - `bool $case_sensitive [optional]

Whether or not to enforce case-sensitivity. Default: true

` **Return:** - `bool

Whether or not $haystack contains all $needles.

` -------- ## str_contains_any(string $haystack, array $needles, bool $case_sensitive): bool Returns true if the string contains any $needles, false otherwise. By default the comparison is case-sensitive, but can be made insensitive by setting $case_sensitive to false. **Parameters:** - `string $haystack

The input string.

` - `array $needles

Substrings to look for.

` - `bool $case_sensitive [optional]

Whether or not to enforce case-sensitivity. Default: true

` **Return:** - `bool

Whether or not $haystack contains any of the $needles.
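
The difference between the "all" and "any" variants can be sketched as two small loops over a shared containment check. All function names below are hypothetical. Treating an empty needle or an empty needle list as "not contained" is an assumption of this sketch, as is the use of the PCRE `/iu` modifiers for the case-insensitive path.

```php
<?php
// Hypothetical helpers (not the library's code).
function sketch_contains(string $haystack, string $needle, bool $caseSensitive = true): bool
{
    if ($needle === '') {
        return false; // assumption: an empty needle never matches
    }
    if ($caseSensitive) {
        return strpos($haystack, $needle) !== false; // byte search is safe for valid UTF-8
    }

    // /i with /u gives Unicode-aware case-insensitive matching.
    return preg_match('/' . preg_quote($needle, '/') . '/iu', $haystack) === 1;
}

function sketch_contains_all(string $haystack, array $needles, bool $caseSensitive = true): bool
{
    if ($needles === []) {
        return false; // assumption
    }
    foreach ($needles as $needle) {
        if (!sketch_contains($haystack, (string) $needle, $caseSensitive)) {
            return false; // one miss is enough to fail
        }
    }

    return true;
}

function sketch_contains_any(string $haystack, array $needles, bool $caseSensitive = true): bool
{
    foreach ($needles as $needle) {
        if (sketch_contains($haystack, (string) $needle, $caseSensitive)) {
            return true; // one hit is enough to succeed
        }
    }

    return false;
}

var_dump(sketch_contains_all('fòô bàř', ['fòô', 'bàř'])); // bool(true)
var_dump(sketch_contains_any('fòô bàř', ['xyz', 'BÀŘ'])); // bool(false), case-sensitive
```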

` -------- ## str_dasherize(string $str, string $encoding): string Returns a lowercase and trimmed string separated by dashes. Dashes are inserted before uppercase characters (with the exception of the first character of the string), and in place of spaces as well as underscores. **Parameters:** - `string $str

The input string.

` - `string $encoding [optional]

Set the charset for e.g. "mb_" function

` **Return:** - `string` -------- ## str_delimit(string $str, string $delimiter, string $encoding, bool $clean_utf8, string|null $lang, bool $try_to_keep_the_string_length): string Returns a lowercase and trimmed string separated by the given delimiter. Delimiters are inserted before uppercase characters (with the exception of the first character of the string), and in place of spaces, dashes, and underscores. Alpha delimiters are not converted to lowercase. **Parameters:** - `string $str

The input string.

` - `string $delimiter

Sequence used to separate parts of the string.

` - `string $encoding [optional]

Set the charset for e.g. "mb_" function

` - `bool $clean_utf8 [optional]

Remove non UTF-8 chars from the string.

` - `string|null $lang [optional]

Set the language for special cases: az, el, lt, tr

` - `bool $try_to_keep_the_string_length [optional]

true === try to keep the string length: e.g. ẞ -> ß

` **Return:** - `string` -------- ## str_detect_encoding(string $str): false|string Optimized "mb_detect_encoding()"-function -> with support for UTF-16 and UTF-32. EXAMPLE: UTF8::str_detect_encoding('中文空白'); // 'UTF-8' UTF8::str_detect_encoding('Abc'); // 'ASCII' **Parameters:** - `string $str

The input string.

` **Return:** - `false|string

The detected string-encoding e.g. UTF-8 or UTF-16BE,
otherwise it will return false e.g. for BINARY or not detected encoding.

` -------- ## str_ends_with(string $haystack, string $needle): bool Check if the string ends with the given substring. EXAMPLE: UTF8::str_ends_with('BeginMiddleΚόσμε', 'Κόσμε'); // true UTF8::str_ends_with('BeginMiddleΚόσμε', 'κόσμε'); // false **Parameters:** - `string $haystack

The string to search in.

` - `string $needle

The substring to search for.

` **Return:** - `bool` -------- ## str_ends_with_any(string $str, string[] $substrings): bool Returns true if the string ends with any of $substrings, false otherwise. - case-sensitive **Parameters:** - `string $str

The input string.

` - `string[] $substrings

Substrings to look for.

` **Return:** - `bool

Whether or not $str ends with any of the $substrings.
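
A case-sensitive suffix check over several candidates can be sketched as follows. The name `sketch_ends_with_any` is hypothetical, and skipping empty candidates is an assumption of this sketch.

```php
<?php
// Hypothetical helper (not the library's code).
function sketch_ends_with_any(string $str, array $substrings): bool
{
    foreach ($substrings as $substring) {
        $len = strlen((string) $substring);
        // Comparing the last $len bytes is safe for valid UTF-8 input.
        if ($len > 0 && substr($str, -$len) === (string) $substring) {
            return true;
        }
    }

    return false;
}

var_dump(sketch_ends_with_any('BeginMiddleΚόσμε', ['foo', 'Κόσμε'])); // bool(true)
var_dump(sketch_ends_with_any('BeginMiddleΚόσμε', ['κόσμε']));        // bool(false)
```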

` -------- ## str_ensure_left(string $str, string $substring): string Ensures that the string begins with $substring. If it doesn't, it's prepended. **Parameters:** - `string $str

The input string.

` - `string $substring

The substring to add if not present.

` **Return:** - `string` -------- ## str_ensure_right(string $str, string $substring): string Ensures that the string ends with $substring. If it doesn't, it's appended. **Parameters:** - `string $str

The input string.

` - `string $substring

The substring to add if not present.

` **Return:** - `string` -------- ## str_humanize(string $str): string Capitalizes the first word of the string, replaces underscores with spaces, and strips '_id'. **Parameters:** - `string $str` **Return:** - `string` -------- ## str_iends_with(string $haystack, string $needle): bool Check if the string ends with the given substring, case-insensitive. EXAMPLE: UTF8::str_iends_with('BeginMiddleΚόσμε', 'Κόσμε'); // true UTF8::str_iends_with('BeginMiddleΚόσμε', 'κόσμε'); // true **Parameters:** - `string $haystack

The string to search in.

` - `string $needle

The substring to search for.

` **Return:** - `bool` -------- ## str_iends_with_any(string $str, string[] $substrings): bool Returns true if the string ends with any of $substrings, false otherwise. - case-insensitive **Parameters:** - `string $str

The input string.

` - `string[] $substrings

Substrings to look for.

` **Return:** - `bool

Whether or not $str ends with any of the $substrings.

` -------- ## str_insert(string $str, string $substring, int $index, string $encoding): string Inserts $substring into the string at the $index provided. **Parameters:** - `string $str

The input string.

` - `string $substring

String to be inserted.

` - `int $index

The index at which to insert the substring.

` - `string $encoding [optional]

Set the charset for e.g. "mb_" function

` **Return:** - `string` -------- ## str_ireplace(string|string[] $search, string|string[] $replacement, string|string[] $subject, int $count): string|string[] Case-insensitive and UTF-8 safe version of str_replace. EXAMPLE: UTF8::str_ireplace('lIzÆ', 'lise', 'Iñtërnâtiônàlizætiøn'); // 'Iñtërnâtiônàlisetiøn' **Parameters:** - `string|string[] $search

Every replacement with search array is performed on the result of previous replacement.

` - `string|string[] $replacement

The replacement.

` - `TStrIReplaceSubject $subject

If subject is an array, then the search and replace is performed with every entry of subject, and the return value is an array as well.

` - `int $count [optional]

The number of matched and replaced needles will be returned in count which is passed by reference.

` **Return:** - `string|string[]

A string or an array of replacements.

` -------- ## str_ireplace_beginning(string $str, string $search, string $replacement): string Replaces $search from the beginning of string with $replacement. **Parameters:** - `string $str

The input string.

` - `string $search

The string to search for.

` - `string $replacement

The replacement.

` **Return:** - `string

The string after the replacement.

` -------- ## str_ireplace_ending(string $str, string $search, string $replacement): string Replaces $search from the ending of string with $replacement. **Parameters:** - `string $str

The input string.

` - `string $search

The string to search for.

` - `string $replacement

The replacement.

` **Return:** - `string

The string after the replacement.

` -------- ## str_istarts_with(string $haystack, string $needle): bool Check if the string starts with the given substring, case-insensitive. EXAMPLE: UTF8::str_istarts_with('ΚόσμεMiddleEnd', 'Κόσμε'); // true UTF8::str_istarts_with('ΚόσμεMiddleEnd', 'κόσμε'); // true **Parameters:** - `string $haystack

The string to search in.

` - `string $needle

The substring to search for.

` **Return:** - `bool` -------- ## str_istarts_with_any(string $str, array $substrings): bool Returns true if the string begins with any of $substrings, false otherwise. - case-insensitive **Parameters:** - `string $str

The input string.

` - `array $substrings

Substrings to look for.

` **Return:** - `bool

Whether or not $str starts with any of the $substrings.

` -------- ## str_isubstr_after_first_separator(string $str, string $separator, string $encoding): string Gets the substring after the first occurrence of a separator. **Parameters:** - `string $str

The input string.

` - `string $separator

The string separator.

` - `string $encoding [optional]

Default: 'UTF-8'

` **Return:** - `string` -------- ## str_isubstr_after_last_separator(string $str, string $separator, string $encoding): string Gets the substring after the last occurrence of a separator. **Parameters:** - `string $str

The input string.

` - `string $separator

The string separator.

` - `string $encoding [optional]

Default: 'UTF-8'

` **Return:** - `string` -------- ## str_isubstr_before_first_separator(string $str, string $separator, string $encoding): string Gets the substring before the first occurrence of a separator. **Parameters:** - `string $str

The input string.

` - `string $separator

The string separator.

` - `string $encoding [optional]

Default: 'UTF-8'

` **Return:** - `string` -------- ## str_isubstr_before_last_separator(string $str, string $separator, string $encoding): string Gets the substring before the last occurrence of a separator. **Parameters:** - `string $str

The input string.

` - `string $separator

The string separator.

` - `string $encoding [optional]

Default: 'UTF-8'

` **Return:** - `string` -------- ## str_isubstr_first(string $str, string $needle, bool $before_needle, string $encoding): string Gets the substring after (or before via "$before_needle") the first occurrence of the "$needle". **Parameters:** - `string $str

The input string.

` - `string $needle

The string to look for.

` - `bool $before_needle [optional]

Default: false

` - `string $encoding [optional]

Default: 'UTF-8'

` **Return:** - `string` -------- ## str_isubstr_last(string $str, string $needle, bool $before_needle, string $encoding): string Gets the substring after (or before via "$before_needle") the last occurrence of the "$needle". **Parameters:** - `string $str

The input string.

` - `string $needle

The string to look for.

` - `bool $before_needle [optional]

Default: false

` - `string $encoding [optional]

Default: 'UTF-8'

` **Return:** - `string` -------- ## str_last_char(string $str, int $n, string $encoding): string Returns the last $n characters of the string. **Parameters:** - `string $str

The input string.

` - `int $n

Number of characters to retrieve from the end.

` - `string $encoding [optional]

Set the charset for e.g. "mb_" function

` **Return:** - `string` -------- ## str_limit(string $str, int $length, string $str_add_on, string $encoding): string Limit the number of characters in a string. **Parameters:** - `string $str

The input string.

` - `int $length [optional]

Default: 100

` - `string $str_add_on [optional]

Default: …

` - `string $encoding [optional]

Set the charset for e.g. "mb_" function

` **Return:** - `string` -------- ## str_limit_after_word(string $str, int $length, string $str_add_on, string $encoding): string Limit the number of characters in a string, but also after the next word. EXAMPLE: UTF8::str_limit_after_word('fòô bàř fòô', 8, ''); // 'fòô bàř' **Parameters:** - `string $str

The input string.

` - `int $length [optional]

Default: 100

` - `string $str_add_on [optional]

Default: …

` - `string $encoding [optional]

Set the charset for e.g. "mb_" function

` **Return:** - `string` -------- ## str_longest_common_prefix(string $str1, string $str2, string $encoding): string Returns the longest common prefix between the $str1 and $str2. **Parameters:** - `string $str1

The input string.

` - `string $str2

Second string for comparison.

` - `string $encoding [optional]

Set the charset for e.g. "mb_" function

` **Return:** - `string` -------- ## str_longest_common_substring(string $str1, string $str2, string $encoding): string Returns the longest common substring between the $str1 and $str2. In the case of ties, it returns that which occurs first. **Parameters:** - `string $str1` - `string $str2

Second string for comparison.

` - `string $encoding [optional]

Set the charset for e.g. "mb_" function

` **Return:** - `string

The longest common substring of $str1 and $str2.
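
One common way to compute a longest common substring is dynamic programming over character positions. The sketch below uses that approach under stated assumptions: the name `sketch_longest_common_substring` is hypothetical, the inputs are valid UTF-8, and ties are resolved in favor of the match that ends first in `$str1`. It is not taken from the library's source.

```php
<?php
// Hypothetical helper (not the library's code): classic O(n*m) dynamic programming.
function sketch_longest_common_substring(string $str1, string $str2): string
{
    // Split into UTF-8 characters so multi-byte characters count as one unit.
    $a = preg_split('//u', $str1, -1, PREG_SPLIT_NO_EMPTY) ?: [];
    $b = preg_split('//u', $str2, -1, PREG_SPLIT_NO_EMPTY) ?: [];

    $best = 0; // length of the best match so far
    $end = 0;  // index in $a just past the best match
    $prev = array_fill(0, count($b) + 1, 0);

    for ($i = 1; $i <= count($a); $i++) {
        $cur = array_fill(0, count($b) + 1, 0);
        for ($j = 1; $j <= count($b); $j++) {
            if ($a[$i - 1] === $b[$j - 1]) {
                // Extend the common run that ended at the previous characters.
                $cur[$j] = $prev[$j - 1] + 1;
                if ($cur[$j] > $best) { // strict ">" keeps the earliest match on ties
                    $best = $cur[$j];
                    $end = $i;
                }
            }
        }
        $prev = $cur;
    }

    return implode('', array_slice($a, $end - $best, $best));
}

echo sketch_longest_common_substring('fòôbàř', 'xxfòôyy'), "\n"; // fòô
```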

` -------- ## str_longest_common_suffix(string $str1, string $str2, string $encoding): string Returns the longest common suffix between the $str1 and $str2. **Parameters:** - `string $str1` - `string $str2

Second string for comparison.

` - `string $encoding [optional]

Set the charset for e.g. "mb_" function

` **Return:** - `string` -------- ## str_matches_pattern(string $str, string $pattern): bool Returns true if $str matches the supplied pattern, false otherwise. **Parameters:** - `string $str

The input string.

` - `string $pattern

Regex pattern to match against.

` **Return:** - `bool

Whether or not $str matches the pattern.

` -------- ## str_obfuscate(string $str, float $percent, string $obfuscateChar, string[] $keepChars): string Convert a string into an obfuscated string. EXAMPLE: UTF8::str_obfuscate('lars@moelleken.org', 0.5, '*', ['@', '.']); // e.g. "l***@m**lleke*.*r*" **Parameters:** - `string $str` - `float $percent` - `string $obfuscateChar` - `string[] $keepChars` **Return:** - `string

The obfuscated string.

` -------- ## str_offset_exists(string $str, int $offset, string $encoding): bool Returns whether or not a character exists at an index. Offsets may be negative to count from the last character in the string. Implements part of the ArrayAccess interface. **Parameters:** - `string $str

The input string.

` - `int $offset

The index to check.

` - `string $encoding [optional]

Set the charset for e.g. "mb_" function

` **Return:** - `bool

Whether or not the index exists.

` -------- ## str_offset_get(string $str, int $index, string $encoding): string Returns the character at the given index. Offsets may be negative to count from the last character in the string. Implements part of the ArrayAccess interface, and throws an OutOfBoundsException if the index does not exist. **Parameters:** - `string $str

The input string.

` - `int $index

The index from which to retrieve the char.

` - `string $encoding [optional]

Set the charset for e.g. "mb_" function

` **Return:** - `string

The character at the specified index.
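
The negative-offset and out-of-range behavior described above can be sketched like this. The name `sketch_offset_get` is hypothetical, the input is assumed to be valid UTF-8, and the exception message is made up for the illustration.

```php
<?php
// Hypothetical helper (not the library's code).
function sketch_offset_get(string $str, int $index): string
{
    $chars = preg_split('//u', $str, -1, PREG_SPLIT_NO_EMPTY) ?: [];
    $count = count($chars);

    if ($index < 0) {
        $index += $count; // -1 refers to the last character
    }
    if ($index < 0 || $index >= $count) {
        throw new OutOfBoundsException('No character exists at the index');
    }

    return $chars[$index];
}

echo sketch_offset_get('fòôbàř', 1), "\n";  // ò
echo sketch_offset_get('fòôbàř', -1), "\n"; // ř
```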

` -------- ## str_pad(string $str, int $pad_length, string $pad_string, int|string $pad_type, string $encoding): string Pad a UTF-8 string to a given length with another string. EXAMPLE: UTF8::str_pad('中文空白', 10, '_', STR_PAD_BOTH); // '___中文空白___' **Parameters:** - `string $str

The input string.

` - `int $pad_length

The length of return string.

` - `string $pad_string [optional]

String to use for padding the input string.

` - `int|string $pad_type [optional]

Can be STR_PAD_RIGHT (default), [or string "right"]
STR_PAD_LEFT [or string "left"] or
STR_PAD_BOTH [or string "both"]

` - `string $encoding [optional]

Default: 'UTF-8'

` **Return:** - `string

Returns the padded string.
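
Padding by character count rather than byte count can be sketched as below. The name `sketch_str_pad` is hypothetical, only the integer `STR_PAD_*` constants are handled (not the string aliases), and putting the extra character on the right when padding both sides unevenly mirrors PHP's native `str_pad()`, which is an assumption about the desired behavior.

```php
<?php
// Hypothetical helper (not the library's code): pad by characters, not bytes.
function sketch_str_pad(string $str, int $padLength, string $padString = ' ', int $padType = STR_PAD_RIGHT): string
{
    $split = static function (string $s): array {
        return preg_split('//u', $s, -1, PREG_SPLIT_NO_EMPTY) ?: [];
    };

    $missing = $padLength - count($split($str));
    $pad = $split($padString);
    if ($missing <= 0 || $pad === []) {
        return $str; // already long enough, or nothing to pad with
    }

    // Build exactly $n characters of padding by cycling through $pad.
    $make = static function (int $n) use ($pad): string {
        $out = '';
        for ($i = 0; $i < $n; $i++) {
            $out .= $pad[$i % count($pad)];
        }
        return $out;
    };

    switch ($padType) {
        case STR_PAD_LEFT:
            return $make($missing) . $str;
        case STR_PAD_BOTH:
            $left = intdiv($missing, 2); // any odd remainder goes to the right side
            return $make($left) . $str . $make($missing - $left);
        default:
            return $str . $make($missing);
    }
}

echo sketch_str_pad('中文空白', 10, '_', STR_PAD_BOTH), "\n"; // ___中文空白___
```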

` -------- ## str_pad_both(string $str, int $length, string $pad_str, string $encoding): string Returns a new string of a given length such that both sides of the string are padded. Alias for "UTF8::str_pad()" with a $pad_type of 'both'. **Parameters:** - `string $str` - `int $length

Desired string length after padding.

` - `string $pad_str [optional]

String used to pad, defaults to space. Default: ' '

` - `string $encoding [optional]

Set the charset for e.g. "mb_" function

` **Return:** - `string

The string with padding applied.

` -------- ## str_pad_left(string $str, int $length, string $pad_str, string $encoding): string Returns a new string of a given length such that the beginning of the string is padded. Alias for "UTF8::str_pad()" with a $pad_type of 'left'. **Parameters:** - `string $str` - `int $length

Desired string length after padding.

` - `string $pad_str [optional]

String used to pad, defaults to space. Default: ' '

` - `string $encoding [optional]

Set the charset for e.g. "mb_" function

` **Return:** - `string

The string with left padding.

` -------- ## str_pad_right(string $str, int $length, string $pad_str, string $encoding): string Returns a new string of a given length such that the end of the string is padded. Alias for "UTF8::str_pad()" with a $pad_type of 'right'. **Parameters:** - `string $str` - `int $length

Desired string length after padding.

` - `string $pad_str [optional]

String used to pad, defaults to space. Default: ' '

` - `string $encoding [optional]

Set the charset for e.g. "mb_" function

` **Return:** - `string

The string with right padding.

` -------- ## str_repeat(string $str, int $multiplier): string Repeat a string. EXAMPLE: UTF8::str_repeat("°~\xf0\x90\x28\xbc", 2); // '°~ð(¼°~ð(¼' **Parameters:** - `string $str

The string to be repeated.

` - `int $multiplier

Number of times the input string should be repeated.

multiplier has to be greater than or equal to 0. If the multiplier is set to 0, the function will return an empty string.

` **Return:** - `string

The repeated string.

` -------- ## str_replace_beginning(string $str, string $search, string $replacement): string Replaces $search from the beginning of string with $replacement. **Parameters:** - `string $str

The input string.

` - `string $search

The string to search for.

` - `string $replacement

The replacement.

` **Return:** - `string

A string after the replacements.

` -------- ## str_replace_ending(string $str, string $search, string $replacement): string Replaces $search from the ending of string with $replacement. **Parameters:** - `string $str

The input string.

` - `string $search

The string to search for.

` - `string $replacement

The replacement.

` **Return:** - `string

A string after the replacements.

` -------- ## str_replace_first(string $search, string $replace, string $subject): string Replace the first "$search"-term with the "$replace"-term. **Parameters:** - `string $search` - `string $replace` - `string $subject` **Return:** - `string` -------- ## str_replace_last(string $search, string $replace, string $subject): string Replace the last "$search"-term with the "$replace"-term. **Parameters:** - `string $search` - `string $replace` - `string $subject` **Return:** - `string` -------- ## str_shuffle(string $str, string $encoding): string Shuffles all the characters in the string. INFO: uses random algorithm which is weak for cryptography purposes EXAMPLE: UTF8::str_shuffle('fòô bàř fòô'); // 'àòôřb ffòô ' **Parameters:** - `string $str

The input string

` - `string $encoding [optional]

Set the charset for e.g. "mb_" function

` **Return:** - `string

The shuffled string.

` -------- ## str_slice(string $str, int $start, int|null $end, string $encoding): false|string Returns the substring beginning at $start, and up to, but not including the index specified by $end. If $end is omitted, the function extracts the remaining string. If $end is negative, it is computed from the end of the string. **Parameters:** - `string $str` - `int $start

Initial index from which to begin extraction.

` - `int|null $end [optional]

Index at which to end extraction. Default: null

` - `string $encoding [optional]

Set the charset for e.g. "mb_" function

` **Return:** - `false|string

The extracted substring.

If str is shorter than start characters long, FALSE will be returned.` -------- ## str_snakeize(string $str, string $encoding): string Convert a string to e.g.: "snake_case" **Parameters:** - `string $str` - `string $encoding [optional]

Set the charset for e.g. "mb_" function

` **Return:** - `string

A string in snake_case.

` -------- ## str_sort(string $str, bool $unique, bool $desc): string Sort all characters according to code points. EXAMPLE: UTF8::str_sort(' -ABC-中文空白- '); // ' ---ABC中文白空' **Parameters:** - `string $str

A UTF-8 string.

` - `bool $unique

Sort unique. If true, repeated characters are ignored.

` - `bool $desc

If true, will sort characters in reverse code point order.

` **Return:** - `string

A string of sorted characters.
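
Sorting by code point can be sketched by splitting the string into characters and sorting them as byte strings, since byte-wise comparison of valid UTF-8 follows code point order. The name `sketch_str_sort` is hypothetical and the input is assumed to be valid UTF-8.

```php
<?php
// Hypothetical helper (not the library's code).
function sketch_str_sort(string $str, bool $unique = false, bool $desc = false): string
{
    $chars = preg_split('//u', $str, -1, PREG_SPLIT_NO_EMPTY) ?: [];

    if ($unique) {
        $chars = array_unique($chars); // drop repeated characters
    }

    // SORT_STRING forces byte-wise string comparison (no numeric comparison of digits).
    if ($desc) {
        rsort($chars, SORT_STRING);
    } else {
        sort($chars, SORT_STRING);
    }

    return implode('', $chars);
}

echo sketch_str_sort('中文空白'), "\n";     // 中文白空
echo sketch_str_sort('banana', true), "\n"; // abn
```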

` -------- ## str_split(int|string $input, int $length, bool $clean_utf8, bool $try_to_use_mb_functions): string[] Convert a string to an array of unicode characters. EXAMPLE: UTF8::str_split('中文空白'); // array('中', '文', '空', '白') **Parameters:** - `int|string $input

The string or int to split into array.

` - `int $length [optional]

Max character length of each array element.

` - `bool $clean_utf8 [optional]

Remove non UTF-8 chars from the string.

` - `bool $try_to_use_mb_functions [optional]

Set to false, if you don't want to use "mb_substr"

` **Return:** - `string[]

An array containing chunks of chars from the input.
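
Splitting into character chunks can be sketched with PCRE's `/u` modifier and `array_chunk()`. The name `sketch_str_split` is hypothetical, the sketch accepts only strings, and returning an empty array for a `$length` below 1 is an assumption. The `$clean_utf8` and `$try_to_use_mb_functions` options are not modeled.

```php
<?php
// Hypothetical helper (not the library's code).
function sketch_str_split(string $input, int $length = 1): array
{
    if ($length < 1) {
        return []; // assumption: invalid chunk length yields no chunks
    }

    // An empty pattern with /u splits between UTF-8 characters.
    $chars = preg_split('//u', $input, -1, PREG_SPLIT_NO_EMPTY) ?: [];

    // Group the characters into chunks and glue each chunk back together.
    return array_map(
        static function (array $chunk): string {
            return implode('', $chunk);
        },
        array_chunk($chars, $length)
    );
}

print_r(sketch_str_split('中文空白', 2)); // ['中文', '空白']
```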

` -------- ## str_split_array(int[]|string[] $input, int $length, bool $clean_utf8, bool $try_to_use_mb_functions): string[][] Convert each string in an array to an array of Unicode character chunks. EXAMPLE: UTF8::str_split_array(['中文空白', 'test'], 2); // [['中文', '空白'], ['te', 'st']] **Parameters:** - `int[]|string[] $input

The string[] or int[] to split into array.

` - `int $length [optional]

Max character length of each array element.

` - `bool $clean_utf8 [optional]

Remove non UTF-8 chars from the string.

` - `bool $try_to_use_mb_functions [optional]

Set to false, if you don't want to use "mb_substr"

` **Return:** - `string[][]

An array containing chunks of the input.

` -------- ## str_split_pattern(string $str, string $pattern, int $limit): string[] Splits the string with the provided regular expression, returning an array of strings. An optional integer $limit will truncate the results. **Parameters:** - `string $str` - `string $pattern

The regex with which to split the string.

` - `int $limit [optional]

Maximum number of results to return. Default: -1 === no limit

` **Return:** - `string[]

An array of strings.

` -------- ## str_starts_with(string $haystack, string $needle): bool Check if the string starts with the given substring. EXAMPLE: UTF8::str_starts_with('ΚόσμεMiddleEnd', 'Κόσμε'); // true UTF8::str_starts_with('ΚόσμεMiddleEnd', 'κόσμε'); // false **Parameters:** - `string $haystack

The string to search in.

` - `string $needle

The substring to search for.

` **Return:** - `bool` -------- ## str_starts_with_any(string $str, array $substrings): bool Returns true if the string begins with any of $substrings, false otherwise. - case-sensitive **Parameters:** - `string $str

The input string.

` - `array $substrings

Substrings to look for.

` **Return:** - `bool

Whether or not $str starts with any of the $substrings.

` -------- ## str_substr_after_first_separator(string $str, string $separator, string $encoding): string Gets the substring after the first occurrence of a separator. **Parameters:** - `string $str

The input string.

` - `string $separator

The string separator.

` - `string $encoding [optional]

Default: 'UTF-8'

` **Return:** - `string` -------- ## str_substr_after_last_separator(string $str, string $separator, string $encoding): string Gets the substring after the last occurrence of a separator. **Parameters:** - `string $str

The input string.

` - `string $separator

The string separator.

` - `string $encoding [optional]

Default: 'UTF-8'

` **Return:** - `string` -------- ## str_substr_before_first_separator(string $str, string $separator, string $encoding): string Gets the substring before the first occurrence of a separator. **Parameters:** - `string $str

The input string.

` - `string $separator

The string separator.

` - `string $encoding [optional]

Default: 'UTF-8'

` **Return:** - `string` -------- ## str_substr_before_last_separator(string $str, string $separator, string $encoding): string Gets the substring before the last occurrence of a separator. **Parameters:** - `string $str

The input string.

` - `string $separator

The string separator.

` - `string $encoding [optional]

Default: 'UTF-8'

` **Return:** - `string` -------- ## str_substr_first(string $str, string $needle, bool $before_needle, string $encoding): string Gets the substring after (or before via "$before_needle") the first occurrence of the "$needle". **Parameters:** - `string $str

The input string.

` - `string $needle

The string to look for.

` - `bool $before_needle [optional]

Default: false

` - `string $encoding [optional]

Default: 'UTF-8'

` **Return:** - `string` -------- ## str_substr_last(string $str, string $needle, bool $before_needle, string $encoding): string Gets the substring after (or before via "$before_needle") the last occurrence of the "$needle". **Parameters:** - `string $str

The input string.

` - `string $needle

The string to look for.

` - `bool $before_needle [optional]

Default: false

` - `string $encoding [optional]

Default: 'UTF-8'

` **Return:** - `string` -------- ## str_surround(string $str, string $substring): string Surrounds $str with the given substring. **Parameters:** - `string $str` - `string $substring

The substring to add to both sides.

` **Return:** - `string

A string with the substring both prepended and appended.

` -------- ## str_titleize(string $str, array|string[]|null $ignore, string $encoding, bool $clean_utf8, string|null $lang, bool $try_to_keep_the_string_length, bool $use_trim_first, string|null $word_define_chars): string Returns a trimmed string with the first letter of each word capitalized. Also accepts an array, $ignore, allowing you to list words not to be capitalized. **Parameters:** - `string $str` - `array|string[]|null $ignore [optional]

An array of words not to capitalize or null. Default: null

` - `string $encoding [optional]

Default: 'UTF-8'

` - `bool $clean_utf8 [optional]

Remove non UTF-8 chars from the string.

` - `string|null $lang [optional]

Set the language for special cases: az, el, lt, tr

` - `bool $try_to_keep_the_string_length [optional]

true === try to keep the string length: e.g. ẞ -> ß

` - `bool $use_trim_first [optional]

true === trim the input string, first

` - `string|null $word_define_chars [optional]

A string of chars that will be used as whitespace separators between words.

` **Return:** - `string

The titleized string.

` -------- ## str_titleize_for_humans(string $str, array $ignore, string $encoding): string Returns a trimmed string in proper title case. Also accepts an array, $ignore, allowing you to list words not to be capitalized. Adapted from John Gruber's script. **Parameters:** - `string $str` - `array $ignore

An array of words not to capitalize.

` - `string $encoding [optional]

Set the charset for e.g. "mb_" function

` **Return:** - `string

The titleized string.

` -------- ## str_to_binary(string $str): false|string Get a binary representation of a specific string. EXAMPLE: UTF8::str_to_binary('😃'); // '11110000100111111001100010000011' **Parameters:** - `string $str

The input string.

` **Return:** - `false|string

false on error

` -------- ## str_to_lines(string $str, bool $remove_empty_values, int|null $remove_short_values): string[] **Parameters:** - `string $str` - `bool $remove_empty_values

Remove empty values.

` - `int|null $remove_short_values

The min. string length or null to disable

` **Return:** - `string[]` -------- ## str_to_words(string $str, string $char_list, bool $remove_empty_values, int|null $remove_short_values): string[] Convert a string into an array of words. EXAMPLE: UTF8::str_to_words('中文空白 oöäü#s', '#') // array('', '中文空白', ' ', 'oöäü#s', '') **Parameters:** - `string $str` - `string $char_list

Additional chars for the definition of "words".

` - `bool $remove_empty_values

Remove empty values.

` - `int|null $remove_short_values

The min. string length or null to disable

` **Return:** - `string[]` -------- ## str_truncate(string $str, int $length, string $substring, string $encoding): string Truncates the string to a given length. If $substring is provided, and truncating occurs, the string is further truncated so that the substring may be appended without exceeding the desired length. **Parameters:** - `string $str` - `int $length

Desired length of the truncated string.

` - `string $substring [optional]

The substring to append if it can fit. Default: ''

` - `string $encoding [optional]

Default: 'UTF-8'

` **Return:** - `string

A string after truncating.
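
The length arithmetic behind this kind of truncation can be sketched as follows: room for the suffix is reserved first, then the remaining characters are kept. The helper names are hypothetical, and dropping a suffix that cannot fit within `$length` is an assumption of this sketch rather than documented library behavior.

```php
<?php
// Hypothetical helpers (not the library's code).
function sketch_chars(string $s): array
{
    // Split valid UTF-8 into an array of characters.
    return preg_split('//u', $s, -1, PREG_SPLIT_NO_EMPTY) ?: [];
}

function sketch_truncate(string $str, int $length, string $substring = ''): string
{
    $chars = sketch_chars($str);
    $length = max(0, $length);

    if ($length >= count($chars)) {
        return $str; // already short enough, nothing is appended
    }

    $extra = count(sketch_chars($substring));
    if ($extra > $length) {
        // Assumption: a suffix that cannot fit is dropped.
        return implode('', array_slice($chars, 0, $length));
    }

    // Keep fewer characters so that the suffix still fits within $length.
    return implode('', array_slice($chars, 0, $length - $extra)) . $substring;
}

echo sketch_truncate('Test fòô bàř', 8, '...'), "\n"; // Test ...
```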

` -------- ## str_truncate_safe(string $str, int $length, string $substring, string $encoding, bool $ignore_do_not_split_words_for_one_word): string Truncates the string to a given length, while ensuring that it does not split words. If $substring is provided, and truncating occurs, the string is further truncated so that the substring may be appended without exceeding the desired length. **Parameters:** - `string $str` - `int $length

Desired length of the truncated string.

` - `string $substring [optional]

The substring to append if it can fit. Default: ''

` - `string $encoding [optional]

Default: 'UTF-8'

` - `bool $ignore_do_not_split_words_for_one_word [optional]

Default: false

` **Return:** - `string

A string after truncating.

` -------- ## str_underscored(string $str): string Returns a lowercase and trimmed string separated by underscores. Underscores are inserted before uppercase characters (with the exception of the first character of the string), and in place of spaces as well as dashes. **Parameters:** - `string $str` **Return:** - `string

The underscored string.

` -------- ## str_upper_camelize(string $str, string $encoding, bool $clean_utf8, string|null $lang, bool $try_to_keep_the_string_length): string Returns an UpperCamelCase version of the supplied string. It trims surrounding spaces, capitalizes letters following digits, spaces, dashes and underscores, and removes spaces, dashes, underscores. **Parameters:** - `string $str

The input string.

` - `string $encoding [optional]

Default: 'UTF-8'

` - `bool $clean_utf8 [optional]

Remove non UTF-8 chars from the string.

` - `string|null $lang [optional]

Set the language for special cases: az, el, lt, tr

` - `bool $try_to_keep_the_string_length [optional]

true === try to keep the string length: e.g. ẞ -> ß

` **Return:** - `string

A string in UpperCamelCase.

` -------- ## str_word_count(string $str, int $format, string $char_list): int|string[] Get the number of words in a specific string. EXAMPLES: // format: 0 -> return only word count (int) // UTF8::str_word_count('中文空白 öäü abc#c'); // 4 UTF8::str_word_count('中文空白 öäü abc#c', 0, '#'); // 3 // format: 1 -> return words (array) // UTF8::str_word_count('中文空白 öäü abc#c', 1); // array('中文空白', 'öäü', 'abc', 'c') UTF8::str_word_count('中文空白 öäü abc#c', 1, '#'); // array('中文空白', 'öäü', 'abc#c') // format: 2 -> return words with offset (array) // UTF8::str_word_count('中文空白 öäü ab#c', 2); // array(0 => '中文空白', 5 => 'öäü', 9 => 'abc', 13 => 'c') UTF8::str_word_count('中文空白 öäü ab#c', 2, '#'); // array(0 => '中文空白', 5 => 'öäü', 9 => 'abc#c') **Parameters:** - `string $str

The input string.

` - `int $format [optional]

0 => return a number of words (default)
1 => return an array of words
2 => return an array of words with word-offset as key

` - `string $char_list [optional]

Additional chars that contains to words and do not start a new word.

` **Return:** - `int|string[]

The number of words in the string.

` -------- ## strcasecmp(string $str1, string $str2, string $encoding): int Case-insensitive string comparison. INFO: Case-insensitive version of UTF8::strcmp() EXAMPLE: UTF8::strcasecmp("iñtërnâtiôn\nàlizætiøn", "Iñtërnâtiôn\nàlizætiøn"); // 0 **Parameters:** - `string $str1

The first string.

` - `string $str2

The second string.

` - `string $encoding [optional]

Set the charset for e.g. "mb_" function

` **Return:** - `int < 0 if str1 is less than str2;
> 0 if str1 is greater than str2,
0 if they are equal` -------- ## strcmp(string $str1, string $str2): int Case-sensitive string comparison. EXAMPLE: UTF8::strcmp("iñtërnâtiôn\nàlizætiøn", "iñtërnâtiôn\nàlizætiøn"); // 0 **Parameters:** - `string $str1

The first string.

` - `string $str2

The second string.

` **Return:** - `int < 0 if str1 is less than str2
> 0 if str1 is greater than str2
0 if they are equal` -------- ## strcspn(string $str, string $char_list, int $offset, int|null $length, string $encoding): int Find length of initial segment not matching mask. **Parameters:** - `string $str` - `string $char_list` - `int $offset` - `int|null $length` - `string $encoding [optional]

Set the charset for e.g. "mb_" function

` **Return:** - `int` -------- ## string(int|int[]|string|string[] $intOrHex): string Create a UTF-8 string from code points. INFO: opposite to UTF8::codepoints() EXAMPLE: UTF8::string(array(246, 228, 252)); // 'öäü' **Parameters:** - `int[]|numeric-string[]|int|numeric-string $intOrHex

Integer or Hexadecimal codepoints.

` **Return:** - `string

A UTF-8 encoded string.

` -------- ## string_has_bom(string $str): bool Checks if string starts with "BOM" (Byte Order Mark Character) character. EXAMPLE: UTF8::string_has_bom("\xef\xbb\xbf foobar"); // true **Parameters:** - `string $str

The input string.

` **Return:** - `bool

true if the string has BOM at the start,
false otherwise

` -------- ## strip_tags(string $str, string|null $allowable_tags, bool $clean_utf8): string Strip HTML and PHP tags from a string + clean invalid UTF-8. EXAMPLE: UTF8::strip_tags("κόσμε\xa0\xa1"); // 'κόσμε' **Parameters:** - `string $str

The input string.

` - `string|null $allowable_tags [optional]

You can use the optional second parameter to specify tags which should not be stripped.

HTML comments and PHP tags are also stripped. This is hardcoded and can not be changed with allowable_tags.

`
  • bool $clean_utf8 [optional] <p>Remove non UTF-8 chars from the string.</p>

Return:

  • string <p>The stripped string.</p>

strip_whitespace(string $str): string

Strip all whitespace characters. This includes tabs and newline characters, as well as multibyte whitespace such as the thin space and ideographic space.

EXAMPLE: UTF8::strip_whitespace(' Ο συγγραφέας '); // 'Οσυγγραφέας'

Parameters:

  • string $str

Return:

  • string

stripos(string $haystack, string $needle, int $offset, string $encoding, bool $clean_utf8): false|int

Find the position of the first occurrence of a substring in a string, case-insensitive.

INFO: use UTF8::stripos_in_byte() for the byte position

EXAMPLE: UTF8::stripos('aσσb', 'ΣΣ'); // 1 (σσ == ΣΣ)

Parameters:

  • string $haystack <p>The string from which to get the position of the first occurrence of needle.</p>
  • string $needle <p>The string to find in haystack.</p>
  • int $offset [optional] <p>The position in haystack to start searching.</p>
  • string $encoding [optional] <p>Set the charset for e.g. "mb_" function</p>
  • bool $clean_utf8 [optional] <p>Remove non UTF-8 chars from the string.</p>

Return:

  • false|int Return the <strong>(int)</strong> numeric position of the first occurrence of needle in the haystack string,<br> or <strong>false</strong> if needle is not found

stripos_in_byte(string $haystack, string $needle, int $offset): false|int

Find the position of the first occurrence of a substring in a string, case-insensitive.

Parameters:

  • string $haystack <p> The string being checked. </p>
  • string $needle <p> The string to find in haystack. </p>
  • int $offset [optional] <p> The search offset. If it is not specified, 0 is used. </p>

Return:

  • false|int <p>The numeric position of the first occurrence of needle in the haystack string. If needle is not found, it returns false.</p>

stristr(string $haystack, string $needle, bool $before_needle, string $encoding, bool $clean_utf8): false|string

Returns all of haystack starting from and including the first occurrence of needle to the end.

EXAMPLE: $str = 'iñtërnâtiônàlizætiøn'; $search = 'NÂT';

UTF8::stristr($str, $search); // 'nâtiônàlizætiøn'
UTF8::stristr($str, $search, true); // 'iñtër'

Parameters:

  • string $haystack <p>The input string. Must be valid UTF-8.</p>
  • string $needle <p>The string to look for. Must be valid UTF-8.</p>
  • bool $before_needle [optional] <p> If <b>TRUE</b>, it returns the part of the haystack before the first occurrence of the needle (excluding the needle). </p>
  • string $encoding [optional] <p>Set the charset for e.g. "mb_" function</p>
  • bool $clean_utf8 [optional] <p>Remove non UTF-8 chars from the string.</p>

Return:

  • false|string <p>A sub-string,<br>or <strong>false</strong> if needle is not found.</p>

strlen(string $str, string $encoding, bool $clean_utf8): false|int

Get the string length, not the byte-length!

INFO: use UTF8::strwidth() for the string width

EXAMPLE: UTF8::strlen("Iñtërnâtiôn\xE9àlizætiøn"); // 20

Parameters:

  • string $str <p>The string being checked for length.</p>
  • string $encoding [optional] <p>Set the charset for e.g. "mb_" function</p>
  • bool $clean_utf8 [optional] <p>Remove non UTF-8 chars from the string.</p>

Return:

  • false|int <p> The number <strong>(int)</strong> of characters in the string $str having character encoding $encoding. (One multi-byte character counted as +1). <br> Can return <strong>false</strong>, if e.g. mbstring is not installed and we process invalid chars. </p>

strlen_in_byte(string $str): int

Get the string length in bytes.

Parameters:

  • string $str

Return:

  • int
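The difference between the character count and the byte count can be sketched with PHP's built-in functions. This is only a stand-in under the assumption that mbstring is installed and the source file is saved as UTF-8; it does not call the library itself, and the variable names are illustrative.

```php
<?php
// Stand-in using PHP built-ins, not the library's own code.
$word = 'fòôbàř';

// Counts code points, which is what UTF8::strlen() is documented to return.
$chars = mb_strlen($word, 'UTF-8');

// Counts raw bytes, which is what UTF8::strlen_in_byte() is documented to return.
$bytes = strlen($word);

echo $chars, ' characters, ', $bytes, " bytes\n"; // 6 characters, 10 bytes
```

Five of the six characters need two bytes in UTF-8, which is where the gap between 6 and 10 comes from.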

strnatcasecmp(string $str1, string $str2, string $encoding): int

Case-insensitive string comparisons using a "natural order" algorithm.

INFO: natural order version of UTF8::strcasecmp()

EXAMPLES:
UTF8::strnatcasecmp('2', '10Hello WORLD 中文空白!'); // -1
UTF8::strcasecmp('2Hello world 中文空白!', '10Hello WORLD 中文空白!'); // 1

UTF8::strnatcasecmp('10Hello world 中文空白!', '2Hello WORLD 中文空白!'); // 1
UTF8::strcasecmp('10Hello world 中文空白!', '2Hello WORLD 中文空白!'); // -1

Parameters:

  • string $str1 <p>The first string.</p>
  • string $str2 <p>The second string.</p>
  • string $encoding [optional] <p>Set the charset for e.g. "mb_" function</p>

Return:

  • int <strong>&lt; 0</strong> if str1 is less than str2<br> <strong>&gt; 0</strong> if str1 is greater than str2<br> <strong>0</strong> if they are equal

strnatcmp(string $str1, string $str2): int

String comparison using a "natural order" algorithm.

INFO: natural order version of UTF8::strcmp()

EXAMPLES:
UTF8::strnatcmp('2Hello world 中文空白!', '10Hello WORLD 中文空白!'); // -1
UTF8::strcmp('2Hello world 中文空白!', '10Hello WORLD 中文空白!'); // 1

UTF8::strnatcmp('10Hello world 中文空白!', '2Hello WORLD 中文空白!'); // 1
UTF8::strcmp('10Hello world 中文空白!', '2Hello WORLD 中文空白!'); // -1

Parameters:

  • string $str1 <p>The first string.</p>
  • string $str2 <p>The second string.</p>

Return:

  • int <strong>&lt; 0</strong> if str1 is less than str2;<br> <strong>&gt; 0</strong> if str1 is greater than str2;<br> <strong>0</strong> if they are equal
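To illustrate what "natural order" means in practice, here is a small sketch that uses PHP's native strcmp() and strnatcmp() as stand-ins for the UTF8:: versions. The file names are made up for the example.

```php
<?php
$a = '2Hello';
$b = '10Hello';

// Byte-wise comparison: '2' sorts after '1', so the result is positive.
$plain = strcmp($a, $b);

// Natural order: the leading numbers 2 and 10 are compared as numbers.
$natural = strnatcmp($a, $b);

// A typical use: sorting numbered names the way a person would.
$files = ['img12.png', 'img10.png', 'img2.png', 'img1.png'];
usort($files, 'strnatcmp');
// $files is now ['img1.png', 'img2.png', 'img10.png', 'img12.png']
```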

strncasecmp(string $str1, string $str2, int $len, string $encoding): int

Case-insensitive string comparison of the first n characters.

EXAMPLE: UTF8::strncasecmp("iñtërnâtiôn\nàlizætiøn321", "iñtërnâtiôn\nàlizætiøn123", 5); // 0

Parameters:

  • string $str1 <p>The first string.</p>
  • string $str2 <p>The second string.</p>
  • int $len <p>The length of strings to be used in the comparison.</p>
  • string $encoding [optional] <p>Set the charset for e.g. "mb_" function</p>

Return:

  • int <strong>&lt; 0</strong> if <i>str1</i> is less than <i>str2</i>;<br> <strong>&gt; 0</strong> if <i>str1</i> is greater than <i>str2</i>;<br> <strong>0</strong> if they are equal

strncmp(string $str1, string $str2, int $len, string $encoding): int

String comparison of the first n characters.

EXAMPLE: UTF8::strncmp("Iñtërnâtiôn\nàlizætiøn321", "Iñtërnâtiôn\nàlizætiøn123", 5); // 0

Parameters:

  • string $str1 <p>The first string.</p>
  • string $str2 <p>The second string.</p>
  • int $len <p>Number of characters to use in the comparison.</p>
  • string $encoding [optional] <p>Set the charset for e.g. "mb_" function</p>

Return:

  • int <strong>&lt; 0</strong> if <i>str1</i> is less than <i>str2</i>;<br> <strong>&gt; 0</strong> if <i>str1</i> is greater than <i>str2</i>;<br> <strong>0</strong> if they are equal

strpbrk(string $haystack, string $char_list): false|string

Search a string for any of a set of characters.

EXAMPLE: UTF8::strpbrk('-中文空白-', '白'); // '白-'

Parameters:

  • string $haystack <p>The string where char_list is looked for.</p>
  • string $char_list <p>This parameter is case-sensitive.</p>

Return:

  • false|string <p>The string starting from the character found, or false if it is not found.</p>

strpos(string $haystack, int|string $needle, int $offset, string $encoding, bool $clean_utf8): false|int

Find the position of the first occurrence of a substring in a string.

INFO: use UTF8::strpos_in_byte() for the byte position

EXAMPLE: UTF8::strpos('ABC-ÖÄÜ-中文空白-中文空白', '中'); // 8

Parameters:

  • string $haystack <p>The string from which to get the position of the first occurrence of needle.</p>
  • int|string $needle <p>The string to find in haystack.<br>Or a code point as int.</p>
  • int $offset [optional] <p>The search offset. If it is not specified, 0 is used.</p>
  • string $encoding [optional] <p>Set the charset for e.g. "mb_" function</p>
  • bool $clean_utf8 [optional] <p>Remove non UTF-8 chars from the string.</p>

Return:

  • false|int The <strong>(int)</strong> numeric position of the first occurrence of needle in the haystack string.<br> If needle is not found it returns false.

strpos_in_byte(string $haystack, string $needle, int $offset): false|int

Find the position of the first occurrence of a substring in a string.

Parameters:

  • string $haystack <p> The string being checked. </p>
  • string $needle <p> The string to find in haystack. </p>
  • int $offset [optional] <p> The search offset. If it is not specified, 0 is used. </p>

Return:

  • false|int <p>The numeric position of the first occurrence of needle in the haystack string. If needle is not found, it returns false.</p>
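The character position and the byte position of the same needle usually differ once multibyte characters precede it. A minimal sketch with PHP built-ins (assuming mbstring is available and the file is UTF-8 encoded) shows the two values for the haystack used in the strpos() example:

```php
<?php
$haystack = 'ABC-ÖÄÜ-中文空白-中文空白';

// Position counted in characters, as documented for UTF8::strpos().
$charPos = mb_strpos($haystack, '中', 0, 'UTF-8'); // 8

// Position counted in bytes, as documented for UTF8::strpos_in_byte().
// "ÖÄÜ" takes 6 bytes instead of 3, so the offset grows by 3.
$bytePos = strpos($haystack, '中'); // 11
```

A byte offset must only be passed to byte-oriented functions, and a character offset only to character-oriented ones; mixing them can cut a character in half.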

strrchr(string $haystack, string $needle, bool $before_needle, string $encoding, bool $clean_utf8): false|string

Find the last occurrence of a character in a string within another.

EXAMPLE: UTF8::strrchr('κόσμεκόσμε-äöü', 'κόσμε'); // 'κόσμε-äöü'

Parameters:

  • string $haystack <p>The string from which to get the last occurrence of needle.</p>
  • string $needle <p>The string to find in haystack</p>
  • bool $before_needle [optional] <p> Determines which portion of haystack this function returns. If set to true, it returns all of haystack from the beginning to the last occurrence of needle. If set to false, it returns all of haystack from the last occurrence of needle to the end. </p>
  • string $encoding [optional] <p>Set the charset for e.g. "mb_" function</p>
  • bool $clean_utf8 [optional] <p>Remove non UTF-8 chars from the string.</p>

Return:

  • false|string <p>The portion of haystack or false if needle is not found.</p>

strrev(string $str, string $encoding): string

Reverses characters order in the string.

EXAMPLE: UTF8::strrev('κ-öäü'); // 'üäö-κ'

Parameters:

  • string $str <p>The input string.</p>
  • string $encoding [optional] <p>Set the charset for e.g. "mb_" function</p>

Return:

  • string <p>The string with characters in the reverse sequence.</p>
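Why a dedicated strrev() is needed can be seen in a short sketch. The helper name reverse_utf8 is hypothetical and this is not necessarily how the library implements it; it only assumes PCRE with UTF-8 support, which PHP ships by default.

```php
<?php
// Hypothetical helper for illustration, not the library's implementation.
function reverse_utf8(string $str): string
{
    // The "u" modifier makes PCRE split on code points instead of bytes.
    $chars = preg_split('//u', $str, -1, PREG_SPLIT_NO_EMPTY);
    if ($chars === false) {
        return ''; // the input was not valid UTF-8
    }

    return implode('', array_reverse($chars));
}

$reversed = reverse_utf8('κ-öäü'); // 'üäö-κ'

// The native function reverses bytes, which breaks multibyte sequences.
$broken = strrev('κ-öäü');
```

This sketch reverses code points, so a base letter followed by a separate combining mark would still end up in the wrong order.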

strrichr(string $haystack, string $needle, bool $before_needle, string $encoding, bool $clean_utf8): false|string

Find the last occurrence of a character in a string within another, case-insensitive.

EXAMPLE: UTF8::strrichr('Aκόσμεκόσμε-äöü', 'aκόσμε'); // 'Aκόσμεκόσμε-äöü'

Parameters:

  • string $haystack <p>The string from which to get the last occurrence of needle.</p>
  • string $needle <p>The string to find in haystack.</p>
  • bool $before_needle [optional] <p> Determines which portion of haystack this function returns. If set to true, it returns all of haystack from the beginning to the last occurrence of needle. If set to false, it returns all of haystack from the last occurrence of needle to the end. </p>
  • string $encoding [optional] <p>Set the charset for e.g. "mb_" function</p>
  • bool $clean_utf8 [optional] <p>Remove non UTF-8 chars from the string.</p>

Return:

  • false|string <p>The portion of haystack or<br>false if needle is not found.</p>

strripos(string $haystack, int|string $needle, int $offset, string $encoding, bool $clean_utf8): false|int

Find the position of the last occurrence of a substring in a string, case-insensitive.

EXAMPLE: UTF8::strripos('ABC-ÖÄÜ-中文空白-中文空白', '中'); // 13

Parameters:

  • string $haystack <p>The string to look in.</p>
  • int|string $needle <p>The string to look for.</p>
  • int $offset [optional] <p>Number of characters to ignore in the beginning or end.</p>
  • string $encoding [optional] <p>Set the charset for e.g. "mb_" function</p>
  • bool $clean_utf8 [optional] <p>Remove non UTF-8 chars from the string.</p>

Return:

  • false|int <p>The <strong>(int)</strong> numeric position of the last occurrence of needle in the haystack string.<br>If needle is not found, it returns false.</p>

strripos_in_byte(string $haystack, string $needle, int $offset): false|int

Finds position of last occurrence of a string within another, case-insensitive.

Parameters:

  • string $haystack <p> The string from which to get the position of the last occurrence of needle. </p>
  • string $needle <p> The string to find in haystack. </p>
  • int $offset [optional] <p> The position in haystack to start searching. </p>

Return:

  • false|int <p>Return the numeric position of the last occurrence of needle in the haystack string, or false if needle is not found.</p>

strrpos(string $haystack, int|string $needle, int $offset, string $encoding, bool $clean_utf8): false|int

Find the position of the last occurrence of a substring in a string.

EXAMPLE: UTF8::strrpos('ABC-ÖÄÜ-中文空白-中文空白', '中'); // 13

Parameters:

  • string $haystack <p>The string being checked, for the last occurrence of needle</p>
  • int|string $needle <p>The string to find in haystack.<br>Or a code point as int.</p>
  • int $offset [optional] <p>May be specified to begin searching an arbitrary number of characters into the string. Negative values will stop searching at an arbitrary point prior to the end of the string. </p>
  • string $encoding [optional] <p>Set the charset.</p>
  • bool $clean_utf8 [optional] <p>Remove non UTF-8 chars from the string.</p>

Return:

  • false|int <p>The <strong>(int)</strong> numeric position of the last occurrence of needle in the haystack string.<br>If needle is not found, it returns false.</p>

strrpos_in_byte(string $haystack, string $needle, int $offset): false|int

Find the position of the last occurrence of a substring in a string.

Parameters:

  • string $haystack <p> The string being checked, for the last occurrence of needle. </p>
  • string $needle <p> The string to find in haystack. </p>
  • int $offset [optional] <p>May be specified to begin searching an arbitrary number of characters into the string. Negative values will stop searching at an arbitrary point prior to the end of the string. </p>

Return:

  • false|int <p>The numeric position of the last occurrence of needle in the haystack string. If needle is not found, it returns false.</p>

strspn(string $str, string $mask, int $offset, int|null $length, string $encoding): false|int

Finds the length of the initial segment of a string consisting entirely of characters contained within a given mask.

EXAMPLE: UTF8::strspn('iñtërnâtiônàlizætiøn', 'itñ'); // 3

Parameters:

  • string $str <p>The input string.</p>
  • string $mask <p>The mask of chars</p>
  • int $offset [optional]
  • int|null $length [optional]
  • string $encoding [optional] <p>Set the charset.</p>

Return:

  • false|int

strstr(string $haystack, string $needle, bool $before_needle, string $encoding, bool $clean_utf8): false|string

Returns part of haystack string from the first occurrence of needle to the end of haystack.

EXAMPLE: $str = 'iñtërnâtiônàlizætiøn'; $search = 'nât';

UTF8::strstr($str, $search); // 'nâtiônàlizætiøn'
UTF8::strstr($str, $search, true); // 'iñtër'

Parameters:

  • string $haystack <p>The input string. Must be valid UTF-8.</p>
  • string $needle <p>The string to look for. Must be valid UTF-8.</p>
  • bool $before_needle [optional] <p> If <b>TRUE</b>, strstr() returns the part of the haystack before the first occurrence of the needle (excluding the needle). </p>
  • string $encoding [optional] <p>Set the charset for e.g. "mb_" function</p>
  • bool $clean_utf8 [optional] <p>Remove non UTF-8 chars from the string.</p>

Return:

  • false|string <p>A sub-string,<br>or <strong>false</strong> if needle is not found.</p>

strstr_in_byte(string $haystack, string $needle, bool $before_needle): false|string

Finds first occurrence of a string within another.

Parameters:

  • string $haystack <p> The string from which to get the first occurrence of needle. </p>
  • string $needle <p> The string to find in haystack. </p>
  • bool $before_needle [optional] <p> Determines which portion of haystack this function returns. If set to true, it returns all of haystack from the beginning to the first occurrence of needle. If set to false, it returns all of haystack from the first occurrence of needle to the end. </p>

Return:

  • false|string <p>The portion of haystack, or false if needle is not found.</p>

strtocasefold(string $str, bool $full, bool $clean_utf8, string $encoding, string|null $lang, bool $lower): string

Unicode transformation for case-less matching.

EXAMPLE: UTF8::strtocasefold('ǰ◌̱'); // 'ǰ◌̱'

Parameters:

  • string $str <p>The input string.</p>
  • bool $full [optional] <p> <b>true</b>, replace full case folding chars (default)<br> <b>false</b>, use only limited static array [UTF8::$COMMON_CASE_FOLD] </p>
  • bool $clean_utf8 [optional] <p>Remove non UTF-8 chars from the string.</p>
  • string $encoding [optional] <p>Set the charset.</p>
  • string|null $lang [optional] <p>Set the language for special cases: az, el, lt, tr</p>
  • bool $lower [optional] <p>Use a lowercase string, otherwise use an uppercase string. Note: uppercase works better for some languages.</p>

Return:

  • string

strtolower(string $str, string $encoding, bool $clean_utf8, string|null $lang, bool $try_to_keep_the_string_length): string

Make a string lowercase.

EXAMPLE: UTF8::strtolower('DÉJÀ Σσς Iıİi'); // 'déjà σσς iıii'

Parameters:

  • string $str <p>The string being lowercased.</p>
  • string $encoding [optional] <p>Set the charset for e.g. "mb_" function</p>
  • bool $clean_utf8 [optional] <p>Remove non UTF-8 chars from the string.</p>
  • string|null $lang [optional] <p>Set the language for special cases: az, el, lt, tr</p>
  • bool $try_to_keep_the_string_length [optional] <p>true === try to keep the string length: e.g. ẞ -> ß</p>

Return:

  • string <p>String with all alphabetic characters converted to lowercase.</p>

strtoupper(string $str, string $encoding, bool $clean_utf8, string|null $lang, bool $try_to_keep_the_string_length): string

Make a string uppercase.

EXAMPLE: UTF8::strtoupper('Déjà Σσς Iıİi'); // 'DÉJÀ ΣΣΣ IIİI'

Parameters:

  • string $str <p>The string being uppercased.</p>
  • string $encoding [optional] <p>Set the charset.</p>
  • bool $clean_utf8 [optional] <p>Remove non UTF-8 chars from the string.</p>
  • string|null $lang [optional] <p>Set the language for special cases: az, el, lt, tr</p>
  • bool $try_to_keep_the_string_length [optional] <p>true === try to keep the string length: e.g. ẞ -> ß</p>

Return:

  • string <p>String with all alphabetic characters converted to uppercase.</p>

strtr(string $str, string|string[] $from, string|string[] $to): string

Translate characters or replace sub-strings.

EXAMPLE: $array = [ 'Hello' => '○●◎', '中文空白' => 'earth', ]; UTF8::strtr('Hello 中文空白', $array); // '○●◎ earth'

Parameters:

  • string $str <p>The string being translated.</p>
  • string|string[] $from <p>The characters or sub-strings to replace.</p>
  • string|string[] $to [optional] <p>The replacement characters or sub-strings.</p>

Return:

  • string <p>This function returns a copy of str, translating all occurrences of each character in "from" to the corresponding character in "to".</p>

strwidth(string $str, string $encoding, bool $clean_utf8): int

Return the width of a string.

INFO: use UTF8::strlen() for the number of characters

EXAMPLE: UTF8::strwidth("Iñtërnâtiôn\xE9àlizætiøn"); // 21

Parameters:

  • string $str <p>The input string.</p>
  • string $encoding [optional] <p>Set the charset for e.g. "mb_" function</p>
  • bool $clean_utf8 [optional] <p>Remove non UTF-8 chars from the string.</p>

Return:

  • int
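Width and length only diverge for characters that occupy more than one column, such as CJK ideographs. The sketch below uses mb_strwidth() and mb_strlen() from PHP's mbstring extension as stand-ins; that UTF8::strwidth() follows the same rules is an assumption here, not something this section states.

```php
<?php
$latin = 'abcd';
$cjk = '中文空白';

// For narrow characters, width and length are the same.
$latinLen = mb_strlen($latin, 'UTF-8');     // 4
$latinWidth = mb_strwidth($latin, 'UTF-8'); // 4

// CJK ideographs count as two columns each.
$cjkLen = mb_strlen($cjk, 'UTF-8');     // 4
$cjkWidth = mb_strwidth($cjk, 'UTF-8'); // 8
```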

substr(string $str, int $offset, int|null $length, string $encoding, bool $clean_utf8): false|string

Get part of a string.

EXAMPLE: UTF8::substr('中文空白', 1, 2); // '文空'

Parameters:

  • string $str <p>The string being checked.</p>
  • int $offset <p>The first position used in str.</p>
  • int|null $length [optional] <p>The maximum length of the returned string.</p>
  • string $encoding [optional] <p>Set the charset for e.g. "mb_" function</p>
  • bool $clean_utf8 [optional] <p>Remove non UTF-8 chars from the string.</p>

Return:

  • false|string <p>The portion of <i>str</i> specified by the <i>offset</i> and <i>length</i> parameters.</p><p>If <i>str</i> is shorter than <i>offset</i> characters long, <b>FALSE</b> will be returned.</p>
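The following sketch contrasts character offsets with byte offsets, using mb_substr() and substr() from PHP itself as stand-ins (mbstring and a UTF-8 source file are assumed). It is meant to show why the offsets in the substr() example above are counted in characters.

```php
<?php
$str = '中文空白';

// Offsets counted in characters, as in UTF8::substr('中文空白', 1, 2).
$byChars = mb_substr($str, 1, 2, 'UTF-8'); // '文空'

// The same slice in bytes: each of these characters is 3 bytes long.
$byBytes = substr($str, 3, 6); // '文空'

// Reusing the character offsets as byte offsets cuts inside a character.
$cut = substr($str, 1, 2);
```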

substr_compare(string $str1, string $str2, int $offset, int|null $length, bool $case_insensitivity, string $encoding): int

Binary-safe comparison of two strings from an offset, up to a length of characters.

EXAMPLE:
UTF8::substr_compare("○●◎\r", '●◎', 0, 2); // -1
UTF8::substr_compare("○●◎\r", '◎●', 1, 2); // 1
UTF8::substr_compare("○●◎\r", '●◎', 1, 2); // 0

Parameters:

  • string $str1 <p>The main string being compared.</p>
  • string $str2 <p>The secondary string being compared.</p>
  • int $offset [optional] <p>The start position for the comparison. If negative, it starts counting from the end of the string.</p>
  • int|null $length [optional] <p>The length of the comparison. The default value is the larger of the length of str2 and the length of str1 minus the offset.</p>
  • bool $case_insensitivity [optional] <p>If case_insensitivity is TRUE, comparison is case insensitive.</p>
  • string $encoding [optional] <p>Set the charset for e.g. "mb_" function</p>

Return:

  • int <strong>&lt; 0</strong> if str1 is less than str2;<br> <strong>&gt; 0</strong> if str1 is greater than str2,<br> <strong>0</strong> if they are equal

substr_count(string $haystack, string $needle, int $offset, int|null $length, string $encoding, bool $clean_utf8): false|int

Count the number of substring occurrences.

EXAMPLE: UTF8::substr_count('中文空白', '文空', 1, 2); // 1

Parameters:

  • string $haystack <p>The string to search in.</p>
  • string $needle <p>The substring to search for.</p>
  • int $offset [optional] <p>The offset where to start counting.</p>
  • int|null $length [optional] <p> The maximum length after the specified offset to search for the substring. It outputs a warning if the offset plus the length is greater than the haystack length. </p>
  • string $encoding [optional] <p>Set the charset for e.g. "mb_" function</p>
  • bool $clean_utf8 [optional] <p>Remove non UTF-8 chars from the string.</p>

Return:

  • false|int <p>This function returns the number of occurrences as an integer, or false if there isn't a string.</p>

substr_count_in_byte(string $haystack, string $needle, int $offset, int|null $length): false|int

Count the number of substring occurrences.

Parameters:

  • string $haystack <p> The string being checked. </p>
  • string $needle <p> The string being found. </p>
  • int $offset [optional] <p> The offset where to start counting </p>
  • int|null $length [optional] <p> The maximum length after the specified offset to search for the substring. It outputs a warning if the offset plus the length is greater than the haystack length. </p>

Return:

  • false|int <p>The number of times the needle substring occurs in the haystack string.</p>

substr_count_simple(string $str, string $substring, bool $case_sensitive, string $encoding): int

Returns the number of occurrences of $substring in the given string.

By default, the comparison is case-sensitive, but can be made insensitive by setting $case_sensitive to false.

Parameters:

  • string $str <p>The input string.</p>
  • string $substring <p>The substring to search for.</p>
  • bool $case_sensitive [optional] <p>Whether or not to enforce case-sensitivity. Default: true</p>
  • string $encoding [optional] <p>Set the charset for e.g. "mb_" function</p>

Return:

  • int

substr_ileft(string $haystack, string $needle): string

Removes a prefix ($needle) from the beginning of the string ($haystack), case-insensitive.

EXAMPLE:
UTF8::substr_ileft('ΚόσμεMiddleEnd', 'Κόσμε'); // 'MiddleEnd'
UTF8::substr_ileft('ΚόσμεMiddleEnd', 'κόσμε'); // 'MiddleEnd'

Parameters:

  • string $haystack <p>The string to search in.</p>
  • string $needle <p>The substring to search for.</p>

Return:

  • string <p>Return the sub-string.</p>

substr_in_byte(string $str, int $offset, int|null $length): false|string

Get part of a string, processed in bytes.

Parameters:

  • string $str <p>The string being checked.</p>
  • int $offset <p>The first position used in str.</p>
  • int|null $length [optional] <p>The maximum length of the returned string.</p>

Return:

  • false|string <p>The portion of <i>str</i> specified by the <i>offset</i> and <i>length</i> parameters.</p><p>If <i>str</i> is shorter than <i>offset</i> characters long, <b>FALSE</b> will be returned.</p>

substr_iright(string $haystack, string $needle): string

Removes a suffix ($needle) from the end of the string ($haystack), case-insensitive.

EXAMPLE:
UTF8::substr_iright('BeginMiddleΚόσμε', 'Κόσμε'); // 'BeginMiddle'
UTF8::substr_iright('BeginMiddleΚόσμε', 'κόσμε'); // 'BeginMiddle'

Parameters:

  • string $haystack <p>The string to search in.</p>
  • string $needle <p>The substring to search for.</p>

Return:

  • string <p>Return the sub-string.</p>

substr_left(string $haystack, string $needle): string

Removes a prefix ($needle) from the beginning of the string ($haystack).

EXAMPLE:
UTF8::substr_left('ΚόσμεMiddleEnd', 'Κόσμε'); // 'MiddleEnd'
UTF8::substr_left('ΚόσμεMiddleEnd', 'κόσμε'); // 'ΚόσμεMiddleEnd'

Parameters:

  • string $haystack <p>The string to search in.</p>
  • string $needle <p>The substring to search for.</p>

Return:

  • string <p>Return the sub-string.</p>
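The case-sensitive prefix removal described for substr_left() can be approximated in a few lines. The function remove_prefix below is a hypothetical helper written for this illustration; the library's own implementation may differ.

```php
<?php
// Hypothetical helper, not part of the library.
function remove_prefix(string $haystack, string $needle): string
{
    // A byte-wise prefix check is enough here: if a valid UTF-8 string starts
    // with the bytes of another valid UTF-8 string, the match ends on a
    // character boundary.
    if ($needle !== '' && strncmp($haystack, $needle, strlen($needle)) === 0) {
        return substr($haystack, strlen($needle));
    }

    return $haystack; // prefix not present: return the input unchanged
}

$stripped = remove_prefix('ΚόσμεMiddleEnd', 'Κόσμε');  // 'MiddleEnd'
$unchanged = remove_prefix('ΚόσμεMiddleEnd', 'κόσμε'); // 'ΚόσμεMiddleEnd'
```

The second call leaves the string untouched because the comparison is case-sensitive; the case-insensitive variant would need case folding on both sides first.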

substr_replace(string|string[] $str, string|string[] $replacement, int|int[] $offset, int|int[]|null $length, string $encoding): string|string[]

Replace text within a portion of a string.

EXAMPLE: UTF8::substr_replace(array('Iñtërnâtiônàlizætiøn', 'foo'), 'æ', 1); // array('Iæñtërnâtiônàlizætiøn', 'fæoo')

source: https://gist.github.com/stemar/8287074

Parameters:

  • string|string[] $str <p>The input string or an array of strings.</p>
  • string|string[] $replacement <p>The replacement string or an array of strings.</p>
  • int|int[] $offset <p> If start is positive, the replacing will begin at the start'th offset into string. <br><br> If start is negative, the replacing will begin at the start'th character from the end of string. </p>
  • int|int[]|null $length [optional] <p>If given and is positive, it represents the length of the portion of string which is to be replaced. If it is negative, it represents the number of characters from the end of string at which to stop replacing. If it is not given, then it will default to strlen( string ); i.e. end the replacing at the end of string. Of course, if length is zero then this function will have the effect of inserting replacement into string at the given start offset.</p>
  • string $encoding [optional] <p>Set the charset for e.g. "mb_" function</p>

Return:

  • string|string[] <p>The result string is returned. If string is an array then array is returned.</p>

substr_right(string $haystack, string $needle, string $encoding): string

Removes a suffix ($needle) from the end of the string ($haystack).

EXAMPLE:
UTF8::substr_right('BeginMiddleΚόσμε', 'Κόσμε'); // 'BeginMiddle'
UTF8::substr_right('BeginMiddleΚόσμε', 'κόσμε'); // 'BeginMiddleΚόσμε'

Parameters:

  • string $haystack <p>The string to search in.</p>
  • string $needle <p>The substring to search for.</p>
  • string $encoding [optional] <p>Set the charset for e.g. "mb_" function</p>

Return:

  • string <p>Return the sub-string.</p>

swapCase(string $str, string $encoding, bool $clean_utf8): string

Returns a case swapped version of the string.

EXAMPLE: UTF8::swapCase('déJÀ σσς iıII'); // 'DÉjà ΣΣΣ IIii'

Parameters:

  • string $str <p>The input string.</p>
  • string $encoding [optional] <p>Set the charset for e.g. "mb_" function</p>
  • bool $clean_utf8 [optional] <p>Remove non UTF-8 chars from the string.</p>

Return:

  • string <p>Each character's case swapped.</p>

symfony_polyfill_used(): bool

Checks whether symfony-polyfills are used.

Parameters: nothing

Return:

  • bool <p><strong>true</strong> if in use, <strong>false</strong> otherwise</p>

tabs_to_spaces(string $str, int $tab_length): string

Parameters:

  • string $str
  • int $tab_length

Return:

  • string

titlecase(string $str, string $encoding, bool $clean_utf8, string|null $lang, bool $try_to_keep_the_string_length): string

Converts the first character of each word in the string to uppercase and all other chars to lowercase.

Parameters:

  • string $str <p>The input string.</p>
  • string $encoding [optional] <p>Set the charset for e.g. "mb_" function</p>
  • bool $clean_utf8 [optional] <p>Remove non UTF-8 chars from the string.</p>
  • string|null $lang [optional] <p>Set the language for special cases: az, el, lt, tr</p>
  • bool $try_to_keep_the_string_length [optional] <p>true === try to keep the string length: e.g. ẞ -> ß</p>

Return:

  • string <p>A string with all characters of $str being title-cased.</p>

to_ascii(string $str, string $unknown, bool $strict): string

Convert a string into ASCII.

EXAMPLE: UTF8::to_ascii('déjà σσς iıii'); // 'deja sss iiii'

Parameters:

  • string $str <p>The input string.</p>
  • string $unknown [optional] <p>Character to use if a character is unknown. (default is ?)</p>
  • bool $strict [optional] <p>Use "transliterator_transliterate()" from PHP-Intl | WARNING: bad performance</p>

Return:

  • string

to_boolean(bool|float|int|string $str): bool

Parameters:

  • bool|float|int|string $str

Return:

  • bool

to_filename(string $str, bool $use_transliterate, string $fallback_char): string

Convert given string to safe filename (and keep string case).

Parameters:

  • string $str
  • bool $use_transliterate No transliteration, conversion etc. is done by default - unsafe characters are simply replaced with hyphen.
  • string $fallback_char

Return:

  • string

to_int(string $str): int|null

Returns the given string as an integer, or null if the string isn't numeric.

Parameters:

  • string $str

Return:

  • int|null <p>null if the string isn't numeric</p>
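The "integer or null" contract described above can be sketched with PHP's native `is_numeric()`. This is an illustration under the assumption that "numeric" means whatever `is_numeric()` accepts; `to_int_sketch` is a hypothetical name, not the library method.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: returns the string as an integer, or null when the
 * string is not numeric according to is_numeric() (an assumption).
 */
function to_int_sketch(string $str): ?int
{
    if (is_numeric($str)) {
        return (int) $str;
    }

    return null;
}

var_dump(to_int_sketch('42'));  // int(42)
var_dump(to_int_sketch('abc')); // NULL
```

Returning null instead of 0 lets the caller tell a non-numeric input apart from a genuine zero.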

to_iso8859(string|string[] $str): string|string[]

Convert a string into "ISO-8859"-encoding (Latin-1).

EXAMPLE: UTF8::to_utf8(UTF8::to_iso8859(' -ABC-中文空白- ')); // ' -ABC-????- '

Parameters:

  • TToIso8859 $str

Return:

  • string|string[]

to_string(float|int|object|string|null $input): string|null

Returns the given input as a string, or null if the input isn't int|float|string and does not implement the "__toString()" method.

Parameters:

  • float|int|object|string|null $input

Return:

  • string|null <p>null if the input isn't int|float|string and has no "__toString()" method</p>
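The rule stated above (scalars and objects with "__toString()" are converted, everything else yields null) can be sketched in plain PHP as follows. The name `to_string_sketch` is hypothetical and the code is an illustration, not the library's source.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: converts int, float, string and objects that
 * implement __toString() to a string; returns null for anything else.
 *
 * @param mixed $input
 */
function to_string_sketch($input): ?string
{
    if (is_string($input) || is_int($input) || is_float($input)) {
        return (string) $input;
    }

    if (is_object($input) && method_exists($input, '__toString')) {
        return (string) $input;
    }

    return null;
}

// An anonymous class that can be cast to a string.
$stringable = new class {
    public function __toString(): string
    {
        return 'fòô';
    }
};

var_dump(to_string_sketch(12));             // string(2) "12"
var_dump(to_string_sketch($stringable));    // string(5) "fòô"
var_dump(to_string_sketch(new stdClass())); // NULL
```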

to_utf8(string|string[] $str, bool $decode_html_entity_to_utf8): string|string[]

This function leaves UTF-8 characters alone, while converting almost all non-UTF8 to UTF8.

  • It decodes UTF-8 codepoints and Unicode escape sequences.
  • It assumes that the encoding of the original string is either WINDOWS-1252 or ISO-8859.
  • WARNING: It does not remove invalid UTF-8 characters, so you may need to use "UTF8::clean()" for this case.

EXAMPLE: UTF8::to_utf8(["\u0063\u0061\u0074"]); // array('cat')

Parameters:

  • TToUtf8 $str <p>Any string or array of strings.</p>
  • bool $decode_html_entity_to_utf8 <p>Set to true, if you need to decode html-entities.</p>

Return:

  • string|string[] <p>The UTF-8 encoded string</p>

to_utf8_string(string $str, bool $decode_html_entity_to_utf8): string

This function leaves UTF-8 characters alone, while converting almost all non-UTF8 to UTF8.

  • It decodes UTF-8 codepoints and Unicode escape sequences.
  • It assumes that the encoding of the original string is either WINDOWS-1252 or ISO-8859.
  • WARNING: It does not remove invalid UTF-8 characters, so you may need to use "UTF8::clean()" for this case.

EXAMPLE: UTF8::to_utf8_string("\u0063\u0061\u0074"); // 'cat'

Parameters:

  • string $str <p>Any string.</p>
  • bool $decode_html_entity_to_utf8 <p>Set to true, if you need to decode html-entities.</p>

Return:

  • string <p>The UTF-8 encoded string</p>

trim(string $str, string|null $chars): string

Strip whitespace or other characters from the beginning and end of a UTF-8 string.

INFO: This is slower than "trim()"

The native function could only be used if the string and chars contain nothing but 7-bit ASCII, but checking for ASCII costs more time than it would save here.

EXAMPLE: UTF8::trim(' -ABC-中文空白- '); // '-ABC-中文空白-'

Parameters:

  • string $str <p>The string to be trimmed</p>
  • string|null $chars [optional] <p>Optional characters to be stripped</p>

Return:

  • string <p>The trimmed string.</p>
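Why a byte-oriented "trim()" is not enough for multi-byte characters, and how a UTF-8 aware trim can work, can be sketched with a regular expression in UTF-8 mode. This is a minimal illustration, not the library's implementation; `trim_sketch` is a hypothetical name.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: strips whitespace (or the characters in $chars) from
 * both ends of a UTF-8 string using PCRE in UTF-8 mode (the "u" modifier).
 */
function trim_sketch(string $str, ?string $chars = null): string
{
    if ($chars === '') {
        return $str; // nothing to strip
    }

    // Default: whitespace as matched by \s; otherwise a class built from $chars.
    $class = $chars === null ? '\s' : preg_quote($chars, '/');

    // \z anchors at the very end of the string (unlike $, which also
    // matches before a trailing newline).
    $result = preg_replace('/^[' . $class . ']+|[' . $class . ']+\z/u', '', $str);

    // preg_replace() returns null on failure, e.g. for invalid UTF-8 input.
    return $result ?? $str;
}

echo trim_sketch(' -ABC-中文空白- '), PHP_EOL; // -ABC-中文空白-
echo trim_sketch('中ABC中', '中'), PHP_EOL;     // ABC
```

The second call shows the case that the native "trim()" cannot handle reliably: it treats its character list as single bytes, so a multi-byte character such as 中 would be matched byte by byte.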

ucfirst(string $str, string $encoding, bool $clean_utf8, string|null $lang, bool $try_to_keep_the_string_length): string

Makes string's first char uppercase.

EXAMPLE: UTF8::ucfirst('ñtërnâtiônàlizætiøn foo'); // 'Ñtërnâtiônàlizætiøn foo'

Parameters:

  • string $str <p>The input string.</p>
  • string $encoding [optional] <p>Set the charset for e.g. "mb_" function</p>
  • bool $clean_utf8 [optional] <p>Remove non UTF-8 chars from the string.</p>
  • string|null $lang [optional] <p>Set the language for special cases: az, el, lt, tr</p>
  • bool $try_to_keep_the_string_length [optional] <p>true === try to keep the string length: e.g. ẞ -> ß</p>

Return:

  • string <p>The resulting string with its first character in uppercase.</p>

ucwords(string $str, string[] $exceptions, string $char_list, string $encoding, bool $clean_utf8): string

Uppercases the first character of every word in the string.

EXAMPLE: UTF8::ucwords('iñt ërn âTi ônà liz æti øn'); // 'Iñt Ërn ÂTi Ônà Liz Æti Øn'

Parameters:

  • string $str <p>The input string.</p>
  • string[] $exceptions [optional] <p>Exclusion for some words.</p>
  • string $char_list [optional] <p>Additional characters that belong to words and do not start a new word.</p>
  • string $encoding [optional] <p>Set the charset.</p>
  • bool $clean_utf8 [optional] <p>Remove non UTF-8 chars from the string.</p>

Return:

  • string

urldecode(string $str, bool $multi_decode): string

Decodes a URL-encoded string, decoding HTML entities multiple times and fixing urlencoded-win1252-chars.

EXAMPLE: UTF8::urldecode('tes%20öäü%20\u00edtest+test'); // 'tes öäü ítest test'

e.g.:

  • 'test+test' => 'test test'
  • 'Düsseldorf' => 'Düsseldorf'
  • 'D%FCsseldorf' => 'Düsseldorf'
  • 'Düsseldorf' => 'Düsseldorf'
  • 'D%26%23xFC%3Bsseldorf' => 'Düsseldorf'
  • 'Düsseldorf' => 'Düsseldorf'
  • 'D%C3%BCsseldorf' => 'Düsseldorf'
  • 'D%C3%83%C2%BCsseldorf' => 'Düsseldorf'
  • 'D%25C3%2583%25C2%25BCsseldorf' => 'Düsseldorf'

Parameters:

  • string $str <p>The input string.</p>
  • bool $multi_decode <p>Decode as often as possible.</p>

Return:

  • string

utf8_decode(string $str, bool $keep_utf8_chars): string

Decodes a UTF-8 string to ISO-8859-1.

EXAMPLE: UTF8::encode('UTF-8', UTF8::utf8_decode('-ABC-中文空白-')); // '-ABC-????-'

Parameters:

  • string $str <p>The input string.</p>
  • bool $keep_utf8_chars

Return:

  • string

utf8_encode(string $str): string

Encodes an ISO-8859-1 string to UTF-8.

EXAMPLE: UTF8::utf8_decode(UTF8::utf8_encode('-ABC-中文空白-')); // '-ABC-中文空白-'

Parameters:

  • string $str <p>The input string.</p>

Return:

  • string

whitespace_table(): string[]

Returns an array with all utf8 whitespace characters.

Parameters: nothing

Return:

  • string[] An array with all known whitespace characters as values and the type of whitespace as keys as defined in above URL

words_limit(string $str, int $limit, string $str_add_on): string

Limit the number of words in a string.

EXAMPLE: UTF8::words_limit('fòô bàř fòô', 2, ''); // 'fòô bàř'

Parameters:

  • string $str <p>The input string.</p>
  • int $limit <p>The limit of words as integer.</p>
  • string $str_add_on <p>Replacement for the stripped string.</p>

Return:

  • string
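One way to limit a string to a number of words, as described above, is to match up to `$limit` runs of non-whitespace with a regular expression and append the add-on only when something was cut off. The sketch below is an illustration in plain PHP, not the library's code; `words_limit_sketch` is a hypothetical name, and returning an empty string for limits below 1 is an assumption.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: keeps at most $limit words of $str and appends
 * $strAddOn when the string had to be shortened.
 */
function words_limit_sketch(string $str, int $limit, string $strAddOn): string
{
    if ($str === '' || $limit < 1) {
        return ''; // assumption: nothing to keep
    }

    // PCRE repetition counts are capped at 65535.
    $limit = min($limit, 65535);

    // Optional leading whitespace, then up to $limit words with the
    // whitespace that follows each of them (possessive to avoid backtracking).
    $pattern = '/^\s*+(?:\S++\s*+){1,' . $limit . '}/u';

    if (preg_match($pattern, $str, $matches) !== 1) {
        return $str;
    }

    // The match covers the whole string: nothing was cut off.
    if (strlen($matches[0]) === strlen($str)) {
        return $str;
    }

    return rtrim($matches[0]) . $strAddOn;
}

echo words_limit_sketch('fòô bàř fòô', 2, ''), PHP_EOL;      // fòô bàř
echo words_limit_sketch('one two three', 1, '...'), PHP_EOL; // one...
```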

wordwrap(string $str, int $width, string $break, bool $cut): string

Wraps a string to a given number of characters

EXAMPLE: UTF8::wordwrap('Iñtërnâtiônàlizætiøn', 2, '<br>', true); // 'Iñ<br>të<br>rn<br>ât<br>iô<br>nà<br>li<br>zæ<br>ti<br>øn'

Parameters:

  • string $str <p>The input string.</p>
  • int $width [optional] <p>The column width.</p>
  • string $break [optional] <p>The line is broken using the optional break parameter.</p>
  • bool $cut [optional] <p> If the cut is set to true, the string is always wrapped at or before the specified width. So if you have a word that is larger than the given width, it is broken apart. </p>

Return:

  • string <p>The given string wrapped at the specified column.</p>

wordwrap_per_line(string $str, int $width, string $break, bool $cut, bool $add_final_break, string|null $delimiter): string

Line-wraps the string after $width, but splits the string by "$delimiter" first, so that each line is wrapped separately.

Parameters:

  • string $str <p>The input string.</p>
  • int $width [optional] <p>The column width.</p>
  • string $break [optional] <p>The line is broken using the optional break parameter.</p>
  • bool $cut [optional] <p> If the cut is set to true, the string is always wrapped at or before the specified width. So if you have a word that is larger than the given width, it is broken apart. </p>
  • bool $add_final_break [optional] <p> If this flag is true, then the method will add a $break at the end of the result string. </p>
  • string|null $delimiter [optional] <p> You can change the default behavior, where we split the string by newline. </p>

Return:

  • string
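The "split first, then wrap each line" idea described above can be sketched as follows. To stay self-contained, the sketch uses PHP's native, byte-based `wordwrap()` and therefore assumes ASCII input; the library method would use its own UTF-8 aware wrapping. The name `wordwrap_per_line_sketch` is hypothetical, and rejoining the lines with the delimiter is an assumption.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: splits $str by $delimiter, wraps every piece on its
 * own and joins the pieces with the delimiter again (ASCII input assumed,
 * because the native wordwrap() counts bytes).
 */
function wordwrap_per_line_sketch(
    string $str,
    int $width,
    string $break,
    bool $cut,
    bool $addFinalBreak,
    string $delimiter
): string {
    if ($delimiter === '') {
        $delimiter = "\n"; // explode() rejects an empty delimiter
    }

    $wrapped = [];
    foreach (explode($delimiter, $str) as $line) {
        // Each line is wrapped independently of its neighbours.
        $wrapped[] = wordwrap($line, $width, $break, $cut);
    }

    $result = implode($delimiter, $wrapped);

    return $addFinalBreak ? $result . $break : $result;
}

echo wordwrap_per_line_sketch("aaa bbb\nccc ddd", 3, "\n", false, false, "\n"), PHP_EOL;
// aaa
// bbb
// ccc
// ddd
```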

ws(): string[]

Returns an array of Unicode White Space characters.

Parameters: nothing

Return:

  • string[] <p>An array with numeric code point as key and White Space Character as value.</p>

Unit Test

1) Composer is a prerequisite for running the tests.

composer install

2) The tests can be executed by running this command from the root directory:

./vendor/bin/phpunit

Support

For support and donations please visit GitHub | Issues | PayPal | Patreon.

For status updates and release announcements please visit Releases | Twitter | Patreon.

For professional support please contact me.

Thanks

  • Thanks to GitHub (Microsoft) for hosting the code and a good infrastructure including Issues-Management, etc.
  • Thanks to IntelliJ as they make the best IDEs for PHP and they gave me an open source license for PhpStorm!
  • Thanks to Travis CI for being the most awesome, easiest continuous integration tool out there!
  • Thanks to StyleCI for the simple but powerful code style check.
  • Thanks to PHPStan && Psalm for really great Static analysis tools and for discovering bugs in the code!

License and Copyright

"Portable UTF8" is free software; you can redistribute it and/or modify it under the terms of either of the following licenses, at your option: the Apache license (see LICENSE-APACHE) or the GPL (see LICENSE-GPL).

Unicode handling requires tedious work to implement and maintain in the long run. As such, contributions such as unit tests, bug reports, comments or patches licensed under both licenses are very welcome.
