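Since PHP 4.4.0 and 5.1.0, three additional escape sequences are available for matching generic character types when UTF-8 mode is selected. They are \p{xx} (a character with the xx property), \P{xx} (a character without the xx property), and \X (an extended Unicode sequence).
The property names represented by xx above are limited to the Unicode general category properties. Each character has exactly one such property, specified by a two-letter abbreviation. For compatibility with Perl, negation can be specified by placing a circumflex (^) immediately after the opening brace {. For example, \p{^Lu} is the same as \P{Lu}.
If only one letter is specified with \p or \P, it includes all the properties that start with that letter. In this case the braces in the escape sequence are optional, so the following two forms are equivalent: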
\p{L} \pL
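The rules above can be checked with a short sketch. It assumes PHP 7.0 or later (for the "\u{...}" string escapes) and a PCRE build with Unicode property support; matches() is a hypothetical helper introduced only for this illustration, not a PHP function.

```php
<?php
// Hypothetical helper (not part of PHP): true when $pattern matches $subject.
function matches(string $pattern, string $subject): bool
{
    return preg_match($pattern, $subject) === 1;
}

$upper = "\u{00C9}"; // É, general category Lu
$lower = "\u{00E9}"; // é, general category Ll

// The u modifier selects UTF-8 mode, so each multibyte character is one unit.
var_dump(matches('/^\p{Lu}$/u', $upper));   // bool(true)
var_dump(matches('/^\p{Lu}$/u', $lower));   // bool(false)

// \P{Lu} and the Perl-compatible \p{^Lu} are two spellings of the same negation.
var_dump(matches('/^\P{Lu}$/u', $lower));   // bool(true)
var_dump(matches('/^\p{^Lu}$/u', $lower));  // bool(true)

// With a single-letter property the braces are optional.
var_dump(matches('/^\p{L}+$/u', "Stra\u{00DF}e")); // bool(true)
var_dump(matches('/^\pL+$/u', "Stra\u{00DF}e"));   // bool(true)
var_dump(matches('/^\pL+$/u', 'abc1'));            // bool(false), "1" is not a letter
```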
Property | Matches | Notes |
---|---|---|
C | Other | |
Cc | Control | |
Cf | Format | |
Cn | Unassigned | |
Co | Private use | |
Cs | Surrogate | |
L | Letter | Includes the following properties: Ll, Lm, Lo, Lt and Lu. |
Ll | Lower case letter | |
Lm | Modifier letter | |
Lo | Other letter | |
Lt | Title case letter | |
Lu | Upper case letter | |
M | Mark | |
Mc | Spacing mark | |
Me | Enclosing mark | |
Mn | Non-spacing mark | |
N | Number | |
Nd | Decimal number | |
Nl | Letter number | |
No | Other number | |
P | Punctuation | |
Pc | Connector punctuation | |
Pd | Dash punctuation | |
Pe | Close punctuation | |
Pf | Final punctuation | |
Pi | Initial punctuation | |
Po | Other punctuation | |
Ps | Open punctuation | |
S | Symbol | |
Sc | Currency symbol | |
Sk | Modifier symbol | |
Sm | Mathematical symbol | |
So | Other symbol | |
Z | Separator | |
Zl | Line separator | |
Zp | Paragraph separator | |
Zs | Space separator | |
Extended properties such as InMusicalSymbols are not supported by PCRE.
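The one-letter codes from the table can be combined inside a character class. The sketch below is one possible use, assuming PHP 7.0 or later; strip_punctuation() is a hypothetical name chosen for this example. It removes punctuation (P) and symbols (S), then collapses runs of separators (Z) into a single space.

```php
<?php
// Hypothetical helper: drop punctuation and symbols, normalise separators.
function strip_punctuation(string $text): string
{
    // P covers Pc, Pd, Pe, Pf, Pi, Po and Ps; S covers Sc, Sk, Sm and So.
    // preg_replace() returns null on failure (for example invalid UTF-8),
    // in which case the previous value is kept unchanged.
    $text = preg_replace('/[\p{P}\p{S}]+/u', '', $text) ?? $text;

    // Z covers Zl, Zp and Zs; collapse each run into one ASCII space.
    $text = preg_replace('/\p{Z}+/u', ' ', $text) ?? $text;

    return trim($text);
}

// U+00A1 is the inverted exclamation mark (Po), U+20AC is the euro sign (Sc).
echo strip_punctuation("\u{00A1}Hola, mundo!  \u{20AC}5"), "\n"; // Hola mundo 5
```

Using the one-letter codes keeps the pattern short; a stricter filter could list specific two-letter codes such as Po or Sc instead.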
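Specifying case-insensitive matching does not affect these escape sequences. For example, \p{Lu} always matches only upper case letters.
Sets of Unicode characters are defined as belonging to certain scripts. A character from one of these sets can be matched using a script name. For example, \p{Greek} matches a character in the Greek script, and \P{Han} matches a character that is not in the Han script.
Characters that are not part of an identified script are lumped together as Common. The current list of scripts is: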
Arabic | Armenian | Avestan | Balinese | Bamum | |
Batak | Bengali | Bopomofo | Brahmi | Braille | |
Buginese | Buhid | Canadian_Aboriginal | Carian | Chakma | |
Cham | Cherokee | Common | Coptic | Cuneiform | |
Cypriot | Cyrillic | Deseret | Devanagari | Egyptian_Hieroglyphs | |
Ethiopic | Georgian | Glagolitic | Gothic | Greek | |
Gujarati | Gurmukhi | Han | Hangul | Hanunoo | |
Hebrew | Hiragana | Imperial_Aramaic | Inherited | Inscriptional_Pahlavi | |
Inscriptional_Parthian | Javanese | Kaithi | Kannada | Katakana | |
Kayah_Li | Kharoshthi | Khmer | Lao | Latin | |
Lepcha | Limbu | Linear_B | Lisu | Lycian | |
Lydian | Malayalam | Mandaic | Meetei_Mayek | Meroitic_Cursive | |
Meroitic_Hieroglyphs | Miao | Mongolian | Myanmar | New_Tai_Lue | |
Nko | Ogham | Old_Italic | Old_Persian | Old_South_Arabian | |
Old_Turkic | Ol_Chiki | Oriya | Osmanya | Phags_Pa | |
Phoenician | Rejang | Runic | Samaritan | Saurashtra | |
Sharada | Shavian | Sinhala | Sora_Sompeng | Sundanese | |
Syloti_Nagri | Syriac | Tagalog | Tagbanwa | Tai_Le | |
Tai_Tham | Tai_Viet | Takri | Tamil | Telugu | |
Thaana | Thai | Tibetan | Tifinagh | Ugaritic | |
Vai | Yi |
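Script names from the list above are used with \p and \P in the same way as the general category codes. A minimal sketch, assuming PHP 7.0 or later; extract_script() is a hypothetical helper, and because it interpolates its first argument into the pattern it should only be given trusted script names.

```php
<?php
// Hypothetical helper: return every run of characters belonging to $script.
function extract_script(string $script, string $text): array
{
    // Builds a pattern such as /\p{Greek}+/u from the script name.
    preg_match_all('/\p{' . $script . '}+/u', $text, $m);
    return $m[0];
}

// Latin words mixed with Greek (αβγ), Han (漢字) and Cyrillic (мир) runs.
$text = "alpha \u{03B1}\u{03B2}\u{03B3} kanji \u{6F22}\u{5B57} mir \u{043C}\u{0438}\u{0440}";

print_r(extract_script('Greek', $text));    // one element: αβγ
print_r(extract_script('Han', $text));      // one element: 漢字
print_r(extract_script('Cyrillic', $text)); // one element: мир

// \P negates the script: true when the string contains no Han character.
var_dump(preg_match('/^\P{Han}*$/u', 'no ideographs here') === 1); // bool(true)
```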
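The \X escape matches any number of Unicode characters that form an extended Unicode sequence. \X is equivalent to (?>\PM\pM*)
That is, it matches one character without the "mark" property, followed by zero or more characters with the "mark" property, and treats the sequence as an atomic group (see below). Characters with the "mark" property are typically accents that affect the preceding character.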
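The difference between matching single characters and matching with \X shows up as soon as a combining mark is involved. The following sketch assumes PHP 7.0 or later; split_sequences() is a hypothetical helper written for this example. It splits the same string three ways and compares the counts.

```php
<?php
// Hypothetical helper: return all matches of $pattern in $text.
function split_sequences(string $text, string $pattern = '/\X/u'): array
{
    preg_match_all($pattern, $text, $m);
    return $m[0];
}

// "e" followed by U+0301 COMBINING ACUTE ACCENT (category Mn), then "a".
$text = "e\u{0301}a";

// A dot matches one character at a time, so the accent is counted separately.
var_dump(count(split_sequences($text, '/./u')));           // int(3)

// \X keeps the base character and its mark together.
var_dump(count(split_sequences($text)));                   // int(2)

// The spelled-out equivalent gives the same split for this input.
var_dump(count(split_sequences($text, '/(?>\PM\pM*)/u'))); // int(2)
```

This is useful when truncating or reversing user-visible text, where cutting between a base character and its accent would change what is displayed.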
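Matching characters by Unicode property is not fast, because PCRE has to search a data structure that contains data for more than 15,000 characters. That is why the traditional escape sequences such as \d and \w do not use Unicode properties in PCRE.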