PostgreSQL Source Code git master
Data Structures | |
| class | Codepoint |
Functions | |
| print_record (codepoint, letter) | |
| is_mark_to_remove (codepoint) | |
| is_plain_letter (codepoint) | |
| is_mark (codepoint) | |
| is_letter_with_marks (codepoint, table) | |
| is_letter (codepoint, table) | |
| get_plain_letter (codepoint, table) | |
| is_ligature (codepoint, table) | |
| get_plain_letters (codepoint, table) | |
| parse_cldr_latin_ascii_transliterator (latinAsciiFilePath) | |
| special_cases () | |
| main (args) | |
Variables | |
| stdout | |
| tuple | PLAIN_LETTER_RANGES |
| tuple | COMBINING_MARK_RANGES |
| parser = argparse.ArgumentParser(description='This script builds unaccent.rules on standard output when given the contents of UnicodeData.txt and Latin-ASCII.xml given as arguments.') | |
| help | |
| type | |
| str | |
| required | |
| True | |
| dest | |
| action | |
| args = parser.parse_args() | |
| generate_unaccent_rules.get_plain_letter (codepoint, table) |
Return the base codepoint without marks. If this codepoint has more than one combining character, do a recursive lookup on the table to find out its plain base letter.
Definition at line 131 of file generate_unaccent_rules.py.
References fb(), get_plain_letter(), is_letter_with_marks(), is_plain_letter(), and len.
Referenced by get_plain_letter(), get_plain_letters(), and main().
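The recursive lookup described for get_plain_letter() can be illustrated with a minimal sketch. It is not the script's code: the `Codepoint` dataclass, the `is_ascii_letter` helper, and the `base_letter` function are hypothetical stand-ins, and "plain" is assumed here to mean ASCII Latin letters only.

```python
from dataclasses import dataclass


@dataclass
class Codepoint:
    # Hypothetical stand-in for the script's Codepoint class.
    id: int
    general_category: str
    combining_ids: tuple = ()  # decomposition, base codepoint first


def is_ascii_letter(cp):
    # Assumption: only ASCII Latin letters count as "plain" in this sketch.
    return 0x41 <= cp.id <= 0x5A or 0x61 <= cp.id <= 0x7A


def base_letter(cp, table):
    """Recursively follow the first decomposition element to a plain letter."""
    if is_ascii_letter(cp):
        return cp
    if len(cp.combining_ids) < 2:
        raise ValueError(f"U+{cp.id:04X} has no plain base letter")
    return base_letter(table[cp.combining_ids[0]], table)


# U+1EA5 decomposes to U+00E2 + U+0301, and U+00E2 to U+0061 + U+0302.
table = {cp.id: cp for cp in (
    Codepoint(0x0061, "Ll"),
    Codepoint(0x0301, "Mn"),
    Codepoint(0x0302, "Mn"),
    Codepoint(0x00E2, "Ll", (0x0061, 0x0302)),
    Codepoint(0x1EA5, "Ll", (0x00E2, 0x0301)),
)}

result = base_letter(table[0x1EA5], table)
```

Here `result.id` is 0x61: two levels of decomposition are unwound before the plain letter "a" is reached.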
| generate_unaccent_rules.get_plain_letters (codepoint, table) |
Return a list of plain letters from a ligature.
Definition at line 154 of file generate_unaccent_rules.py.
References assert, fb(), get_plain_letter(), and is_ligature().
Referenced by main().
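As a rough illustration of expanding a ligature into plain letters, the sketch below works on a tiny hand-written decomposition map. The `DECOMP` dictionary and the functions `looks_like_ligature`, `plain`, and `ligature_letters` are hypothetical names, not part of the script, and the letter test uses Python's `unicodedata` module instead of the table the script receives.

```python
import unicodedata

# Hypothetical excerpt of decomposition data (codepoint -> parts).
DECOMP = {
    0x0133: (0x0069, 0x006A),  # LATIN SMALL LIGATURE IJ -> i, j
    0x01C6: (0x0064, 0x017E),  # LATIN SMALL LETTER DZ WITH CARON -> d, z with caron
    0x017E: (0x007A, 0x030C),  # z with caron -> z, combining caron
}


def is_ascii_letter(code):
    return 0x41 <= code <= 0x5A or 0x61 <= code <= 0x7A


def looks_like_ligature(code):
    """True when every part of the decomposition is itself a letter."""
    parts = DECOMP.get(code, ())
    return len(parts) > 1 and all(
        unicodedata.category(chr(p)).startswith("L") for p in parts
    )


def plain(code):
    """Strip marks by following the first decomposition element."""
    while not is_ascii_letter(code):
        code = DECOMP[code][0]
    return code


def ligature_letters(code):
    """Return the plain letters that make up a ligature, as a list."""
    if not looks_like_ligature(code):
        raise ValueError(f"U+{code:04X} is not a ligature in this sketch")
    return [chr(plain(p)) for p in DECOMP[code]]
```

With this data, `ligature_letters(0x01C6)` yields `["d", "z"]`: the second part is first reduced from "z with caron" to "z". U+017E itself is rejected, because one of its parts is a combining mark rather than a letter.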
| generate_unaccent_rules.is_letter (codepoint, table) |
Return true for letters with or without diacritical marks.
Definition at line 126 of file generate_unaccent_rules.py.
References is_letter_with_marks(), and is_plain_letter().
Referenced by is_ligature().
| generate_unaccent_rules.is_letter_with_marks (codepoint, table) |
Return true for letters combined with one or more marks.
Definition at line 103 of file generate_unaccent_rules.py.
References fb(), is_letter_with_marks(), is_mark(), is_plain_letter(), and len.
Referenced by get_plain_letter(), is_letter(), is_letter_with_marks(), and main().
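One way to picture the "letter combined with one or more marks" test is the sketch below. It assumes a hypothetical table that maps each codepoint to an `Entry` holding its general category and decomposition; `Entry`, `MARK_CATEGORIES`, and `has_marks` are illustrative names only, and plain letters are assumed to be ASCII.

```python
from typing import NamedTuple, Tuple


class Entry(NamedTuple):
    # Hypothetical record: general category plus decomposition (base first).
    category: str
    decomposition: Tuple[int, ...] = ()


MARK_CATEGORIES = {"Mn", "Mc", "Me"}


def is_ascii_letter(code):
    return 0x41 <= code <= 0x5A or 0x61 <= code <= 0x7A


def has_marks(code, table):
    """True if the codepoint is a plain letter plus at least one mark."""
    decomposition = table[code].decomposition
    if len(decomposition) < 2:
        return False  # nothing is combined with the base
    base, rest = decomposition[0], decomposition[1:]
    if not any(table[c].category in MARK_CATEGORIES for c in rest):
        return False  # combined with something other than marks
    # The base must be a plain letter, or itself a letter with marks.
    return is_ascii_letter(base) or has_marks(base, table)


table = {
    0x0061: Entry("Ll"),
    0x0065: Entry("Ll"),
    0x0069: Entry("Ll"),
    0x006A: Entry("Ll"),
    0x0301: Entry("Mn"),
    0x0302: Entry("Mn"),
    0x00E2: Entry("Ll", (0x0061, 0x0302)),
    0x00E9: Entry("Ll", (0x0065, 0x0301)),
    0x0133: Entry("Ll", (0x0069, 0x006A)),
    0x1EA5: Entry("Ll", (0x00E2, 0x0301)),
}
```

In this table, U+00E9 and U+1EA5 qualify, while U+0061 (no decomposition) and U+0133 (two letters, no mark) do not.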
| generate_unaccent_rules.is_ligature (codepoint, table) |
Return true for letters combined with letters.
Definition at line 150 of file generate_unaccent_rules.py.
References fb(), and is_letter().
Referenced by get_plain_letters(), and main().
| generate_unaccent_rules.is_mark (codepoint) |
Return true for diacritical marks (combining codepoints).
Definition at line 98 of file generate_unaccent_rules.py.
References fb().
Referenced by is_letter_with_marks(), and is_mark_to_remove().
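For readers unfamiliar with how combining codepoints are identified, the following sketch applies the same idea with Python's standard `unicodedata` module. This is a swapped-in technique for illustration: the script works on its own `Codepoint` objects, and `is_combining_mark` is a hypothetical helper. The Unicode general categories for marks are Mn, Mc, and Me.

```python
import unicodedata


def is_combining_mark(char):
    """True if the character's general category is a mark (Mn, Mc or Me)."""
    return unicodedata.category(char) in ("Mn", "Mc", "Me")


acute = "\u0301"                      # COMBINING ACUTE ACCENT
acute_is_mark = is_combining_mark(acute)
letter_is_mark = is_combining_mark("e")
```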
| generate_unaccent_rules.is_mark_to_remove (codepoint) |
Return true if this is a combining mark to remove.
Definition at line 79 of file generate_unaccent_rules.py.
References fb(), and is_mark().
Referenced by main().
| generate_unaccent_rules.is_plain_letter (codepoint) |
Return true if codepoint represents a "plain letter".
Definition at line 90 of file generate_unaccent_rules.py.
References fb().
Referenced by get_plain_letter(), is_letter(), and is_letter_with_marks().
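Since the module exposes a tuple named PLAIN_LETTER_RANGES, a "plain letter" test is plausibly a range-membership check. The sketch below shows that pattern with hypothetical values: `EXAMPLE_RANGES` covers only ASCII Latin letters and is an assumption, not the contents of the real tuple, and `in_ranges` is an illustrative helper.

```python
# Assumed example data: inclusive (begin, end) codepoint ranges.
EXAMPLE_RANGES = (
    (ord("a"), ord("z")),
    (ord("A"), ord("Z")),
)


def in_ranges(code, ranges):
    """True if code falls inside any inclusive (begin, end) range."""
    return any(begin <= code <= end for begin, end in ranges)


q_is_plain = in_ranges(ord("q"), EXAMPLE_RANGES)
e_acute_is_plain = in_ranges(0x00E9, EXAMPLE_RANGES)
```

The same helper would serve for a tuple of combining-mark ranges such as COMBINING_MARK_RANGES, given suitable range values.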
| generate_unaccent_rules.main (args) |
Definition at line 228 of file generate_unaccent_rules.py.
References fb(), get_plain_letter(), get_plain_letters(), is_letter_with_marks(), is_ligature(), is_mark_to_remove(), len, parse_cldr_latin_ascii_transliterator(), print_record(), and special_cases().
| generate_unaccent_rules.parse_cldr_latin_ascii_transliterator (latinAsciiFilePath) |
Parse the XML file and return a set of tuples (src, trg), where "src" is the original character and "trg" the substitute.
Definition at line 160 of file generate_unaccent_rules.py.
References assert, fb(), and len.
Referenced by main().
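A minimal sketch of collecting (src, trg) pairs from transliteration rules is shown below. The XML layout in `SAMPLE`, the rule syntax `X → Y ;`, and the names `RULE` and `parse_rules` are assumptions made for illustration; the real Latin-ASCII.xml is larger and its rules take more forms than this pattern handles.

```python
import re
import xml.etree.ElementTree as ET

# Assumed, heavily simplified stand-in for the CLDR file.
SAMPLE = """<supplementalData>
  <transforms>
    <transform>
      <tRule>
# comment line
Æ → AE ;
ß → ss ;
      </tRule>
    </transform>
  </transforms>
</supplementalData>"""

# One source character, an arrow (U+2192), then the replacement text.
RULE = re.compile(r"^(.) \u2192 (.+?) ;")


def parse_rules(xml_text):
    """Return a set of (src, trg) tuples found in tRule elements."""
    pairs = set()
    root = ET.fromstring(xml_text)
    for rule in root.iter("tRule"):
        for line in (rule.text or "").splitlines():
            match = RULE.match(line.strip())
            if match:
                pairs.add((match.group(1), match.group(2)))
    return pairs


pairs = parse_rules(SAMPLE)
```

Returning a set means a rule that appears twice in the input yields a single pair.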
| generate_unaccent_rules.print_record (codepoint, letter) |
Definition at line 59 of file generate_unaccent_rules.py.
Referenced by main().
| generate_unaccent_rules.special_cases () |
Return the special cases which are not handled by other methods.
Definition at line 213 of file generate_unaccent_rules.py.
References fb().
Referenced by main().
| generate_unaccent_rules.action |
Definition at line 287 of file generate_unaccent_rules.py.
| generate_unaccent_rules.args = parser.parse_args() |
Definition at line 288 of file generate_unaccent_rules.py.
| tuple generate_unaccent_rules.COMBINING_MARK_RANGES |
Definition at line 54 of file generate_unaccent_rules.py.
| generate_unaccent_rules.dest |
Definition at line 285 of file generate_unaccent_rules.py.
| generate_unaccent_rules.help |
Definition at line 285 of file generate_unaccent_rules.py.
| generate_unaccent_rules.parser = argparse.ArgumentParser(description='This script builds unaccent.rules on standard output when given the contents of UnicodeData.txt and Latin-ASCII.xml given as arguments.') |
Definition at line 284 of file generate_unaccent_rules.py.
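The entries help, type, str, required, True, dest, and action listed among the variables look like keyword arguments of `add_argument()` calls on this parser. A sketch of such a setup follows. The flag names `--unicode-data-file` and `--latin-ascii-file`, the help strings, and the destination `unicodeDataFilePath` are assumptions; only `latinAsciiFilePath` appears on this page, as a function parameter name.

```python
import argparse


def build_parser():
    # Illustrative parser; flag names and help texts are assumed.
    parser = argparse.ArgumentParser(
        description="Build unaccent rules on standard output (sketch)."
    )
    parser.add_argument(
        "--unicode-data-file",
        help="Path to a file in UnicodeData.txt format.",
        type=str,
        required=True,
        dest="unicodeDataFilePath",
    )
    parser.add_argument(
        "--latin-ascii-file",
        help="Path to the Latin-ASCII.xml transliterator file.",
        type=str,
        required=True,
        dest="latinAsciiFilePath",
    )
    return parser


# Parse an explicit argument list instead of sys.argv for the example.
example_args = build_parser().parse_args(
    ["--unicode-data-file", "UnicodeData.txt",
     "--latin-ascii-file", "Latin-ASCII.xml"]
)
```

Because `dest` is set explicitly, the parsed values are available as `example_args.unicodeDataFilePath` and `example_args.latinAsciiFilePath`.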
| tuple generate_unaccent_rules.PLAIN_LETTER_RANGES |
| generate_unaccent_rules.required |
Definition at line 285 of file generate_unaccent_rules.py.
| generate_unaccent_rules.stdout |
Definition at line 35 of file generate_unaccent_rules.py.
| generate_unaccent_rules.str |
Definition at line 285 of file generate_unaccent_rules.py.
| generate_unaccent_rules.True |
Definition at line 285 of file generate_unaccent_rules.py.
| generate_unaccent_rules.type |
Definition at line 285 of file generate_unaccent_rules.py.