maha.cleaners.functions.replace_fn

Functions that operate on a string and replace specific characters with others.

Module Contents

Functions

connect_single_letter_word(text[, waw, feh, ...])

Connects a single-letter word with the letter following it.

arabic_numbers_to_english(text)

Converts Arabic numbers (ARABIC_NUMBERS) to the corresponding English numbers (ENGLISH_NUMBERS).

replace_expression(text, expression, with_value)

Matches characters from the input text using the given expression and replaces all matched characters with the given value.

replace(text, strings, with_value)

Replaces the input strings in the given text with the given value.

replace_except(text, strings, with_value)

Replaces everything except the input strings in the given text with the given value.

replace_pairs(text, keys, values)

Replaces each key with its corresponding value in the given text.

connect_single_letter_word(text, waw=None, feh=None, beh=None, lam=None, kaf=None, teh=None, all=None, custom_strings=None)

Connects a single-letter word with the letter following it.

Parameters
  • text (str) – Text to process

  • waw (bool, optional) – Connect WAW letter, by default None

  • feh (bool, optional) – Connect FEH letter, by default None

  • beh (bool, optional) – Connect BEH letter, by default None

  • lam (bool, optional) – Connect LAM letter, by default None

  • kaf (bool, optional) – Connect KAF letter, by default None

  • teh (bool, optional) – Connect TEH letter, by default None

  • all (bool, optional) – Connect all letters except the ones set to False, by default None

  • custom_strings (Union[List[str], str], optional) – Include any other string(s) to connect, by default None
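
The effect of these flags can be pictured with a small standard-library sketch. It is not maha's implementation: `connect_letters_sketch` and its `letters` parameter are hypothetical names, and the sketch assumes a "single-letter word" is one of the chosen letters standing alone between whitespace.

```python
import re


def connect_letters_sketch(text, letters):
    """Join each standalone letter in `letters` to the word after it.

    Hypothetical helper for illustration only; not part of maha.
    """
    if not letters:
        return text
    alternation = "|".join(re.escape(letter) for letter in letters)
    # (?<!\S) : the letter starts a word (start of text or after whitespace)
    # \s+     : the whitespace separating it from the next word
    # (?=\S)  : something must follow, so a trailing letter is left alone
    pattern = rf"(?<!\S)({alternation})\s+(?=\S)"
    return re.sub(pattern, r"\1", text)


# WAW written as a separate word is attached to the word that follows it.
print(connect_letters_sketch("عاش الملك و عاشت الملكة", ["و"]))
```

In this sketch a letter that ends another word is never touched, because the lookbehind requires the letter to begin a word.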

arabic_numbers_to_english(text)

Converts Arabic numbers (ARABIC_NUMBERS) to the corresponding English numbers (ENGLISH_NUMBERS).

Parameters
  • text (str) – Text to process

Returns

Processed text with all occurrences of Arabic numbers converted to English numbers

Return type

str

Examples

>>> from maha.cleaners.functions import arabic_numbers_to_english
>>> text = "٣"
>>> arabic_numbers_to_english(text)
'3'

>>> from maha.cleaners.functions import arabic_numbers_to_english
>>> text = "١٠"
>>> arabic_numbers_to_english(text)
'10'
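
A digit-for-digit conversion of this kind can be sketched with `str.translate` from the standard library. The sketch assumes the Arabic digits in question are the Arabic-Indic code points U+0660 to U+0669; the contents of maha's ARABIC_NUMBERS and ENGLISH_NUMBERS constants are not shown here and may differ. `arabic_digits_to_ascii` is a hypothetical name.

```python
# Assumption: "Arabic numbers" means the Arabic-Indic digits U+0660..U+0669.
ARABIC_DIGITS = "".join(chr(0x0660 + i) for i in range(10))
ASCII_DIGITS = "0123456789"

# One-to-one character mapping, built once and reused for every call.
DIGIT_TABLE = str.maketrans(ARABIC_DIGITS, ASCII_DIGITS)


def arabic_digits_to_ascii(text):
    """Hypothetical stand-in: map each Arabic-Indic digit to its ASCII digit."""
    return text.translate(DIGIT_TABLE)


print(arabic_digits_to_ascii("١٠"))
```

Characters outside the table, including Arabic letters and punctuation, pass through unchanged.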

replace_expression(text, expression, with_value)

Matches characters from the input text using the given expression and replaces all matched characters with the given value.

Parameters
  • text (str) – Text to process

  • expression (Expression | ExpressionGroup | str) – Pattern/Expression used to match characters from the text

  • with_value (Callable[..., str] | str) – Value to replace the matched characters with

Returns

Processed text

Return type

str

Examples

>>> from maha.cleaners.functions import replace_expression
>>> text = "ولقد حصلت على ١٠ من ١٠ "
>>> replace_expression(text, "١٠", "عشرة")
'ولقد حصلت على عشرة من عشرة '

>>> from maha.cleaners.functions import replace_expression
>>> text = "ذهبت الفتاه إلى المدرسه"
>>> replace_expression(text, "ه( |$)", "ة ").strip()
'ذهبت الفتاة إلى المدرسة'
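
Since `with_value` may also be a callable, the following standard-library sketch shows what a callable replacement looks like with `re.sub`. It assumes, without confirmation from the source, that a callable passed to `replace_expression` receives a match object in the same way; `fix_final_heh` is a hypothetical name. Reusing the captured delimiter avoids appending a space that then has to be stripped.

```python
import re

text = "ذهبت الفتاه إلى المدرسه"


def fix_final_heh(match):
    # group(1) is whatever followed the letter: a space, or "" at end of text.
    return "ة" + match.group(1)


# Plain re.sub is used here as a stand-in for replace_expression.
result = re.sub("ه( |$)", fix_final_heh, text)
print(result)
```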

replace(text, strings, with_value)

Replaces the input strings in the given text with the given value.

Parameters
  • text (str) – Text to process

  • strings (list[str] | str) – Strings to replace

  • with_value (str) – Value to replace the input strings with

Returns

Processed text

Return type

str

Examples

>>> from maha.cleaners.functions import replace
>>> text = "حصل الولد على معدل 50%"
>>> replace(text, "%", " بالمئة")
'حصل الولد على معدل 50 بالمئة'

>>> from maha.cleaners.functions import replace
>>> text = "ولقد كلف هذا المنتج 100 $"
>>> replace(text, "$", "دولار")
'ولقد كلف هذا المنتج 100 دولار'
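
The second example replaces "$", which is a special character in regular expressions, so the input strings are evidently matched literally. A minimal sketch of literal multi-string replacement, assuming nothing about maha's internals (`replace_sketch` is a hypothetical name), could look like this:

```python
import re


def replace_sketch(text, strings, with_value):
    """Hypothetical stand-in: replace every listed string with one value."""
    if isinstance(strings, str):
        strings = [strings]
    if not strings:
        return text
    # Longest first, so a longer string wins over one of its own prefixes.
    ordered = sorted(strings, key=len, reverse=True)
    # re.escape makes characters such as "$" and "%" match literally.
    pattern = "|".join(re.escape(s) for s in ordered)
    # A function replacement keeps any backslashes in with_value literal.
    return re.sub(pattern, lambda _match: with_value, text)


print(replace_sketch("cost: 100 $", "$", "USD"))
print(replace_sketch("a-b_c", ["-", "_"], " "))
```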

replace_except(text, strings, with_value)

Replaces everything except the input strings in the given text with the given value.

Parameters
  • text (str) – Text to process

  • strings (list[str] | str) – Strings to preserve (not replace)

  • with_value (str) – Value to replace all other strings with.

Returns

Processed text

Return type

str

Example

>>> from maha.cleaners.functions import replace_except
>>> from maha.constants import ARABIC_LETTERS, SPACE, EMPTY
>>> text = "لَيتَ الذينَ تُحبُّ العيّنَ رؤيَتهم"
>>> replace_except(text, ARABIC_LETTERS + [SPACE], EMPTY)
'ليت الذين تحب العين رؤيتهم'
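
For the common case where every preserved string is a single character, the behaviour can be sketched without maha as a per-character filter. Everything named here is an assumption: `replace_except_sketch` is hypothetical, and `letters_approx` is only a rough stand-in for ARABIC_LETTERS (code points U+0621 to U+064A without the tatweel, U+0640).

```python
def replace_except_sketch(text, keep, with_value):
    """Hypothetical stand-in: keep listed characters, replace all others.

    Only single-character entries in `keep` are handled, and each removed
    character is replaced individually.
    """
    keep = set(keep)
    return "".join(ch if ch in keep else with_value for ch in text)


# Rough approximation of the Arabic letters; diacritics start at U+064B.
letters_approx = [chr(cp) for cp in range(0x0621, 0x064B) if cp != 0x0640]

# Diacritics are not in the keep list, so they are dropped.
print(replace_except_sketch("تُحبُّ العيّنَ", letters_approx + [" "], ""))
```

Preserving multi-character strings needs a different approach, such as splitting the text on the preserved strings first; the sketch leaves that out.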

replace_pairs(text, keys, values)

Replaces each key with its corresponding value in the given text.

Parameters
  • text (str) – Text to process

  • keys (list[str]) – Strings to be replaced

  • values (list[str]) – Replacement strings, one per key and in the same order as keys

Returns

Processed text

Return type

str

Raises

ValueError – If keys and values are of different lengths

Example

>>> from maha.cleaners.functions import replace_pairs
>>> text = 'شلونك يا محمد؟'
>>> replace_pairs(text, ['شلونك'] , ['كيف حالك'])
'كيف حالك يا محمد؟'
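
The pairing of keys and values, together with the length check, can be sketched with the standard library as follows. This is an illustration under assumptions, not maha's code: `replace_pairs_sketch` is a hypothetical name, and doing the replacement in a single pass is a choice made here so that a value produced by one pair is never rewritten by another.

```python
import re


def replace_pairs_sketch(text, keys, values):
    """Hypothetical stand-in: replace each key with the value at its index."""
    if len(keys) != len(values):
        raise ValueError("keys and values must have the same length")
    if not keys:
        return text
    mapping = dict(zip(keys, values))
    # Longest keys first, so a longer key wins over one of its own prefixes.
    ordered = sorted(mapping, key=len, reverse=True)
    pattern = "|".join(re.escape(key) for key in ordered)
    # Single pass: each match is looked up once and never re-scanned.
    return re.sub(pattern, lambda match: mapping[match.group(0)], text)


print(replace_pairs_sketch("hello world", ["hello"], ["goodbye"]))
print(replace_pairs_sketch("ab", ["a", "b"], ["b", "a"]))
```

Calling `str.replace` once per pair would be simpler, but swapping pairs such as the second call would then collapse to a single letter.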