architxt.utils

Functions

get_commit_batch_size(commit)

Derive the batch size for commit operations.

update_url_queries(url, **p)

Update query parameters in a URL.

windowed_shuffle(iterable[, window_size])

Shuffle an Iterable by yielding items in a randomized order using a sliding window buffer.

architxt.utils.get_commit_batch_size(commit)

Derive the batch size for commit operations.

Parameters:

commit (Union[bool, int]) – Commit mode.

  • If True or False, returns the default BATCH_SIZE.

  • If a positive integer, returns that value as the batch size.

Return type:

int

Returns:

The batch size to use for chunked operations.

Raises:

ValueError – If commit is a non-positive integer.
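The contract above can be sketched as follows. This is a minimal illustration, not the library's actual code: the function name get_commit_batch_size_sketch is hypothetical, and the value assigned to BATCH_SIZE is an assumption, since the real default is not stated on this page. The sketch checks for bool first because bool is a subclass of int in Python.

```python
from typing import Union

# Hypothetical default; the real BATCH_SIZE value is not given in this page.
BATCH_SIZE = 1024


def get_commit_batch_size_sketch(commit: Union[bool, int]) -> int:
    """Illustrative re-implementation of the documented behaviour."""
    # bool is a subclass of int, so it has to be handled before the int branch.
    if isinstance(commit, bool):
        return BATCH_SIZE
    if commit <= 0:
        raise ValueError("commit must be a bool or a positive integer")
    return commit


print(get_commit_batch_size_sketch(True))   # prints the default batch size
print(get_commit_batch_size_sketch(250))    # prints 250
```

Without the bool check, True would be treated as the integer 1 and False as 0, which would contradict the documented behaviour.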

architxt.utils.update_url_queries(url, **p)

Update query parameters in a URL.

Merges the URL's existing query parameters with the provided keyword arguments. If a parameter already exists in the URL, its value is overwritten.

>>> update_url_queries('https://example.com?foo=1', bar='2')
'https://example.com?foo=1&bar=2'
>>> update_url_queries('https://example.com?foo=1', foo='overwritten')
'https://example.com?foo=overwritten'

Parameters:
  • url (str) – The URL to update.

  • p – Query parameters to add or update.

Return type:

str

Returns:

The URL with updated query parameters.
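The merge-and-overwrite behaviour described above can be reproduced with the standard library's urllib.parse. This is a sketch under assumptions, not the library's implementation: the name update_url_queries_sketch is hypothetical, and the handling of repeated keys and fragments is a choice made for this example only.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def update_url_queries_sketch(url: str, **params: str) -> str:
    """Merge keyword arguments into the query string of a URL."""
    parts = urlsplit(url)
    # Collect existing parameters; a dict keeps only the last value of a repeated key.
    query = dict(parse_qsl(parts.query, keep_blank_values=True))
    # Keyword arguments overwrite existing parameters with the same name.
    query.update(params)
    return urlunsplit(parts._replace(query=urlencode(query)))


print(update_url_queries_sketch("https://example.com?foo=1", bar="2"))
# https://example.com?foo=1&bar=2
print(update_url_queries_sketch("https://example.com?foo=1", foo="overwritten"))
# https://example.com?foo=overwritten
```

Because the sketch stores parameters in a dict, a URL that repeats a key (for example a=1&a=2) keeps only the last value. Whether the real function behaves the same way is not stated here.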

architxt.utils.windowed_shuffle(iterable, window_size=10)

Shuffle an Iterable by yielding items in a randomized order using a sliding window buffer.

Parameters:
  • iterable (Iterable[T]) – Iterable to shuffle.

  • window_size (int) – Size of the sliding window buffer.

Yields:

Shuffled items.

Raises:

ValueError – If window_size is <= 1.

Return type:

Generator[T, None, None]
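One common way to build a sliding-window shuffle is to keep a fixed-size buffer, emit a randomly chosen buffered item each time a new item arrives, and shuffle whatever remains at the end. The sketch below follows that approach. It is an assumption about how such a function could work, not the library's code, and the name windowed_shuffle_sketch is hypothetical.

```python
import random
from collections.abc import Iterable, Iterator
from typing import TypeVar

T = TypeVar("T")


def windowed_shuffle_sketch(iterable: Iterable[T], window_size: int = 10) -> Iterator[T]:
    """Yield items in a locally randomized order using a bounded buffer."""
    if window_size <= 1:
        raise ValueError("window_size must be greater than 1")

    buffer: list[T] = []
    for item in iterable:
        if len(buffer) < window_size:
            # Fill the window before emitting anything.
            buffer.append(item)
            continue
        # Emit a random buffered item and put the new item in its slot.
        idx = random.randrange(window_size)
        yield buffer[idx]
        buffer[idx] = item

    # Input exhausted: emit the remaining items in random order.
    random.shuffle(buffer)
    yield from buffer


shuffled = list(windowed_shuffle_sketch(range(20), window_size=5))
print(sorted(shuffled) == list(range(20)))  # True: every item appears exactly once
```

In this sketch, memory use is bounded by window_size, so it works on streams that do not fit in memory. The trade-off is that the order is only locally random: an item cannot be emitted much earlier than its input position, although it may be delayed. Because the sketch is a generator function, the ValueError is raised when iteration starts, not when the function is called.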

architxt.utils.T = TypeVar(T)

Type: TypeVar

Invariant TypeVar.