architxt.utils
Functions
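| get_commit_batch_size(commit) | Derive the batch size for commit operations. |
| update_url_queries(url, **p) | Update query parameters in a URL. |
| windowed_shuffle(iterable, window_size=10) | Shuffle an Iterable by yielding items in a randomized order using a sliding window buffer. |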
- architxt.utils.get_commit_batch_size(commit)
Derive the batch size for commit operations.
- Parameters:
commit (Union[bool, int]) – Commit mode.
  - If True or False, returns the default BATCH_SIZE.
  - If a positive integer, returns that value as the batch size.
- Return type:
int
- Returns:
The batch size to use for chunked operations.
- Raises:
ValueError – If commit is a non-positive integer.
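The rules above can be sketched in a few lines. This is a minimal illustration of the documented behavior, not the library's source: the name `get_commit_batch_size_sketch` and the value `1024` for `BATCH_SIZE` are assumptions made for the example. One detail worth noting is that `bool` is a subclass of `int` in Python, so the boolean case has to be checked first.

```python
from typing import Union

# Hypothetical default; the real BATCH_SIZE value is not given in this page.
BATCH_SIZE = 1024


def get_commit_batch_size_sketch(commit: Union[bool, int]) -> int:
    """Illustrative re-statement of the documented rules."""
    # bool is a subclass of int, so test for it before any integer check.
    if isinstance(commit, bool):
        return BATCH_SIZE
    if commit <= 0:
        raise ValueError("commit must be a boolean or a positive integer")
    return commit
```

With this sketch, `get_commit_batch_size_sketch(True)` and `get_commit_batch_size_sketch(False)` both return the assumed default, `get_commit_batch_size_sketch(50)` returns `50`, and `get_commit_batch_size_sketch(0)` raises `ValueError`.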
- architxt.utils.update_url_queries(url, **p)
Update query parameters in a URL.
Merges existing query parameters with provided keyword arguments. If a parameter already exists, it will be overwritten.
>>> update_url_queries('https://example.com?foo=1', bar='2')
'https://example.com?foo=1&bar=2'
>>> update_url_queries('https://example.com?foo=1', foo='overwritten')
'https://example.com?foo=overwritten'
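One way to get the merge-and-overwrite behavior shown in the examples above is with the standard library's `urllib.parse`. The sketch below is an assumption about how such a helper could be written, not the library's actual implementation; the name `update_url_queries_sketch` is hypothetical.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def update_url_queries_sketch(url: str, **params: str) -> str:
    """Merge keyword arguments into the query string of a URL."""
    parts = urlsplit(url)
    # Collapse the existing query into a dict; repeated keys keep only
    # their last value in this simplified version.
    query = dict(parse_qsl(parts.query, keep_blank_values=True))
    # New values overwrite existing ones with the same name.
    query.update(params)
    return urlunsplit(parts._replace(query=urlencode(query)))
```

Because a dict preserves insertion order, existing parameters keep their position and new ones are appended, which matches the order seen in the two examples.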
- architxt.utils.windowed_shuffle(iterable, window_size=10)
Shuffle an Iterable by yielding items in a randomized order using a sliding window buffer.
- Parameters:
iterable (Iterable[T]) – Iterable to shuffle.
window_size (int) – Size of the sliding window buffer.
- Yield:
Shuffled items.
- Raises:
ValueError – If window_size is <= 1.
- Return type:
Generator[T, None, None]
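A common way to shuffle a stream with a sliding window buffer is to fill a buffer of `window_size` items, then, for each new item, yield a randomly chosen buffered item and put the new one in its place. The sketch below follows that approach; it is an illustration under that assumption and may differ from the library's actual implementation. The name `windowed_shuffle_sketch` and the `rng` parameter are hypothetical additions for the example.

```python
import random
from collections.abc import Generator, Iterable
from typing import Optional, TypeVar

T = TypeVar("T")


def windowed_shuffle_sketch(
    iterable: Iterable[T],
    window_size: int = 10,
    rng: Optional[random.Random] = None,  # hypothetical: allows seeded runs
) -> Generator[T, None, None]:
    """Yield items in a locally randomized order using a fixed-size buffer."""
    if window_size <= 1:
        raise ValueError("window_size must be greater than 1")
    rng = rng or random.Random()
    buffer: list[T] = []
    for item in iterable:
        if len(buffer) < window_size:
            # Still filling the window; nothing is yielded yet.
            buffer.append(item)
            continue
        # Emit a random buffered item and replace it with the new one.
        index = rng.randrange(window_size)
        yield buffer[index]
        buffer[index] = item
    # Input exhausted: flush what remains in random order.
    rng.shuffle(buffer)
    yield from buffer
```

Memory use stays bounded by `window_size` regardless of input length, at the cost of a weaker shuffle than loading everything into memory: an item can only be emitted after it has entered the buffer, so it never appears much earlier than its original position. Since the sketch is a generator function, its `window_size` check runs when iteration starts rather than at call time.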