Commit 5f2769d

Increasing num shards allowed (#388)
* Removed test on too many shards
1 parent cfaa491 commit 5f2769d

2 files changed: 1 addition and 32 deletions

ciphercore-base/src/graphs.rs

Lines changed: 1 addition & 14 deletions
@@ -2450,20 +2450,7 @@ impl Graph {
 /// Adds a node that computes sharding of a given table according to a given sharding config.
 /// Sharding config contains names of the columns whose hashed values are used for sharding.
 /// The size of each shard (i.e., the number of rows) and the number of shards is given in the sharding config.
-/// The number of shards should be smaller than 700.
-///
-///
-/// If some resulting shards don't have `shard_size` elements, they're padded with zeros to reach this size.
-/// If the size of some shards exceeds `shard_size`, sharding fails.
-///
-/// To choose these parameters, consult [the following paper](http://wwwmayr.informatik.tu-muenchen.de/personen/raab/publ/balls.pdf).
-/// Note that for large shard sizes and small number of shards, it holds that
-///
-/// `shard_size = num_input_rows / num_shards + alpha * sqrt(2 * num_input_rows / num_shards * log(num_shards))`.
-///
-/// With `alpha = 2`, it is possible to achieve failure probability 2^(-40) if `num_shards < 700` and `shard_size > 2^17`.
-///
-///
+
 /// Each shard is accompanied by a Boolean mask indicating whether a corresponding row stems from the input table or padded (1 if a row comes from input).
 /// The output is given in the form of a tuple of `(mask, shard)`, where `mask` is a binary array and `shard` is a table, i.e., named tuple.
 ///
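
The doc comment removed in this hunk gave a rule of thumb for choosing `shard_size`. As a rough illustration, it could be evaluated as in the sketch below. The function `estimate_shard_size` is hypothetical and not part of CipherCore, the use of the natural logarithm is an assumption (the comment does not state the base), and the comment only claims the formula for large shard sizes and a small number of shards.

```rust
/// Hypothetical helper (not a CipherCore API): evaluates the rule of thumb
/// from the removed doc comment,
/// shard_size = n / k + alpha * sqrt(2 * (n / k) * log(k)),
/// where n = num_input_rows and k = num_shards.
fn estimate_shard_size(num_input_rows: u64, num_shards: u64, alpha: f64) -> u64 {
    // Expected number of rows per shard if hashing spreads rows evenly.
    let mean = num_input_rows as f64 / num_shards as f64;
    // Slack term covering the deviation of the fullest shard from the mean.
    // Assumption: `log` is the natural logarithm.
    let slack = alpha * (2.0 * mean * (num_shards as f64).ln()).sqrt();
    // Round up so the estimate never undershoots the formula's value.
    (mean + slack).ceil() as u64
}
```

For example, `estimate_shard_size(1 << 26, 512, 2.0)` yields a value a few percent above the mean of `2^17` rows per shard. Whether a given failure probability still holds for more than 700 shards is not stated in this commit, so parameters in that range should be checked against the referenced paper.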

ciphercore-base/src/type_inference.rs

Lines changed: 0 additions & 18 deletions
@@ -1574,13 +1574,6 @@ impl TypeInferenceWorker {
     check_table_and_extract_column_types(input_t.clone(), false, false)?;
 let headers: Vec<String> = headers_types.keys().cloned().collect();
 
-if shard_config.num_shards > 700 {
-    return Err(runtime_error!(
-        "No more than 700 shards can be handled, {} provided",
-        shard_config.num_shards
-    ));
-}
-
 let num_elements_in_all_shards = shard_config.shard_size * shard_config.num_shards;
 if num_entries > num_elements_in_all_shards {
     return Err(runtime_error!("Input elements can't fit given shards. Shards can contain {} elements, while input has {}", num_elements_in_all_shards, num_entries));
@@ -4771,17 +4764,6 @@ mod tests {
         shard_headers: vec!["ID".to_owned(), "ID".to_owned()],
     },
 )?;
-test_shard_fail(
-    named_tuple_type(vec![
-        ("Income".to_owned(), array_type(vec![20], UINT64)),
-        ("ID".to_owned(), array_type(vec![20], UINT64)),
-    ]),
-    ShardConfig {
-        num_shards: 800,
-        shard_size: 10,
-        shard_headers: vec!["ID".to_owned()],
-    },
-)?;
 
 Ok(())
 }
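
With the 700-shard limit gone, the capacity check shown as context in the `type_inference.rs` hunk is the one size check on the sharding config that remains visible in this diff. The sketch below mirrors that check in isolation. `ShardLimits` and `check_capacity` are hypothetical stand-ins, not CipherCore types; the `u64` field types and the overflow guard are assumptions that do not appear in the diff.

```rust
/// Hypothetical stand-in for the two numeric `ShardConfig` fields seen in the diff.
/// Assumption: both are u64; the real field types are not shown in this commit.
struct ShardLimits {
    num_shards: u64,
    shard_size: u64,
}

/// Mirrors the capacity check kept by this commit: the input must fit into
/// `shard_size * num_shards` slots in total.
fn check_capacity(num_entries: u64, limits: &ShardLimits) -> Result<(), String> {
    // Assumption: guard against overflow, which the diffed code does not show.
    let capacity = limits
        .shard_size
        .checked_mul(limits.num_shards)
        .ok_or_else(|| "Total shard capacity overflows u64".to_owned())?;
    if num_entries > capacity {
        return Err(format!(
            "Input elements can't fit given shards. Shards can contain {} elements, while input has {}",
            capacity, num_entries
        ));
    }
    Ok(())
}
```

Under these assumptions, the configuration from the removed test (800 shards of size 10 with 20 input rows) passes this particular check, which is consistent with the `test_shard_fail` case being deleted rather than kept.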
