Commit f92f4eb

feat: [google-cloud-dlp] add Dataplex Catalog action for discovery configs (#13951)
BEGIN_COMMIT_OVERRIDE
feat: add Dataplex Catalog action for discovery configs
feat: add a project ID to table reference so that org parents can create single table discovery configs.
feat: new fields for data profile finding.
docs: various doc revisions
END_COMMIT_OVERRIDE

- [ ] Regenerate this pull request now.

feat: add a project ID to table reference so that org parents can create single table discovery configs.
feat: new fields for data profile finding.
docs: various doc revisions

PiperOrigin-RevId: 763907074

Source-Link: googleapis/googleapis@d8bb284

Source-Link: googleapis/googleapis-gen@3b2654a
Copy-Tag: eyJwIjoicGFja2FnZXMvZ29vZ2xlLWNsb3VkLWRscC8uT3dsQm90LnlhbWwiLCJoIjoiM2IyNjU0YWFiNzUxMDcxYzU0YTM1NTEwNTI4ZjA1ZTMyYTU0N2JlOSJ9

---------

Co-authored-by: Owl Bot <gcf-owl-bot[bot]@users.noreply.github.com>
1 parent 80c61d3 commit f92f4eb

2 files changed (+134, -20 lines changed)
packages/google-cloud-dlp/google/cloud/dlp_v2/types/dlp.py

Lines changed: 124 additions & 20 deletions
@@ -6859,21 +6859,12 @@ class PublishFindingsToCloudDataCatalog(proto.Message):
         """
 
     class Deidentify(proto.Message):
-        r"""Create a de-identified copy of the requested table or files.
+        r"""Create a de-identified copy of a storage bucket. Only
+        compatible with Cloud Storage buckets.
 
         A TransformationDetail will be created for each transformation.
 
-        If any rows in BigQuery are skipped during de-identification
-        (transformation errors or row size exceeds BigQuery insert API
-        limits) they are placed in the failure output table. If the original
-        row exceeds the BigQuery insert API limit it will be truncated when
-        written to the failure output table. The failure output table can be
-        set in the
-        action.deidentify.output.big_query_output.deidentified_failure_output_table
-        field, if no table is set, a table will be automatically created in
-        the same project and dataset as the original table.
-
-        Compatible with: Inspect
+        Compatible with: Inspection of Cloud Storage
 
 
         .. _oneof: https://proto-plus-python.readthedocs.io/en/stable/fields.html#oneofs-mutually-exclusive-fields
@@ -6884,14 +6875,76 @@ class Deidentify(proto.Message):
                 configs for structured, unstructured, and image
                 files.
             transformation_details_storage_config (google.cloud.dlp_v2.types.TransformationDetailsStorageConfig):
-                Config for storing transformation details. This is separate
-                from the de-identified content, and contains metadata about
-                the successful transformations and/or failures that occurred
-                while de-identifying. This needs to be set in order for
-                users to access information about the status of each
-                transformation (see
+                Config for storing transformation details.
+
+                This field specifies the configuration for storing detailed
+                metadata about each transformation performed during a
+                de-identification process. The metadata is stored separately
+                from the de-identified content itself and provides a
+                granular record of both successful transformations and any
+                failures that occurred.
+
+                Enabling this configuration is essential for users who need
+                to access comprehensive information about the status,
+                outcome, and specifics of each transformation. The details
+                are captured in the
                 [TransformationDetails][google.privacy.dlp.v2.TransformationDetails]
-                message for more information about what is noted).
+                message for each operation.
+
+                Key use cases:
+
+                - **Auditing and compliance**
+
+                  - Provides a verifiable audit trail of de-identification
+                    activities, which is crucial for meeting regulatory
+                    requirements and internal data governance policies.
+                  - Logs what data was transformed, what transformations
+                    were applied, when they occurred, and their success
+                    status. This helps demonstrate accountability and due
+                    diligence in protecting sensitive data.
+
+                - **Troubleshooting and debugging**
+
+                  - Offers detailed error messages and context if a
+                    transformation fails. This information is useful for
+                    diagnosing and resolving issues in the
+                    de-identification pipeline.
+                  - Helps pinpoint the exact location and nature of
+                    failures, speeding up the debugging process.
+
+                - **Process verification and quality assurance**
+
+                  - Allows users to confirm that de-identification rules
+                    and transformations were applied correctly and
+                    consistently across the dataset as intended.
+                  - Helps in verifying the effectiveness of the chosen
+                    de-identification strategies.
+
+                - **Data lineage and impact analysis**
+
+                  - Creates a record of how data elements were modified,
+                    contributing to data lineage. This is useful for
+                    understanding the provenance of de-identified data.
+                  - Aids in assessing the potential impact of
+                    de-identification choices on downstream analytical
+                    processes or data usability.
+
+                - **Reporting and operational insights**
+
+                  - You can analyze the metadata stored in a queryable
+                    BigQuery table to generate reports on transformation
+                    success rates, common error types, processing volumes
+                    (e.g., transformedBytes), and the types of
+                    transformations applied.
+                  - These insights can inform optimization of
+                    de-identification configurations and resource
+                    planning.
+
+                To take advantage of these benefits, set this configuration.
+                The stored details include a description of the
+                transformation, success or error codes, error messages, the
+                number of bytes transformed, the location of the transformed
+                content, and identifiers for the job and source data.
             cloud_storage_output (str):
                 Required. User settable Cloud Storage bucket
                 and folders to store de-identified files. This
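
The "reporting and operational insights" use case described in the docstring above can be sketched without any client library. The rows below are hypothetical, simplified stand-ins for records read from a transformation-details BigQuery table; the keys `status`, `error_type`, and `transformed_bytes`, and the helper `summarize`, are assumptions made for illustration and are not the actual table schema.

```python
from collections import Counter

# Hypothetical, simplified rows; the real transformation-details schema may differ.
rows = [
    {"status": "SUCCESS", "error_type": None, "transformed_bytes": 120},
    {"status": "SUCCESS", "error_type": None, "transformed_bytes": 80},
    {"status": "ERROR", "error_type": "INVALID_TRANSFORM", "transformed_bytes": 0},
]


def summarize(rows):
    """Return success rate, total bytes transformed, and error counts by type."""
    total = len(rows)
    successes = sum(1 for r in rows if r["status"] == "SUCCESS")
    errors = Counter(r["error_type"] for r in rows if r["status"] != "SUCCESS")
    return {
        # Guard against an empty table to avoid dividing by zero.
        "success_rate": successes / total if total else 0.0,
        "transformed_bytes": sum(r["transformed_bytes"] for r in rows),
        "errors": dict(errors),
    }


summary = summarize(rows)
```

In practice the same aggregation would more likely be expressed as a query against the BigQuery table itself; the in-memory version only shows which quantities the docstring has in mind.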
@@ -7909,6 +7962,12 @@ class DataProfileAction(proto.Message):
             Tags the profiled resources with the
             specified tag values.
 
+            This field is a member of `oneof`_ ``action``.
+        publish_to_dataplex_catalog (google.cloud.dlp_v2.types.DataProfileAction.PublishToDataplexCatalog):
+            Publishes a portion of each profile to
+            Dataplex Catalog with the aspect type Sensitive
+            Data Protection Profile.
+
             This field is a member of `oneof`_ ``action``.
     """
 
@@ -8070,6 +8129,29 @@ class PublishToSecurityCommandCenter(proto.Message):
 
         """
 
+    class PublishToDataplexCatalog(proto.Message):
+        r"""Create Dataplex Catalog aspects for profiled resources with
+        the aspect type Sensitive Data Protection Profile. To learn more
+        about aspects, see
+        https://cloud.google.com/sensitive-data-protection/docs/add-aspects.
+
+        Attributes:
+            lower_data_risk_to_low (bool):
+                Whether creating a Dataplex Catalog aspect
+                for a profiled resource should lower the risk of
+                the profile for that resource. This also lowers
+                the data risk of resources at the lower levels
+                of the resource hierarchy. For example, reducing
+                the data risk of a table data profile also
+                reduces the data risk of the constituent column
+                data profiles.
+        """
+
+        lower_data_risk_to_low: bool = proto.Field(
+            proto.BOOL,
+            number=1,
+        )
+
     class TagResources(proto.Message):
         r"""If set, attaches the [tags]
         (https://cloud.google.com/resource-manager/docs/tags/tags-overview)
@@ -8203,6 +8285,12 @@ class TagValue(proto.Message):
         oneof="action",
         message=TagResources,
     )
+    publish_to_dataplex_catalog: PublishToDataplexCatalog = proto.Field(
+        proto.MESSAGE,
+        number=9,
+        oneof="action",
+        message=PublishToDataplexCatalog,
+    )
 
 
 class DataProfileFinding(proto.Message):
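
Because `publish_to_dataplex_catalog` is declared with `oneof="action"`, a single `DataProfileAction` carries either this action or one of the other members, never two at once. A minimal sketch of that rule using plain dicts (proto-plus messages can be built from dicts, so no client import is needed here). The helper `make_action` and the set `ACTION_ONEOF` are hypothetical, and the set lists only two members of the oneof; the real message has more. Note that protobuf itself keeps the last member set rather than raising, so this sketch is deliberately stricter.

```python
# Illustrative subset of the ``action`` oneof; the real message has more members.
ACTION_ONEOF = {"tag_resources", "publish_to_dataplex_catalog"}


def make_action(**fields):
    """Build a dict-shaped DataProfileAction with exactly one oneof member set."""
    unknown = fields.keys() - ACTION_ONEOF
    if unknown:
        raise ValueError(f"unknown action field(s): {sorted(unknown)}")
    if len(fields) != 1:
        raise ValueError(
            f"exactly one action must be set, got {sorted(fields)}"
        )
    return dict(fields)


# Publish profiles to Dataplex Catalog and lower the data risk to low.
action = make_action(
    publish_to_dataplex_catalog={"lower_data_risk_to_low": True}
)
```

To request several actions, a discovery config would list several `DataProfileAction` entries, each with one member set.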
@@ -8234,6 +8322,12 @@ class DataProfileFinding(proto.Message):
             Where the content was found.
         resource_visibility (google.cloud.dlp_v2.types.ResourceVisibility):
             How broadly a resource has been shared.
+        full_resource_name (str):
+            The `full resource
+            name <https://cloud.google.com/apis/design/resource_names#full_resource_name>`__
+            of the resource profiled for this finding.
+        data_source_type (google.cloud.dlp_v2.types.DataSourceType):
+            The type of the resource that was profiled.
     """
 
     quote: str = proto.Field(
@@ -8273,6 +8367,15 @@ class DataProfileFinding(proto.Message):
         number=8,
         enum="ResourceVisibility",
     )
+    full_resource_name: str = proto.Field(
+        proto.STRING,
+        number=9,
+    )
+    data_source_type: "DataSourceType" = proto.Field(
+        proto.MESSAGE,
+        number=10,
+        message="DataSourceType",
+    )
 
 
 class DataProfileFindingLocation(proto.Message):
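
The new `full_resource_name` field follows the Google API "full resource name" convention: `//` followed by the service name, then `/` and the relative resource path. A consumer of `DataProfileFinding` might split it as sketched below. The helper `split_full_resource_name` and the example value are hypothetical and not part of the library.

```python
def split_full_resource_name(name):
    """Split '//service/relative/path' into (service, relative_path)."""
    if not name.startswith("//"):
        raise ValueError(f"not a full resource name: {name!r}")
    # Everything up to the first '/' after the leading '//' is the service.
    service, sep, path = name[2:].partition("/")
    if not service or not sep or not path:
        raise ValueError(f"missing service or path in: {name!r}")
    return service, path


# Hypothetical example value, for illustration only.
example = (
    "//bigquery.googleapis.com/projects/my-project"
    "/datasets/my_dataset/tables/my_table"
)
service, path = split_full_resource_name(example)
```

The service part gives a rough hint of where the finding came from, while `data_source_type` remains the field intended to describe the type of the profiled resource.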
@@ -13050,7 +13153,8 @@ class FileStoreDataProfile(proto.Message):
             The BigQuery table to which the sample
             findings are written.
         file_store_is_empty (bool):
-            The file store does not have any files.
+            The file store does not have any files. If
+            the profiling operation failed, this is false.
         tags (MutableSequence[google.cloud.dlp_v2.types.Tag]):
             The tags attached to the resource, including
             any tags attached during profiling.

packages/google-cloud-dlp/google/cloud/dlp_v2/types/storage.py

Lines changed: 10 additions & 0 deletions
@@ -1509,6 +1509,12 @@ class TableReference(proto.Message):
             Dataset ID of the table.
         table_id (str):
             Name of the table.
+        project_id (str):
+            The Google Cloud project ID of the project
+            containing the table. If omitted, the project ID
+            is inferred from the parent project. This field
+            is required if the parent resource is an
+            organization.
     """
 
     dataset_id: str = proto.Field(
@@ -1519,6 +1525,10 @@ class TableReference(proto.Message):
         proto.STRING,
         number=2,
     )
+    project_id: str = proto.Field(
+        proto.STRING,
+        number=3,
+    )
 
 
 class BigQueryField(proto.Message):
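
The `project_id` docstring above describes a simple resolution rule: an explicit value wins, a project parent supplies the default, and an organization parent has no default to fall back on. The service applies this rule on its side; the sketch below only mirrors it client-side for illustration. The helper `resolve_table_project` is hypothetical, and the parent strings assume the usual `projects/{project}/locations/{location}` and `organizations/{org}/locations/{location}` shapes.

```python
def resolve_table_project(parent, table_ref):
    """Return the project ID that a TableReference-shaped dict refers to."""
    explicit = table_ref.get("project_id")
    if explicit:
        return explicit
    parts = parent.split("/")
    # A project parent looks like "projects/<id>/..." and supplies the default.
    if len(parts) >= 2 and parts[0] == "projects" and parts[1]:
        return parts[1]
    # Organization (or any other) parents carry no project to infer from.
    raise ValueError("project_id is required when the parent is not a project")


table = {"dataset_id": "my_dataset", "table_id": "my_table"}
inferred = resolve_table_project("projects/my-project/locations/us", table)
explicit = resolve_table_project(
    "organizations/123/locations/us", {**table, "project_id": "other-project"}
)
```

Checking this before sending a request lets an organization-level caller catch a missing `project_id` locally instead of waiting for a server-side error.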
