Interface InputDataConfigOrBuilder

  • All Superinterfaces:
    com.google.protobuf.MessageLiteOrBuilder, com.google.protobuf.MessageOrBuilder
  • All Known Implementing Classes:
    InputDataConfig, InputDataConfig.Builder

    public interface InputDataConfigOrBuilder
    extends com.google.protobuf.MessageOrBuilder
    • Method Detail

      • hasFractionSplit

        boolean hasFractionSplit()
         Split based on fractions defining the size of each set.
         
        .google.cloud.aiplatform.v1.FractionSplit fraction_split = 2;
        Returns:
        Whether the fractionSplit field is set.
      • getFractionSplit

        FractionSplit getFractionSplit()
         Split based on fractions defining the size of each set.
         
        .google.cloud.aiplatform.v1.FractionSplit fraction_split = 2;
        Returns:
        The fractionSplit.
      • getFractionSplitOrBuilder

        FractionSplitOrBuilder getFractionSplitOrBuilder()
         Split based on fractions defining the size of each set.
         
        .google.cloud.aiplatform.v1.FractionSplit fraction_split = 2;
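A minimal sketch of setting and reading the `fraction_split` field, assuming the google-cloud-aiplatform Java client is on the classpath; the dataset ID is a placeholder:

```java
import com.google.cloud.aiplatform.v1.FractionSplit;
import com.google.cloud.aiplatform.v1.InputDataConfig;

public class FractionSplitExample {
    // Builds an InputDataConfig with an 80/10/10 fraction split and reads it
    // back through the accessors declared on InputDataConfigOrBuilder.
    public static InputDataConfig build() {
        FractionSplit split = FractionSplit.newBuilder()
            .setTrainingFraction(0.8)
            .setValidationFraction(0.1)
            .setTestFraction(0.1)
            .build();
        return InputDataConfig.newBuilder()
            .setDatasetId("1234567890") // hypothetical Dataset ID
            .setFractionSplit(split)
            .build();
    }

    public static void main(String[] args) {
        InputDataConfig config = build();
        // hasFractionSplit() reports whether the fraction_split field is set.
        System.out.println(config.hasFractionSplit());                       // true
        System.out.println(config.getFractionSplit().getTrainingFraction()); // 0.8
    }
}
```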
      • hasFilterSplit

        boolean hasFilterSplit()
         Split based on the provided filters for each set.
         
        .google.cloud.aiplatform.v1.FilterSplit filter_split = 3;
        Returns:
        Whether the filterSplit field is set.
      • getFilterSplit

        FilterSplit getFilterSplit()
         Split based on the provided filters for each set.
         
        .google.cloud.aiplatform.v1.FilterSplit filter_split = 3;
        Returns:
        The filterSplit.
      • getFilterSplitOrBuilder

        FilterSplitOrBuilder getFilterSplitOrBuilder()
         Split based on the provided filters for each set.
         
        .google.cloud.aiplatform.v1.FilterSplit filter_split = 3;
      • hasPredefinedSplit

        boolean hasPredefinedSplit()
         Supported only for tabular Datasets.
        
         Split based on a predefined key.
         
        .google.cloud.aiplatform.v1.PredefinedSplit predefined_split = 4;
        Returns:
        Whether the predefinedSplit field is set.
      • getPredefinedSplit

        PredefinedSplit getPredefinedSplit()
         Supported only for tabular Datasets.
        
         Split based on a predefined key.
         
        .google.cloud.aiplatform.v1.PredefinedSplit predefined_split = 4;
        Returns:
        The predefinedSplit.
      • getPredefinedSplitOrBuilder

        PredefinedSplitOrBuilder getPredefinedSplitOrBuilder()
         Supported only for tabular Datasets.
        
         Split based on a predefined key.
         
        .google.cloud.aiplatform.v1.PredefinedSplit predefined_split = 4;
      • hasTimestampSplit

        boolean hasTimestampSplit()
         Supported only for tabular Datasets.
        
         Split based on the timestamp of the input data pieces.
         
        .google.cloud.aiplatform.v1.TimestampSplit timestamp_split = 5;
        Returns:
        Whether the timestampSplit field is set.
      • getTimestampSplit

        TimestampSplit getTimestampSplit()
         Supported only for tabular Datasets.
        
         Split based on the timestamp of the input data pieces.
         
        .google.cloud.aiplatform.v1.TimestampSplit timestamp_split = 5;
        Returns:
        The timestampSplit.
      • getTimestampSplitOrBuilder

        TimestampSplitOrBuilder getTimestampSplitOrBuilder()
         Supported only for tabular Datasets.
        
         Split based on the timestamp of the input data pieces.
         
        .google.cloud.aiplatform.v1.TimestampSplit timestamp_split = 5;
      • hasStratifiedSplit

        boolean hasStratifiedSplit()
         Supported only for tabular Datasets.
        
         Split based on the distribution of the specified column.
         
        .google.cloud.aiplatform.v1.StratifiedSplit stratified_split = 12;
        Returns:
        Whether the stratifiedSplit field is set.
      • getStratifiedSplit

        StratifiedSplit getStratifiedSplit()
         Supported only for tabular Datasets.
        
         Split based on the distribution of the specified column.
         
        .google.cloud.aiplatform.v1.StratifiedSplit stratified_split = 12;
        Returns:
        The stratifiedSplit.
      • getStratifiedSplitOrBuilder

        StratifiedSplitOrBuilder getStratifiedSplitOrBuilder()
         Supported only for tabular Datasets.
        
         Split based on the distribution of the specified column.
         
        .google.cloud.aiplatform.v1.StratifiedSplit stratified_split = 12;
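In the underlying proto these split fields belong to a single oneof, so at most one of the `has*` accessors returns true. A sketch of dispatching on whichever split is set (assumes the google-cloud-aiplatform client; the `event_time` column name is a placeholder):

```java
import com.google.cloud.aiplatform.v1.InputDataConfig;
import com.google.cloud.aiplatform.v1.InputDataConfigOrBuilder;
import com.google.cloud.aiplatform.v1.TimestampSplit;

public class SplitDispatchExample {
    // Returns a short label for whichever split field is set on the config.
    public static String describeSplit(InputDataConfigOrBuilder config) {
        if (config.hasFractionSplit())   return "fraction";
        if (config.hasFilterSplit())     return "filter";
        if (config.hasPredefinedSplit()) return "predefined";
        if (config.hasTimestampSplit())  return "timestamp";
        if (config.hasStratifiedSplit()) return "stratified";
        return "default"; // no explicit split configured
    }

    public static void main(String[] args) {
        InputDataConfig config = InputDataConfig.newBuilder()
            .setTimestampSplit(TimestampSplit.newBuilder()
                .setTrainingFraction(0.7)
                .setValidationFraction(0.15)
                .setTestFraction(0.15)
                .setKey("event_time") // hypothetical timestamp column
                .build())
            .build();
        System.out.println(describeSplit(config)); // timestamp
    }
}
```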
      • hasGcsDestination

        boolean hasGcsDestination()
          The Cloud Storage location where the training data is to be
          written. In the given directory a new directory is created with
         name:
         `dataset-<dataset-id>-<annotation-type>-<timestamp-of-training-call>`
         where timestamp is in YYYY-MM-DDThh:mm:ss.sssZ ISO-8601 format.
         All training input data is written into that directory.
        
          The Vertex AI environment variables that point to Cloud Storage
          data URIs use the Cloud Storage wildcard
          format to support sharded data, e.g. "gs://.../training-*.jsonl"
        
         * AIP_DATA_FORMAT = "jsonl" for non-tabular data, "csv" for tabular data
         * AIP_TRAINING_DATA_URI =
         "gcs_destination/dataset-<dataset-id>-<annotation-type>-<time>/training-*.${AIP_DATA_FORMAT}"
        
         * AIP_VALIDATION_DATA_URI =
         "gcs_destination/dataset-<dataset-id>-<annotation-type>-<time>/validation-*.${AIP_DATA_FORMAT}"
        
         * AIP_TEST_DATA_URI =
         "gcs_destination/dataset-<dataset-id>-<annotation-type>-<time>/test-*.${AIP_DATA_FORMAT}"
         
        .google.cloud.aiplatform.v1.GcsDestination gcs_destination = 8;
        Returns:
        Whether the gcsDestination field is set.
      • getGcsDestination

        GcsDestination getGcsDestination()
          The Cloud Storage location where the training data is to be
          written. In the given directory a new directory is created with
         name:
         `dataset-<dataset-id>-<annotation-type>-<timestamp-of-training-call>`
         where timestamp is in YYYY-MM-DDThh:mm:ss.sssZ ISO-8601 format.
         All training input data is written into that directory.
        
          The Vertex AI environment variables that point to Cloud Storage
          data URIs use the Cloud Storage wildcard
          format to support sharded data, e.g. "gs://.../training-*.jsonl"
        
         * AIP_DATA_FORMAT = "jsonl" for non-tabular data, "csv" for tabular data
         * AIP_TRAINING_DATA_URI =
         "gcs_destination/dataset-<dataset-id>-<annotation-type>-<time>/training-*.${AIP_DATA_FORMAT}"
        
         * AIP_VALIDATION_DATA_URI =
         "gcs_destination/dataset-<dataset-id>-<annotation-type>-<time>/validation-*.${AIP_DATA_FORMAT}"
        
         * AIP_TEST_DATA_URI =
         "gcs_destination/dataset-<dataset-id>-<annotation-type>-<time>/test-*.${AIP_DATA_FORMAT}"
         
        .google.cloud.aiplatform.v1.GcsDestination gcs_destination = 8;
        Returns:
        The gcsDestination.
      • getGcsDestinationOrBuilder

        GcsDestinationOrBuilder getGcsDestinationOrBuilder()
          The Cloud Storage location where the training data is to be
          written. In the given directory a new directory is created with
         name:
         `dataset-<dataset-id>-<annotation-type>-<timestamp-of-training-call>`
         where timestamp is in YYYY-MM-DDThh:mm:ss.sssZ ISO-8601 format.
         All training input data is written into that directory.
        
          The Vertex AI environment variables that point to Cloud Storage
          data URIs use the Cloud Storage wildcard
          format to support sharded data, e.g. "gs://.../training-*.jsonl"
        
         * AIP_DATA_FORMAT = "jsonl" for non-tabular data, "csv" for tabular data
         * AIP_TRAINING_DATA_URI =
         "gcs_destination/dataset-<dataset-id>-<annotation-type>-<time>/training-*.${AIP_DATA_FORMAT}"
        
         * AIP_VALIDATION_DATA_URI =
         "gcs_destination/dataset-<dataset-id>-<annotation-type>-<time>/validation-*.${AIP_DATA_FORMAT}"
        
         * AIP_TEST_DATA_URI =
         "gcs_destination/dataset-<dataset-id>-<annotation-type>-<time>/test-*.${AIP_DATA_FORMAT}"
         
        .google.cloud.aiplatform.v1.GcsDestination gcs_destination = 8;
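A sketch of pointing the training-data export at a Cloud Storage prefix (assumes the google-cloud-aiplatform client; the bucket name and dataset ID are placeholders):

```java
import com.google.cloud.aiplatform.v1.GcsDestination;
import com.google.cloud.aiplatform.v1.InputDataConfig;

public class GcsDestinationExample {
    // Vertex AI creates the dataset-<id>-<annotation-type>-<timestamp>
    // directory under this prefix and exposes the exported shards via the
    // AIP_*_DATA_URI environment variables described above.
    public static InputDataConfig build() {
        return InputDataConfig.newBuilder()
            .setDatasetId("1234567890") // hypothetical Dataset ID
            .setGcsDestination(GcsDestination.newBuilder()
                .setOutputUriPrefix("gs://my-bucket/training-exports/") // hypothetical bucket
                .build())
            .build();
    }
}
```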
      • hasBigqueryDestination

        boolean hasBigqueryDestination()
          Only applicable to custom training with a tabular Dataset that has a
          BigQuery source.
        
          The BigQuery project location where the training data is to be written.
          In the given project a new dataset is created with name
         `dataset_<dataset-id>_<annotation-type>_<timestamp-of-training-call>`
         where timestamp is in YYYY_MM_DDThh_mm_ss_sssZ format. All training
         input data is written into that dataset. In the dataset three
         tables are created, `training`, `validation` and `test`.
        
         * AIP_DATA_FORMAT = "bigquery".
         * AIP_TRAINING_DATA_URI  =
         "bigquery_destination.dataset_<dataset-id>_<annotation-type>_<time>.training"
        
         * AIP_VALIDATION_DATA_URI =
         "bigquery_destination.dataset_<dataset-id>_<annotation-type>_<time>.validation"
        
         * AIP_TEST_DATA_URI =
         "bigquery_destination.dataset_<dataset-id>_<annotation-type>_<time>.test"
         
        .google.cloud.aiplatform.v1.BigQueryDestination bigquery_destination = 10;
        Returns:
        Whether the bigqueryDestination field is set.
      • getBigqueryDestination

        BigQueryDestination getBigqueryDestination()
          Only applicable to custom training with a tabular Dataset that has a
          BigQuery source.
        
          The BigQuery project location where the training data is to be written.
          In the given project a new dataset is created with name
         `dataset_<dataset-id>_<annotation-type>_<timestamp-of-training-call>`
         where timestamp is in YYYY_MM_DDThh_mm_ss_sssZ format. All training
         input data is written into that dataset. In the dataset three
         tables are created, `training`, `validation` and `test`.
        
         * AIP_DATA_FORMAT = "bigquery".
         * AIP_TRAINING_DATA_URI  =
         "bigquery_destination.dataset_<dataset-id>_<annotation-type>_<time>.training"
        
         * AIP_VALIDATION_DATA_URI =
         "bigquery_destination.dataset_<dataset-id>_<annotation-type>_<time>.validation"
        
         * AIP_TEST_DATA_URI =
         "bigquery_destination.dataset_<dataset-id>_<annotation-type>_<time>.test"
         
        .google.cloud.aiplatform.v1.BigQueryDestination bigquery_destination = 10;
        Returns:
        The bigqueryDestination.
      • getBigqueryDestinationOrBuilder

        BigQueryDestinationOrBuilder getBigqueryDestinationOrBuilder()
          Only applicable to custom training with a tabular Dataset that has a
          BigQuery source.
        
          The BigQuery project location where the training data is to be written.
          In the given project a new dataset is created with name
         `dataset_<dataset-id>_<annotation-type>_<timestamp-of-training-call>`
         where timestamp is in YYYY_MM_DDThh_mm_ss_sssZ format. All training
         input data is written into that dataset. In the dataset three
         tables are created, `training`, `validation` and `test`.
        
         * AIP_DATA_FORMAT = "bigquery".
         * AIP_TRAINING_DATA_URI  =
         "bigquery_destination.dataset_<dataset-id>_<annotation-type>_<time>.training"
        
         * AIP_VALIDATION_DATA_URI =
         "bigquery_destination.dataset_<dataset-id>_<annotation-type>_<time>.validation"
        
         * AIP_TEST_DATA_URI =
         "bigquery_destination.dataset_<dataset-id>_<annotation-type>_<time>.test"
         
        .google.cloud.aiplatform.v1.BigQueryDestination bigquery_destination = 10;
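A sketch of directing the export to a BigQuery project instead (assumes the google-cloud-aiplatform client; the project URI and dataset ID are placeholders):

```java
import com.google.cloud.aiplatform.v1.BigQueryDestination;
import com.google.cloud.aiplatform.v1.InputDataConfig;

public class BigQueryDestinationExample {
    // Vertex AI creates the dataset_<id>_<annotation-type>_<timestamp>
    // dataset in this project, containing `training`, `validation`,
    // and `test` tables.
    public static InputDataConfig build() {
        return InputDataConfig.newBuilder()
            .setDatasetId("1234567890") // hypothetical Dataset ID
            .setBigqueryDestination(BigQueryDestination.newBuilder()
                .setOutputUri("bq://my-project") // hypothetical project URI
                .build())
            .build();
    }
}
```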
      • getDatasetId

        String getDatasetId()
          Required. The ID of the Dataset, in the same Project and Location, whose
          data will be used to train the Model. The Dataset must use a schema
          compatible with the Model being trained; what is compatible should be
          described in the used TrainingPipeline's [training_task_definition]
          [google.cloud.aiplatform.v1.TrainingPipeline.training_task_definition].
          For tabular Datasets, all of their data is exported to training, to pick
          and choose from.
         
        string dataset_id = 1 [(.google.api.field_behavior) = REQUIRED];
        Returns:
        The datasetId.
      • getDatasetIdBytes

        com.google.protobuf.ByteString getDatasetIdBytes()
          Required. The ID of the Dataset, in the same Project and Location, whose
          data will be used to train the Model. The Dataset must use a schema
          compatible with the Model being trained; what is compatible should be
          described in the used TrainingPipeline's [training_task_definition]
          [google.cloud.aiplatform.v1.TrainingPipeline.training_task_definition].
          For tabular Datasets, all of their data is exported to training, to pick
          and choose from.
         
        string dataset_id = 1 [(.google.api.field_behavior) = REQUIRED];
        Returns:
        The bytes for datasetId.
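For a protobuf string field, the `Bytes` accessor returns the same value as a UTF-8 `ByteString`. A small sketch (assumes the google-cloud-aiplatform client; the ID is a placeholder):

```java
import com.google.cloud.aiplatform.v1.InputDataConfig;
import com.google.protobuf.ByteString;

public class DatasetIdExample {
    public static void main(String[] args) {
        // dataset_id is REQUIRED; an unset protobuf string field reads as "".
        InputDataConfig config = InputDataConfig.newBuilder()
            .setDatasetId("1234567890") // hypothetical Dataset ID
            .build();
        String id = config.getDatasetId();
        ByteString idBytes = config.getDatasetIdBytes(); // UTF-8 bytes of the same value
        System.out.println(id.equals(idBytes.toStringUtf8())); // true
    }
}
```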
      • getAnnotationsFilter

        String getAnnotationsFilter()
         Applicable only to Datasets that have DataItems and Annotations.
        
          A filter on Annotations of the Dataset. Only Annotations that both
          match this filter and belong to DataItems not ignored by the split method
          are used in the training, validation, or test role, respectively,
          depending on the role of the DataItem they are on (for auto-assigned
          DataItems that role is decided by Vertex AI). A filter with the same
          syntax as the one used in
          [ListAnnotations][google.cloud.aiplatform.v1.DatasetService.ListAnnotations]
          may be used, but note that here it filters across all Annotations of the
          Dataset, not just within a single DataItem.
         
        string annotations_filter = 6;
        Returns:
        The annotationsFilter.
      • getAnnotationsFilterBytes

        com.google.protobuf.ByteString getAnnotationsFilterBytes()
         Applicable only to Datasets that have DataItems and Annotations.
        
          A filter on Annotations of the Dataset. Only Annotations that both
          match this filter and belong to DataItems not ignored by the split method
          are used in the training, validation, or test role, respectively,
          depending on the role of the DataItem they are on (for auto-assigned
          DataItems that role is decided by Vertex AI). A filter with the same
          syntax as the one used in
          [ListAnnotations][google.cloud.aiplatform.v1.DatasetService.ListAnnotations]
          may be used, but note that here it filters across all Annotations of the
          Dataset, not just within a single DataItem.
         
        string annotations_filter = 6;
        Returns:
        The bytes for annotationsFilter.
      • getAnnotationSchemaUri

        String getAnnotationSchemaUri()
         Applicable only to custom training with Datasets that have DataItems and
         Annotations.
        
         Cloud Storage URI that points to a YAML file describing the annotation
         schema. The schema is defined as an OpenAPI 3.0.2 [Schema
         Object](https://github.com/OAI/OpenAPI-Specification/blob/main/versions/3.0.2.md#schemaObject).
         The schema files that can be used here are found in
          gs://google-cloud-aiplatform/schema/dataset/annotation/. Note that the
         chosen schema must be consistent with
         [metadata][google.cloud.aiplatform.v1.Dataset.metadata_schema_uri] of the
         Dataset specified by
         [dataset_id][google.cloud.aiplatform.v1.InputDataConfig.dataset_id].
        
          Only Annotations that both match this schema and belong to DataItems not
          ignored by the split method are used in the training, validation, or
          test role, respectively, depending on the role of the DataItem they are on.
        
         When used in conjunction with
         [annotations_filter][google.cloud.aiplatform.v1.InputDataConfig.annotations_filter],
         the Annotations used for training are filtered by both
         [annotations_filter][google.cloud.aiplatform.v1.InputDataConfig.annotations_filter]
         and
         [annotation_schema_uri][google.cloud.aiplatform.v1.InputDataConfig.annotation_schema_uri].
         
        string annotation_schema_uri = 9;
        Returns:
        The annotationSchemaUri.
      • getAnnotationSchemaUriBytes

        com.google.protobuf.ByteString getAnnotationSchemaUriBytes()
         Applicable only to custom training with Datasets that have DataItems and
         Annotations.
        
         Cloud Storage URI that points to a YAML file describing the annotation
         schema. The schema is defined as an OpenAPI 3.0.2 [Schema
         Object](https://github.com/OAI/OpenAPI-Specification/blob/main/versions/3.0.2.md#schemaObject).
         The schema files that can be used here are found in
          gs://google-cloud-aiplatform/schema/dataset/annotation/. Note that the
         chosen schema must be consistent with
         [metadata][google.cloud.aiplatform.v1.Dataset.metadata_schema_uri] of the
         Dataset specified by
         [dataset_id][google.cloud.aiplatform.v1.InputDataConfig.dataset_id].
        
          Only Annotations that both match this schema and belong to DataItems not
          ignored by the split method are used in the training, validation, or
          test role, respectively, depending on the role of the DataItem they are on.
        
         When used in conjunction with
         [annotations_filter][google.cloud.aiplatform.v1.InputDataConfig.annotations_filter],
         the Annotations used for training are filtered by both
         [annotations_filter][google.cloud.aiplatform.v1.InputDataConfig.annotations_filter]
         and
         [annotation_schema_uri][google.cloud.aiplatform.v1.InputDataConfig.annotation_schema_uri].
         
        string annotation_schema_uri = 9;
        Returns:
        The bytes for annotationSchemaUri.
      • getSavedQueryId

        String getSavedQueryId()
         Only applicable to Datasets that have SavedQueries.
        
         The ID of a SavedQuery (annotation set) under the Dataset specified by
         [dataset_id][google.cloud.aiplatform.v1.InputDataConfig.dataset_id] used
         for filtering Annotations for training.
        
          Only Annotations associated with this SavedQuery are used for
          training. When used in conjunction with
         [annotations_filter][google.cloud.aiplatform.v1.InputDataConfig.annotations_filter],
         the Annotations used for training are filtered by both
         [saved_query_id][google.cloud.aiplatform.v1.InputDataConfig.saved_query_id]
         and
         [annotations_filter][google.cloud.aiplatform.v1.InputDataConfig.annotations_filter].
        
         Only one of
         [saved_query_id][google.cloud.aiplatform.v1.InputDataConfig.saved_query_id]
         and
         [annotation_schema_uri][google.cloud.aiplatform.v1.InputDataConfig.annotation_schema_uri]
          should be specified, as both of them represent the same thing: the problem type.
         
        string saved_query_id = 7;
        Returns:
        The savedQueryId.
      • getSavedQueryIdBytes

        com.google.protobuf.ByteString getSavedQueryIdBytes()
         Only applicable to Datasets that have SavedQueries.
        
         The ID of a SavedQuery (annotation set) under the Dataset specified by
         [dataset_id][google.cloud.aiplatform.v1.InputDataConfig.dataset_id] used
         for filtering Annotations for training.
        
          Only Annotations associated with this SavedQuery are used for
          training. When used in conjunction with
         [annotations_filter][google.cloud.aiplatform.v1.InputDataConfig.annotations_filter],
         the Annotations used for training are filtered by both
         [saved_query_id][google.cloud.aiplatform.v1.InputDataConfig.saved_query_id]
         and
         [annotations_filter][google.cloud.aiplatform.v1.InputDataConfig.annotations_filter].
        
         Only one of
         [saved_query_id][google.cloud.aiplatform.v1.InputDataConfig.saved_query_id]
         and
         [annotation_schema_uri][google.cloud.aiplatform.v1.InputDataConfig.annotation_schema_uri]
          should be specified, as both of them represent the same thing: the problem type.
         
        string saved_query_id = 7;
        Returns:
        The bytes for savedQueryId.
      • getPersistMlUseAssignment

        boolean getPersistMlUseAssignment()
         Whether to persist the ML use assignment to data item system labels.
         
        bool persist_ml_use_assignment = 11;
        Returns:
        The persistMlUseAssignment.
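A sketch combining the annotation-related fields with the persist flag (assumes the google-cloud-aiplatform client; the dataset ID, saved query ID, and filter expression are all placeholders, and only one of `saved_query_id` / `annotation_schema_uri` is set, per the note above):

```java
import com.google.cloud.aiplatform.v1.InputDataConfig;

public class AnnotationConfigExample {
    public static InputDataConfig build() {
        return InputDataConfig.newBuilder()
            .setDatasetId("1234567890")                   // hypothetical Dataset ID
            .setSavedQueryId("987654321")                 // hypothetical annotation set
            .setAnnotationsFilter("labels.flagged=false") // hypothetical filter expression
            .setPersistMlUseAssignment(true)              // record train/val/test assignment
            .build();                                     // as data item system labels
    }
}
```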