Find distinct values

Finds the distinct values of a key for documents in a collection.

This method finds all documents that match the filter, or all documents if no filter is applied. There can be performance, latency, and billing implications if there are many matching documents.

Ready to write code? See the examples for this method to get started. If you are new to the Data API, check out the quickstart.

Result

Returns a list of the distinct values of the specified key. The method excludes documents that do not include the requested key.

Example response:

python
['home_appliance', None, 'sports_equipment', {'cat_id': 54, 'cat_name': 'gardening_gear'}]
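
To make the result shape concrete, the following is a minimal local sketch of the behavior described above for a top-level key. It uses plain Python lists rather than the Data API. The `distinct_top_level` helper and the sample documents are hypothetical, and the first-seen ordering is an assumption of this sketch, not a documented guarantee.

```python
from typing import Any


def distinct_top_level(documents: list[dict[str, Any]], key: str) -> list[Any]:
    """Collect distinct values of a top-level key (hypothetical local helper)."""
    found: list[Any] = []
    for doc in documents:
        if key not in doc:
            # Documents without the key contribute nothing to the result.
            continue
        value = doc[key]
        # Compare by equality instead of hashing so dict values are supported.
        if value not in found:
            found.append(value)
    return found


# Hypothetical sample documents.
documents = [
    {"name": "blender", "category": "home_appliance"},
    {"name": "mystery box", "category": None},
    {"name": "toaster", "category": "home_appliance"},
    {"name": "unlabeled item"},  # no "category" key: excluded
    {"name": "trowel", "category": {"cat_id": 54, "cat_name": "gardening_gear"}},
]

result = distinct_top_level(documents, "category")
print(result)
```

In this sketch, a key that is present with a null value still contributes `None`, which is consistent with the example response above, while a document that lacks the key is skipped.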

Parameters

Use the distinct method, which belongs to the astrapy.Collection class.

Method signature
python
distinct(
  key: str | Iterable[str | int],
  *,
  filter: Dict[str, Any],
  general_method_timeout_ms: int,
  request_timeout_ms: int,
  timeout_ms: int,
) -> list[Any]

Name Type Summary

key

str | Iterable[str | int]

The field for which to find values.

See Examples for usage.

filter

Dict[str, Any]

Optional. An object that defines filter criteria using the Data API filter syntax. The method only finds documents that match the filter criteria. If you omit the filter, the method considers all documents in the collection.

For a list of available filter operators and more examples, see Filter operators for collections.

Filters can use only indexed fields. If you apply selective indexing when you create a collection, you cannot reference non-indexed fields in a filter.

general_method_timeout_ms

int

Optional. The maximum time, in milliseconds, that the whole operation, which may involve multiple HTTP requests, can take.

Default: The default value for the collection. This default is 30 seconds unless you specified a different default when you initialized the Collection or DataAPIClient object. For more information, see Timeout options.

This parameter is aliased as timeout_ms for convenience.

request_timeout_ms

int

Optional. The maximum time, in milliseconds, that the client should wait for each underlying HTTP request.

Default: The default value for the collection. This default is 10 seconds unless you specified a different default when you initialized the Collection object. For more information, see Timeout options.
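
As a hedged illustration of the two timeout parameters, the sketch below passes both to `distinct`. To stay runnable without a database, it calls a hypothetical stand-in class, `FakeCollection`, that only records its arguments. With a real `astrapy.Collection`, the same keyword arguments from the method signature would be passed. The timeout values and the helper name `distinct_years` are arbitrary assumptions.

```python
from typing import Any


class FakeCollection:
    """Hypothetical stand-in for astrapy.Collection that records the call."""

    def __init__(self) -> None:
        self.last_call: dict[str, Any] = {}

    def distinct(self, key: Any, **kwargs: Any) -> list[Any]:
        self.last_call = {"key": key, **kwargs}
        return []  # a real collection would return the distinct values


def distinct_years(collection: Any) -> list[Any]:
    """Call distinct with explicit timeouts (values chosen arbitrarily)."""
    return collection.distinct(
        "publication_year",
        # Upper bound for the whole operation, across all HTTP requests.
        general_method_timeout_ms=60_000,
        # Upper bound for each individual HTTP request.
        request_timeout_ms=5_000,
    )


collection = FakeCollection()
distinct_years(collection)
print(collection.last_call)
```

Setting the per-request limit lower than the overall limit leaves room for several requests within one operation, which matters when many documents match.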

Examples

The following examples demonstrate how to find distinct values of a key for documents in a collection.

Find distinct values of a top-level field

python
from astrapy import DataAPIClient

# Get an existing collection
client = DataAPIClient()
database = client.get_database(
    "API_ENDPOINT",
    token="APPLICATION_TOKEN",
)
collection = database.get_collection("COLLECTION_NAME")

# Find distinct values
result = collection.distinct("publication_year")

print(result)

Find distinct values of a nested field

To find distinct values for a nested field, use dot notation. For example, field.subfield.subsubfield.

You must use & to escape any literal . or & in field names.

python
from astrapy import DataAPIClient

# Get an existing collection
client = DataAPIClient()
database = client.get_database(
    "API_ENDPOINT",
    token="APPLICATION_TOKEN",
)
collection = database.get_collection("COLLECTION_NAME")

# Find distinct values
result = collection.distinct("metadata.language")
print(result)

Alternatively, you can use an array to denote nested fields:

python
from astrapy import DataAPIClient

# Get an existing collection
client = DataAPIClient()
database = client.get_database(
    "API_ENDPOINT",
    token="APPLICATION_TOKEN",
)
collection = database.get_collection("COLLECTION_NAME")

# Find distinct values
result = collection.distinct(["metadata", "language"])
print(result)
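
The escaping rule for dot notation described earlier in this section can be sketched with a small helper. `escape_segment` and `to_dot_path` are hypothetical names written for this illustration, not astrapy functions. The sketch assumes that a literal `.` is written as `&.` and a literal `&` as `&&`.

```python
def escape_segment(segment: str) -> str:
    """Escape one literal field name for dot notation (assumed rule: prefix with &)."""
    # Escape & first so the & added in front of dots is not escaped again.
    return segment.replace("&", "&&").replace(".", "&.")


def to_dot_path(segments: list[str]) -> str:
    """Join literal field names into a single dot-notation path."""
    return ".".join(escape_segment(s) for s in segments)


print(to_dot_path(["metadata", "language"]))   # metadata.language
print(to_dot_path(["metadata", "file.name"]))  # metadata.file&.name
print(to_dot_path(["r&d", "budget"]))          # r&&d.budget
```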

Find distinct values of an index in an array

To find distinct values for a specific index in an array, specify the field name and index as an array. For example, ["genres", 0] finds distinct values of the first position of an array stored in the "genres" field.

The index should be an integer, not a string. For example, ["genres", "0"] matches {"genres": {"0": <value>}} but not {"genres": [<value>, …]}.

If you use the index in a string with dot notation instead of array notation, the method matches both maps and arrays. For example, field_name.0 matches {"field_name": {"0": <value>}} and {"field_name": [<value>, …]}.

If an array is encountered and no numeric index is specified, the method visits all items in the array.

python
from astrapy import DataAPIClient

# Get an existing collection
client = DataAPIClient()
database = client.get_database(
    "API_ENDPOINT",
    token="APPLICATION_TOKEN",
)
collection = database.get_collection("COLLECTION_NAME")

# Find distinct values at index 2 (the third position) of the "genres" array
result = collection.distinct(["genres", 2])
print(result)
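
The matching rules for array notation described in this section can be modelled locally. The following is a simplified sketch of one reading of those rules, not astrapy's implementation. The `extract` function and the sample documents are hypothetical, and dot-notation strings, which match both maps and arrays, are not modelled.

```python
from typing import Any, Union


def extract(value: Any, path: list[Union[str, int]]) -> list[Any]:
    """Return every value reached by following path (simplified local model)."""
    if not path:
        # End of the path: a list is unrolled into its items.
        return list(value) if isinstance(value, list) else [value]
    head, rest = path[0], path[1:]
    if isinstance(head, int):
        # Integer segments index into lists only.
        if isinstance(value, list) and 0 <= head < len(value):
            return extract(value[head], rest)
        return []
    if isinstance(value, dict):
        # String segments look up map keys.
        return extract(value[head], rest) if head in value else []
    if isinstance(value, list):
        # No numeric index was given for this list: visit every item.
        return [found for item in value for found in extract(item, path)]
    return []


# Hypothetical sample documents.
with_array = {"genres": ["fantasy", "adventure"]}
with_map = {"genres": {"0": "mapped"}}

print(extract(with_array, ["genres", 0]))    # integer index: matches the array
print(extract(with_map, ["genres", 0]))      # integer index: no match on a map
print(extract(with_map, ["genres", "0"]))    # string key: matches the map
print(extract(with_array, ["genres", "0"]))  # string key: no match on an array
print(extract(with_array, ["genres"]))       # no index: all items are visited
```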

Find distinct values for a subset of documents

You can use a filter to find distinct values across documents that match the filter.

For a list of available filter operators and more examples, see Filter operators for collections.

Filters can use only indexed fields. If you apply selective indexing when you create a collection, you cannot reference non-indexed fields in a filter.

python
from astrapy import DataAPIClient

# Get an existing collection
client = DataAPIClient()
database = client.get_database(
    "API_ENDPOINT",
    token="APPLICATION_TOKEN",
)
collection = database.get_collection("COLLECTION_NAME")

# Find distinct values
result = collection.distinct(
    "publication_year",
    filter={
        "$and": [
            {"is_checked_out": False},
            {"number_of_pages": {"$lt": 300}},
        ]
    },
)

print(result)
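
To show which documents the filter in the preceding example is intended to select, the sketch below expresses the same two conditions as a plain Python predicate over local dictionaries. The `matches` helper and the sample books are hypothetical, and the sketch assumes that a document missing either field does not match. The result is sorted only to make the output deterministic.

```python
from typing import Any


def matches(doc: dict[str, Any]) -> bool:
    """Local stand-in for the $and filter: not checked out and under 300 pages."""
    pages = doc.get("number_of_pages")
    return (
        doc.get("is_checked_out") is False
        and isinstance(pages, (int, float))
        and pages < 300
    )


# Hypothetical sample documents.
books = [
    {"publication_year": 1999, "is_checked_out": False, "number_of_pages": 250},
    {"publication_year": 2005, "is_checked_out": True, "number_of_pages": 120},
    {"publication_year": 2010, "is_checked_out": False, "number_of_pages": 480},
    {"publication_year": 1999, "is_checked_out": False, "number_of_pages": 90},
    {"publication_year": 2021, "is_checked_out": False},  # no page count
]

years = sorted(
    {b["publication_year"] for b in books if matches(b) and "publication_year" in b}
)
print(years)
```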

Client reference

For more information, see the client reference.


© 2025 DataStax
