DynamoDB: query using more than two attributes

Question

In DynamoDB, you can only query on attributes that are declared in an index.

How can I run a query that uses more than two attributes?

An example using boto:

    Table.create('users', schema=[
        HashKey('id')  # defaults to STRING data_type
    ], throughput={
        'read': 5,
        'write': 15,
    }, global_indexes=[
        GlobalAllIndex('FirstnameTimeIndex', parts=[
            HashKey('first_name'),
            RangeKey('creation_date', data_type=NUMBER),
        ], throughput={
            'read': 1,
            'write': 1,
        }),
        GlobalAllIndex('LastnameTimeIndex', parts=[
            HashKey('last_name'),
            RangeKey('creation_date', data_type=NUMBER),
        ], throughput={
            'read': 1,
            'write': 1,
        })
    ], connection=conn)

How can I search for users whose first name is "John", whose last name is "Doe", and who were created on "3-21-2015" using boto?

python amazon-web-services nosql amazon-dynamodb boto

Juan pablo Mar 21 '15 at 20:52

1 answer

bsd · Accepted Answer · 2015-03-24T01:34:33+0000

Your data model should take your data retrieval requirements into account: in DynamoDB you can only query by hash key, or by hash key plus range key.

If querying by primary key is not enough for your requirements, you can define alternate keys by creating secondary indexes (local or global).

However, in certain scenarios you can concatenate multiple attributes into your primary key and avoid the cost of maintaining secondary indexes.

If you need to get users by first name, last name, and creation date, I would suggest that you include these attributes in the hash and range keys, so that no additional indexes are required.

The hash key must contain a value that your application can compute, and at the same time it should provide uniform access to the data. For example, say that you decide to define your keys as follows:

Hash Key (Name): first_name # last_name

Range Key (Created): MM-DD-YYYY-HH-mm-SS-Milliseconds

You can always add additional attributes in case the ones mentioned above are not enough to make your key unique throughout the table.
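As a minimal sketch of how an application might compute these two key values, the helpers below build the concatenated hash key and the timestamp range key. The function names are hypothetical, the '#' separator is taken from the example keys above, and the three-digit zero-padded millisecond field is an assumption about the format.

```python
from datetime import datetime

SEPARATOR = "#"  # delimiter used in the 'first_name # last_name' key layout


def make_hash_key(first_name, last_name):
    """Join the two name attributes into a single hash key value."""
    if SEPARATOR in first_name or SEPARATOR in last_name:
        # A separator inside a name would make the key ambiguous.
        raise ValueError("name parts must not contain the separator")
    return first_name + SEPARATOR + last_name


def make_range_key(created):
    """Format a datetime as MM-DD-YYYY-HH-mm-SS-Milliseconds, zero padded."""
    millis = created.microsecond // 1000
    return created.strftime("%m-%d-%Y-%H-%M-%S") + "-%03d" % millis


def day_prefix(created):
    """Prefix shared by every range key created on the same day."""
    return created.strftime("%m-%d-%Y")


created = datetime(2015, 3, 21, 3, 3, 2, 324000)
item = {
    "name": make_hash_key("John", "Doe"),
    "first_name": "John",
    "last_name": "Doe",
    "created": make_range_key(created),
}
```

Zero padding every field keeps all range keys the same width, so a day prefix such as the one returned by `day_prefix` matches exactly the keys written on that day.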

    from boto.dynamodb2.fields import HashKey, RangeKey
    from boto.dynamodb2.table import Table

    users = Table.create('users', schema=[
        HashKey('name'),
        RangeKey('created'),
    ], throughput={
        'read': 5,
        'write': 15,
    })

Adding a user to a table:

    with users.batch_write() as batch:
        batch.put_item(data={
            'name': 'John#Doe',
            'first_name': 'John',
            'last_name': 'Doe',
            'created': '03-21-2015-03-03-02-3243',
        })

Your code to find John Doe, created on '03-21-2015', should look something like this:

    name_john_doe = users.query_2(
        name__eq='John#Doe',
        created__beginswith='03-21-2015'
    )

    for user in name_john_doe:
        print(user['first_name'])
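To make the matching rule concrete, here is a small in-memory model of what such a query selects: an exact match on the hash key combined with a prefix match on the range key. It does not call DynamoDB or boto; the `query` function and the sample items are hypothetical and only illustrate the semantics.

```python
def query(items, name, created_prefix):
    """Return items whose hash key equals `name` and whose range key
    starts with `created_prefix`, sorted by range key."""
    matches = [
        item for item in items
        if item["name"] == name and item["created"].startswith(created_prefix)
    ]
    return sorted(matches, key=lambda item: item["created"])


# Hypothetical sample data using the key layout described above.
items = [
    {"name": "John#Doe", "created": "03-21-2015-03-03-02-3243"},
    {"name": "John#Doe", "created": "03-22-2015-10-00-00-0001"},
    {"name": "Jane#Doe", "created": "03-21-2015-09-15-00-0002"},
    {"name": "John#Doe", "created": "03-21-2015-01-00-00-0000"},
]

result = query(items, "John#Doe", "03-21-2015")
```

Only the two "John#Doe" items written on 03-21-2015 are selected; the item for the next day and the item for "Jane#Doe" are excluded.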

Important considerations:

I. If your query is getting too complicated and the hash or range key is getting too long, with too many concatenated fields, then by all means use secondary indexes. That is a good sign that a primary index alone is not enough for your requirements.

II. I mentioned that the Hash Key should provide uniform access to data:

“Dynamo uses consistent hashing to partition its key space across its replicas and to ensure uniform load distribution. A uniform key distribution can help us achieve uniform load distribution assuming the access distribution of keys is not highly skewed.” [DYN]

The hash key not only helps to uniquely identify a record, it is also the mechanism that ensures load distribution. The range key (when used) indicates which records will mostly be retrieved together, so storage can also be optimized for that access pattern.
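The effect of hash key choice on load distribution can be illustrated with a toy simulation. This is only a sketch: MD5 and the modulo step are stand-ins chosen for illustration, not DynamoDB's internal partitioning scheme, and the partition count and key values are made up.

```python
import hashlib
from collections import Counter


def partition_for(hash_key, partitions=4):
    """Map a hash key to a partition number (illustrative scheme only)."""
    digest = hashlib.md5(hash_key.encode("utf-8")).hexdigest()
    return int(digest, 16) % partitions


def spread(hash_keys, partitions=4):
    """Count how many requests land on each partition."""
    return Counter(partition_for(key, partitions) for key in hash_keys)


# Many distinct hash keys: requests are spread over all partitions.
varied = ["user%d#surname%d" % (i, i) for i in range(1000)]
# One hot hash key: every request lands on the same partition.
skewed = ["John#Doe"] * 1000

varied_spread = spread(varied)
skewed_spread = spread(skewed)
```

With many distinct hash key values the counts in `varied_spread` cover every partition, while `skewed_spread` has a single entry that receives all 1000 requests.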

The following is a full explanation of the topic:

http://docs.aws.amazon.com/amazondynamodb/latest/developerguide/GuidelinesForTables.html#GuidelinesForTables.UniformWorkload
