...
```python
# List the objects in our bucket
response = s3.list_objects(Bucket=bucketname)
for item in response['Contents']:
    print(item['Key'])
```
If you want to list more than 1,000 objects in a bucket, you can use a paginator:
```python
# List objects with a paginator (not constrained to 1,000 objects per call)
paginator = s3.get_paginator('list_objects_v2')
pages = paginator.paginate(Bucket=bucketname)
objects = []
for page in pages:
    for obj in page['Contents']:
        objects.append(obj["Key"])
print('Number of objects: ', len(objects))
```
Each obj in the response looks like this:
```python
{'Key': 'clouds.yaml',
 'LastModified': datetime.datetime(2021, 11, 18, 0, 39, 23, 320000, tzinfo=tzlocal()),
 'ETag': '"0e2f62675cea45f7eee2418a4f6ba0d6-1"',
 'Size': 1023,
 'StorageClass': 'STANDARD'}
```
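These fields can be used directly in Python. As a quick illustration (not part of the original example, and assuming the same s3 client and bucketname as above), the sketch below prints the size and last-modified time of each object:

```python
# Illustration only: re-list the bucket with list_objects_v2 and use the
# 'Size' and 'LastModified' fields of each entry.
response = s3.list_objects_v2(Bucket=bucketname)
for obj in response.get('Contents', []):
    print(f"{obj['Key']}: {obj['Size']} bytes, modified {obj['LastModified']}")
```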
Now let's try to read a file from a bucket into Python's memory, so we can work with it inside Python without ever saving the file to our local computer:
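One common way to do this (a minimal sketch, assuming the same s3 client and bucketname, and using the clouds.yaml key from the listing above as an example) is with get_object, which returns the object's contents as a streaming body:

```python
# Sketch: download an object straight into memory with get_object.
# 'clouds.yaml' is just the example key listed earlier.
obj = s3.get_object(Bucket=bucketname, Key='clouds.yaml')
data = obj['Body'].read()        # raw bytes
text = data.decode('utf-8')      # decode if the object is a text file
print(text[:200])                # peek at the first 200 characters
```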
...
If you're interested in more, I recommend taking a look at this article, which gives a more detailed view of boto3's functionality (although it focuses on Amazon Web Services specifically, you can still take a look at the Python code involved):
https://dashbird.io/blog/boto3-aws-python/
...