[Amazon S3] Listing Bucket contents programmatically

Many of us use Amazon S3 to store data, and mostly we view that data through some kind of browser tool. Amazon also provides a Java SDK that lets us access S3 contents programmatically. Let's see how to list bucket contents in this post.


Add the following dependency to your pom.xml:

<dependency>
  <groupId>com.amazonaws</groupId>
  <artifactId>aws-java-sdk</artifactId>
  <version>1.9.21</version>
</dependency>

The complete code can be found at S3Base.java

Initialisation code

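  import com.amazonaws.auth.BasicAWSCredentials;
  import com.amazonaws.services.s3.AmazonS3;
  import com.amazonaws.services.s3.AmazonS3Client;

  AmazonS3 s3;

  public void init(String accessKey, String secretKey) {
    s3 = new AmazonS3Client(new BasicAWSCredentials(accessKey, secretKey));
  }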

AmazonS3 is the interface the AWS SDK provides for talking to S3, and AmazonS3Client is its implementation. Here we use the BasicAWSCredentials class and pass in the access key ID and the secret key. The SDK also ships alternatives that read credentials from a properties file (PropertiesCredentials) or from Java system properties (SystemPropertiesCredentialsProvider).

We shall see two functions. The first one lists all the buckets owned by the account that the given access key and secret key pair belongs to; the second lists the contents of a single bucket.

Listing Root Buckets

This code prints the names of all the buckets.

public void listAllBuckets() {
    // List all the buckets
    List<Bucket> buckets = s3.listBuckets();
    for (Bucket next : buckets) {
      System.out.println(next.getName());
    }
}

Here we just need to call the listBuckets() API to get the bucket details.

Listing Bucket Contents

This lists the objects in the given S3 bucket along with the size of each object in KB.

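public void listBucketContent(String bucketName) {
    ObjectListing listing = s3.listObjects(new ListObjectsRequest().withBucketName(bucketName));
    while (true) {
      for (S3ObjectSummary objectSummary : listing.getObjectSummaries()) {
        System.out.println(" -> " + objectSummary.getKey() + "  " +
                "(size = " + objectSummary.getSize()/1024 + " KB)");
      }
      // One response holds at most 1,000 keys; fetch the next page if there is one.
      if (!listing.isTruncated()) {
        break;
      }
      listing = s3.listNextBatchOfObjects(listing);
    }
}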

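We create a ListObjectsRequest with the bucket name and pass it to listObjects() to get the data. A single response carries at most 1,000 keys, so for larger buckets check listing.isTruncated() and fetch further pages with listNextBatchOfObjects(). Note also that getSize() returns bytes, and the integer division by 1024 rounds down, so an object smaller than 1 KB prints as 0 KB.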
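Since getSize() returns a byte count, the size label can be made friendlier than a whole number of KB. The following is a minimal sketch of one way to do that. SizeFormat and humanReadable are hypothetical names introduced here for illustration; they are not part of the AWS SDK, and the snippet uses only the Java standard library.

```java
import java.util.Locale;

// Hypothetical helper (not part of the AWS SDK): turns a byte count into a short label.
final class SizeFormat {
    private static final String[] UNITS = {"B", "KB", "MB", "GB", "TB"};

    // bytes would typically be the long returned by objectSummary.getSize().
    static String humanReadable(long bytes) {
        double value = bytes;
        int unit = 0;
        // Divide by 1024 until the value fits the current unit or we run out of units.
        while (value >= 1024 && unit < UNITS.length - 1) {
            value /= 1024;
            unit++;
        }
        // Locale.ROOT keeps the decimal separator a dot on every machine.
        return unit == 0
                ? bytes + " B"
                : String.format(Locale.ROOT, "%.1f %s", value, UNITS[unit]);
    }

    public static void main(String[] args) {
        System.out.println(humanReadable(512));             // 512 B
        System.out.println(humanReadable(1536));            // 1.5 KB
        System.out.println(humanReadable(5L * 1024 * 1024)); // 5.0 MB
    }
}
```

In the listing loop, the size part of the println could then become something like "(size = " + SizeFormat.humanReadable(objectSummary.getSize()) + ")".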

Let's see the main method

public static void main(String[] args) {
    S3Base base = new S3Base();
    base.init(args[0], args[1]);
    base.listAllBuckets();
}
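The main method above reads args[0] and args[1] directly, so it fails with an ArrayIndexOutOfBoundsException when the keys are not passed on the command line. Here is a small sketch of a more forgiving approach, under the assumption that falling back to the conventional AWS environment variables AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY is acceptable. CredentialArgs and resolve are hypothetical names, not SDK classes.

```java
import java.util.Map;

// Hypothetical helper: takes the key pair from the command line, else from the environment.
final class CredentialArgs {
    // Returns {accessKey, secretKey}, or throws if neither source provides both values.
    static String[] resolve(String[] args, Map<String, String> env) {
        if (args.length >= 2) {
            return new String[] {args[0], args[1]};
        }
        String accessKey = env.get("AWS_ACCESS_KEY_ID");
        String secretKey = env.get("AWS_SECRET_ACCESS_KEY");
        if (accessKey == null || secretKey == null) {
            throw new IllegalArgumentException(
                    "Usage: S3Base <accessKey> <secretKey> "
                    + "(or set AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY)");
        }
        return new String[] {accessKey, secretKey};
    }

    public static void main(String[] args) {
        try {
            String[] keys = resolve(args, System.getenv());
            // The post's main method would call base.init(keys[0], keys[1]) at this point.
            System.out.println("Credentials found, access key length: " + keys[0].length());
        } catch (IllegalArgumentException e) {
            // Print the usage hint instead of a stack trace.
            System.err.println(e.getMessage());
        }
    }
}
```

Passing the environment in as a Map keeps the lookup easy to test without touching real credentials.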

This is a very simple use case that just prints to the console. You can use these APIs to automate workflows, or even to automate the testing of Big Data programs that read from or write to S3.
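An automated workflow often needs only some of the objects, for example the files under one folder with a given extension. ListObjectsRequest offers withPrefix() to narrow a listing by key prefix on the server side; anything finer, such as matching a file extension, has to be done on the returned keys. Below is a minimal sketch of that client-side step, using plain strings in place of the values returned by objectSummary.getKey(). KeyFilter and select are hypothetical names, and the sample keys are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper (not part of the AWS SDK): filters object keys on the client side.
final class KeyFilter {
    // Keeps keys that start with prefix and end with suffix.
    // Keys ending in "/" (folder placeholders created by some tools) are dropped.
    static List<String> select(List<String> keys, String prefix, String suffix) {
        List<String> matches = new ArrayList<>();
        for (String key : keys) {
            if (!key.endsWith("/") && key.startsWith(prefix) && key.endsWith(suffix)) {
                matches.add(key);
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        // Made-up keys standing in for the results of a bucket listing.
        List<String> keys = Arrays.asList(
                "logs/", "logs/2015/app.log", "logs/2015/app.log.gz", "data/input.csv");
        System.out.println(select(keys, "logs/", ".log")); // [logs/2015/app.log]
    }
}
```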

4 thoughts on “[Amazon S3] Listing Bucket contents programmatically”

  1. Hi Ashish,

    Good post. Is there any way to filter the objects from S3? Say I want to include or exclude some object names, or include or exclude certain file types. Is there any built-in way to do that?

    Thanks
