Mounting buckets tutorial

At Hasty, we care about your privacy. We want you to be able to carry out every step, from image annotation to training and inference, within your own infrastructure and without sharing access to sensitive data. That’s why we created the Mounting Buckets (MB) feature.

Briefly, it lets you connect files from external storage to our platform. The best part is that Hasty does not need to store your files: it is enough for us to use your images' URLs without direct access to their content. This matters for users who work with highly sensitive information and want to eliminate any possibility of others accessing the content of their images.

Currently, Hasty supports Google Cloud Storage (GCS) and Amazon Web Services S3 (AWS S3) as external storage providers. We also have support for Microsoft Azure in closed beta. Contact us if you want to try it yourself.

Before using the MB feature, make sure you have uploaded your images to one of these cloud storage services and configured your bucket so that its objects can be listed, signed, and read.

How to configure your bucket for external access

To allow cross-origin access to your bucket, you can apply Cross-Origin Resource Sharing (CORS) rules to it. CORS is a mechanism that allows servers to share restricted resources with other domains by specifying trusted sources (origins).
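To make the idea of trusted origins concrete, here is a minimal Python sketch of the decision a CORS rule encodes. It is a simplified model, not how a browser or a storage provider actually implements CORS: it assumes exact string matching on origins (plus the `*` wildcard) and ignores headers and preflight requests. The function name `is_request_allowed` is hypothetical.

```python
def is_request_allowed(rule: dict, origin: str, method: str) -> bool:
    """Return True if a simplified CORS rule permits this origin and method."""
    origins = rule.get("AllowedOrigins", [])
    methods = rule.get("AllowedMethods", [])
    # Simplification: an origin is trusted if it is listed verbatim or "*" is listed.
    origin_ok = "*" in origins or origin in origins
    method_ok = method.upper() in methods
    return origin_ok and method_ok


rule = {
    "AllowedOrigins": ["https://app.hasty.ai"],
    "AllowedMethods": ["GET", "HEAD"],
}

print(is_request_allowed(rule, "https://app.hasty.ai", "GET"))  # True
print(is_request_allowed(rule, "https://example.com", "GET"))   # False
print(is_request_allowed(rule, "https://app.hasty.ai", "PUT"))  # False
```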

Configuring a CORS rule for AWS S3

1. Sign in to the AWS Management Console;
2. Open the Amazon S3 console;
3. Choose your bucket;
4. Click on the Permissions tab;
5. Press Edit in the Cross-origin resource sharing panel;
6. Open the text box and write the JSON CORS rule you want to apply:

  
Hello, thank you for using the code provided by CloudFactory. Please note that some code blocks might not be 100% complete and ready to be run as is. This is done intentionally as we focus on implementing only the most challenging parts that might be tough to pick up from scratch. View our code block as a LEGO block - you can’t use it as a standalone solution, but you can take it and add it to your system to complement it.

      json
      
    
      [
  {
    "AllowedHeaders": [
      "Authorization"
    ],
    "AllowedMethods": [
      "GET",
      "HEAD"
    ],
    "AllowedOrigins": [
      "https://app.hasty.ai"
    ],
    "ExposeHeaders": [
      "Access-Control-Allow-Origin"
    ]
  }
]
    

Do not forget to press Save.
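Before pasting a rule into the console, it can help to sanity-check it locally. The sketch below is one possible check, assuming the rule follows the JSON shape used in this tutorial (an array of objects with `AllowedMethods` and `AllowedOrigins`) and that S3 CORS accepts only the GET, PUT, POST, DELETE, and HEAD methods. The helper name `check_s3_cors` is hypothetical, and passing this check does not guarantee that AWS will accept the rule.

```python
import json

# Methods that S3 CORS rules accept (assumption stated in the text above).
S3_CORS_METHODS = {"GET", "PUT", "POST", "DELETE", "HEAD"}


def check_s3_cors(text: str) -> list:
    """Return a list of problems found in a JSON CORS configuration (empty if none)."""
    try:
        rules = json.loads(text)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc}"]
    if not isinstance(rules, list):
        return ["top level must be a JSON array of rules"]

    problems = []
    for index, rule in enumerate(rules):
        if not isinstance(rule, dict):
            problems.append(f"rule {index}: must be a JSON object")
            continue
        methods = rule.get("AllowedMethods") or []
        origins = rule.get("AllowedOrigins") or []
        if not methods:
            problems.append(f"rule {index}: missing or empty AllowedMethods")
        if not origins:
            problems.append(f"rule {index}: missing or empty AllowedOrigins")
        for method in methods:
            if method not in S3_CORS_METHODS:
                problems.append(f"rule {index}: unsupported method {method!r}")
    return problems


example = """
[
  {
    "AllowedHeaders": ["Authorization"],
    "AllowedMethods": ["GET", "HEAD"],
    "AllowedOrigins": ["https://app.hasty.ai"],
    "ExposeHeaders": ["Access-Control-Allow-Origin"]
  }
]
"""

print(check_s3_cors(example))  # []
```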

See the AWS documentation if you want to learn more.

Configuring a CORS rule for GCS

1. Create a JSON file with the CORS configuration you want to apply.

An example of CORS configuration:

  
      json
      
    
      [
    {
      "origin": ["https://app.hasty.ai"],
      "method": ["GET"],
      "responseHeader": ["Content-Type"],
      "maxAgeSeconds": 3600
    }
]
    

2. Use the gsutil cors command to apply the configuration to a bucket:

      shell

      gsutil cors set CORS_CONFIG_FILE gs://BUCKET_NAME

Where:

CORS_CONFIG_FILE is the path to the JSON file you created in Step 1;
BUCKET_NAME is the name of the bucket you want to provide access to.

Visit Google Cloud Storage’s instruction page for more details.

Several origins, methods, or headers can be specified by using a comma-separated list. For example, "method": ["GET", "PUT"].
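The two GCS steps above can be sketched in Python as follows. This is only an illustration: the helper names `write_cors_file` and `gsutil_cors_command` and the bucket name `my-bucket` are hypothetical, and the script only prints the `gsutil` command instead of running it, so nothing is changed in your bucket.

```python
import json
import shlex
import tempfile
from pathlib import Path


def write_cors_file(path, origins, methods, headers, max_age=3600):
    """Write a GCS CORS configuration file (Step 1) and return its content."""
    config = [
        {
            "origin": list(origins),
            "method": list(methods),
            "responseHeader": list(headers),
            "maxAgeSeconds": max_age,
        }
    ]
    Path(path).write_text(json.dumps(config, indent=2))
    return config


def gsutil_cors_command(config_path, bucket_name):
    """Build the argument list for the gsutil cors set command (Step 2)."""
    return ["gsutil", "cors", "set", str(config_path), f"gs://{bucket_name}"]


with tempfile.TemporaryDirectory() as tmp:
    cors_path = Path(tmp) / "cors.json"
    write_cors_file(cors_path, ["https://app.hasty.ai"], ["GET"], ["Content-Type"])
    # "my-bucket" is a placeholder; replace it with your own bucket name.
    command = gsutil_cors_command(cors_path, "my-bucket")
    print(shlex.join(command))
```

To apply the configuration for real, you would run the printed command in a shell where `gsutil` is installed and authenticated.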

How to use the Mounting Buckets feature in Hasty

1. Go to your project’s Workspace and click on the External storage access button. You need to add a bucket first; otherwise, you will have no buckets to get your images from.

2. Press the “Add new bucket” button. Fill in the credentials title and the cloud type. Depending on the cloud storage type, you will need to fill in different fields:

For Amazon Web Services S3 (AWS S3): provide a role ARN.

To access AWS S3, Hasty needs you to create an IAM role with permissions to access the bucket and to perform actions on the bucket and on the objects inside it. If you have a multi-level folder structure, you can restrict access to specific paths.

An example policy that allows mounting a specific folder from a particular bucket is provided in the platform's tooltip and below.

  
      json
      
    
      {
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "s3:GetBucketLocation",
        "s3:GetObject",
        "s3:ListBucket"
      ],
      "Resource": [
        "arn:aws:s3:::bucket",
        "arn:aws:s3:::bucket/folder/*"
      ]
    }
  ]
}
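If you need the same policy for several buckets or folders, generating it can be less error-prone than editing JSON by hand. The sketch below reproduces the shape of the example policy for a given bucket and folder. The function name `build_access_policy` is hypothetical, and the fallback to the whole bucket when no folder is given is an assumption about what you might want, not something the platform requires.

```python
import json


def build_access_policy(bucket: str, folder: str = "") -> dict:
    """Build an IAM policy allowing read access to one folder of a bucket."""
    prefix = folder.strip("/")
    # Assumption: with no folder given, grant read access to every object.
    object_arn = (
        f"arn:aws:s3:::{bucket}/{prefix}/*" if prefix else f"arn:aws:s3:::{bucket}/*"
    )
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": [
                    "s3:GetBucketLocation",
                    "s3:GetObject",
                    "s3:ListBucket",
                ],
                "Resource": [f"arn:aws:s3:::{bucket}", object_arn],
            }
        ],
    }


print(json.dumps(build_access_policy("bucket", "folder"), indent=2))
```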
    

Step 2 is to set up the role's trust policy, which lets Hasty assume the role. For security reasons, it is crucial to keep the provided ExternalId value unchanged.

When assuming the role, Hasty requests temporary credentials with a duration of 6 hours, so please make sure that the role's maximum session duration is at least 21,600 seconds.
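The arithmetic behind that number is simply 6 hours × 3,600 seconds. The short sketch below spells it out and checks a candidate setting. The names are hypothetical; the default and upper bound are the IAM values for a role's maximum session duration (1 hour by default, up to 12 hours), so a role left at its default would be too short.

```python
SECONDS_PER_HOUR = 3600
REQUIRED_SESSION_SECONDS = 6 * SECONDS_PER_HOUR  # 6 hours requested by Hasty

# IAM limits for a role's maximum session duration.
IAM_DEFAULT_MAX_SESSION = 1 * SECONDS_PER_HOUR
IAM_UPPER_LIMIT = 12 * SECONDS_PER_HOUR


def session_duration_ok(max_session_seconds: int) -> bool:
    """Return True if the role's maximum session duration covers 6 hours."""
    return REQUIRED_SESSION_SECONDS <= max_session_seconds <= IAM_UPPER_LIMIT


print(REQUIRED_SESSION_SECONDS)                      # 21600
print(session_duration_ok(IAM_DEFAULT_MAX_SESSION))  # False
print(session_duration_ok(21600))                    # True
```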

An example assuming policy is provided in the platform's tooltip and below.

Alert! The code below is only an example. Do not copy it "as is": every workspace has its own unique "sts:ExternalId" value, so this example will not work for you.

Please navigate to the bucket setup page in your project and copy the policy customized for your workspace.

  
      json
      
    
      {
  "Version": "2012-10-17",
  "Statement": {
    "Effect": "Allow",
    "Action": "sts:AssumeRole",
    "Principal": {
      "AWS": "718860067770"
    },
    "Condition": {
      "StringEquals": {
        "sts:ExternalId": "7ed6d782-58b9-4b82-9f7e-a7d53c26e7f8"
      }
    }
  }
}
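For readers who manage IAM with scripts, the sketch below shows how a trust policy of this shape could be assembled from two values. Both values must be copied from the bucket setup page of your own workspace: the function name `build_trust_policy` and the placeholder ExternalId are hypothetical, and the account ID shown is simply the one from the example above.

```python
import json


def build_trust_policy(principal_account: str, external_id: str) -> dict:
    """Build a trust policy that lets the given AWS account assume the role."""
    if not external_id.strip():
        # The ExternalId is what ties the role to your workspace; never omit it.
        raise ValueError("external_id must be the value shown for your workspace")
    return {
        "Version": "2012-10-17",
        "Statement": {
            "Effect": "Allow",
            "Action": "sts:AssumeRole",
            "Principal": {"AWS": principal_account},
            "Condition": {"StringEquals": {"sts:ExternalId": external_id}},
        },
    }


# Placeholder values: copy the real ones from your workspace's setup page.
policy = build_trust_policy("718860067770", "YOUR-WORKSPACE-EXTERNAL-ID")
print(json.dumps(policy, indent=2))
```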
    

For Google Cloud Storage (GCS): provide the Key JSON file.

After filling in the bucket credentials, you can access and review them.

3. Go back to your project and click the Images & Datasets tab on the left menu.

4. Choose the dataset you want the images to be imported into.

5. You should see a small caption under the standard images uploading bar. Click on it to upload the images.

You will have to:

Choose the credentials of the bucket you want to import your images from;
Select the files for an import job: to do so, specify the path or the prefix of the files;
Choose whether to copy files to Hasty or not.

The difference between these approaches is the following:

If you choose to make a copy, Hasty will get access to your images for a short time to fetch them for your project. After the images are copied, you can delete credentials, restrict access to your bucket, or even delete the images. The manipulations with the images in your external storage will not affect your Hasty annotation process since their copies will be already imported.
If you choose not to make a copy, Hasty will only use links to your images without copying the files themselves to our storage. If access to an image from your dataset is required, Hasty will ask permission to download it. This way, you can be sure that your images will never leave your storage.

If you do not allow Hasty to copy your files, you must keep your bucket credentials and images untouched. Otherwise, you might be unable to work with them properly: image view, annotation, and other tools will not be available.

To ensure everything works smoothly, please do not:

• Modify files in your bucket (move, change paths, delete): Hasty stores only a link to each image, so if it changes, the old link becomes invalid and stops leading to that image;
• Delete the credentials;
• Restrict access to the bucket.
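If you want to check on your side whether a bucket reorganization would invalidate mounted images, you could compare the object keys recorded at import time with the keys currently in the bucket. The sketch below assumes you have both lists available (for example, from your own bucket listings); the function name `find_broken_links` and the sample keys are hypothetical, and this is not a Hasty feature.

```python
def find_broken_links(imported_keys, current_keys):
    """Return imported object keys that no longer exist in the bucket."""
    current = set(current_keys)
    return sorted(key for key in imported_keys if key not in current)


imported = ["images/cat_01.jpg", "images/cat_02.jpg", "images/dog_01.jpg"]
# cat_02.jpg was moved to another folder, so its old link no longer resolves.
current = ["images/cat_01.jpg", "archive/cat_02.jpg", "images/dog_01.jpg"]

print(find_broken_links(imported, current))  # ['images/cat_02.jpg']
```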

You can create a dataset using both the Mounting buckets feature and standard image importing.
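When you select files for an import job by path or prefix, it is worth previewing which objects a prefix would match. The sketch below assumes plain string-prefix matching on object keys, which is how prefix listing works in S3 and GCS; whether Hasty applies exactly the same rule is an assumption. The function name `select_by_prefix` and the sample keys are hypothetical.

```python
def select_by_prefix(keys, prefix):
    """Return the object keys that start with the given prefix."""
    return [key for key in keys if key.startswith(prefix)]


keys = [
    "images/train/001.jpg",
    "images/train/002.jpg",
    "images/train_extra/003.jpg",
    "images/val/004.jpg",
]

# Without a trailing slash, the prefix also matches the sibling folder.
print(select_by_prefix(keys, "images/train"))
# ['images/train/001.jpg', 'images/train/002.jpg', 'images/train_extra/003.jpg']

# With a trailing slash, only the intended folder is selected.
print(select_by_prefix(keys, "images/train/"))
# ['images/train/001.jpg', 'images/train/002.jpg']
```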

6. Once the images are uploaded, you can view them in the File manager in the menu on the left.

The image import does not start right away and usually takes some time. Please come back in 24 hours and check the status of your images.

If you have any questions, do not hesitate to reach out to your contact person at Hasty.

Congratulations, you have imported your data using the Mounting Buckets feature!

Hasty still gets temporary access to your images during the training and inference stages. However, we are working on support for running training and inference privately in external infrastructure.


Last modified 10mo ago